Improving Model Safety Behavior with Rule-Based Rewards
We've developed and applied a new method leveraging Rule-Based Rewards (RBRs) that aligns models to behave safely without extensive human data collection.
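To give a concrete flavor of the idea, here is a minimal sketch of how a rule-based reward might be computed. It assumes rules are binary propositions about a model response, combined as a weighted sum and added to a learned reward-model score during RLHF; the rule names, weights, and combination are illustrative assumptions, not OpenAI's actual implementation.

```python
# Hypothetical sketch of a rule-based reward (RBR). All rules and
# weights below are illustrative assumptions, not OpenAI's method.

def contains_refusal(response: str) -> bool:
    """Proposition: the response contains a hard refusal."""
    return "i can't help with that" in response.lower()

def is_judgmental(response: str) -> bool:
    """Proposition: the response uses judgmental language."""
    return any(w in response.lower() for w in ("shameful", "disgusting"))

# Desirable propositions get positive weight, undesirable ones negative.
RULES = [
    (contains_refusal, 1.0),   # assume a refusal is desired for this prompt
    (is_judgmental, -1.0),     # judgmental language is penalized
]

def rule_based_reward(response: str) -> float:
    """Weighted sum of rule propositions evaluated on the response."""
    return sum(w * float(rule(response)) for rule, w in RULES)

def total_reward(learned_reward: float, response: str) -> float:
    """Combine a learned reward-model score with the RBR signal."""
    return learned_reward + rule_based_reward(response)
```

In this sketch, a response that politely refuses scores positively while one that refuses judgmentally is penalized, letting safety behavior be shaped by explicit, auditable rules rather than large volumes of human preference data.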