GPT-5.1 System Card
What this is
GPT-5.1 Instant and GPT-5.1 Thinking are the next iteration of OpenAI's GPT-5 models, with a system card addendum dated November 12, 2025. GPT-5.1 Instant is described as more conversational than the earlier chat model, with improved instruction following and an adaptive reasoning capability that decides when to think before responding. GPT-5.1 Thinking adapts thinking time more precisely to each question. This document is an addendum to the GPT-5 System Card providing updated baseline safety metrics; comprehensive safety mitigations are stated to be "largely the same" as those described in that prior card.
Capabilities
The card does not report capability benchmarks. GPT-5.1 Instant is described as having "improved instruction following and an adaptive reasoning capability that lets it decide when to think before responding." GPT-5.1 Auto routes each query to whichever model is best suited, so that "in most cases, the user does not need to choose a model at all."
Evaluation methodology
Safety evaluations use Production Benchmarks, described as a new, more challenging set drawn from production data and built around cases where existing models were not giving ideal responses; these are "deliberately created to be difficult" and error rates are stated to be "not representative of average production traffic." The primary offline metric is not_unsafe, which checks that the model's output is not disallowed under the relevant OpenAI policy. Jailbreak robustness is measured via an adaptation of the academic StrongReject eval, which inserts known jailbreak prompts into disallowed-content examples and grades the output against OpenAI policy. Online A/B testing provides an early signal on the prevalence of undesired responses in deployment, though with "wide error bars" due to the low prevalence of such responses and small test sizes.
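The metric and the jailbreak-style evaluation described above can be illustrated with a minimal Python sketch. Everything named here (`GradedResponse`, `wrap_prompts`, `evaluate`, the toy model, and the toy grader) is hypothetical and chosen for illustration; the card does not publish its grader, prompts, or evaluation harness, so this only shows the shape of a not_unsafe computation, not OpenAI's implementation.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class GradedResponse:
    prompt: str
    response: str
    is_unsafe: bool  # True when the (hypothetical) grader judges the output disallowed


def not_unsafe(graded: Iterable[GradedResponse]) -> float:
    """Fraction of responses that the grader did NOT flag as disallowed."""
    items = list(graded)
    if not items:
        raise ValueError("need at least one graded response")
    return sum(not g.is_unsafe for g in items) / len(items)


def wrap_prompts(templates: List[str], base_prompts: List[str]) -> List[str]:
    """Insert every base prompt into every wrapper template (cross product)."""
    return [t.format(prompt=p) for t, p in product(templates, base_prompts)]


def evaluate(
    prompts: List[str],
    model: Callable[[str], str],
    grader: Callable[[str, str], bool],
) -> float:
    """Run the model on each prompt, grade each output, and return not_unsafe."""
    graded = []
    for prompt in prompts:
        response = model(prompt)
        graded.append(GradedResponse(prompt, response, grader(prompt, response)))
    return not_unsafe(graded)


# Placeholder data only: real evals use curated prompts and a policy grader.
templates = ["[wrapper A] {prompt}", "[wrapper B] {prompt}"]
base_prompts = ["<disallowed request 1>", "<disallowed request 2>"]
prompts = wrap_prompts(templates, base_prompts)


def toy_model(prompt: str) -> str:
    # Stand-in model that "complies" with exactly one of the four prompts.
    return "COMPLY" if prompt == "[wrapper B] <disallowed request 1>" else "REFUSE"


def toy_grader(prompt: str, response: str) -> bool:
    # Stand-in grader: any compliance counts as unsafe output.
    return response == "COMPLY"


score = evaluate(prompts, toy_model, toy_grader)
print(score)  # 0.75 (3 of 4 responses are not unsafe)
```

Because the score is a simple proportion over a deliberately hard prompt set, it says nothing about how often such prompts occur in deployment, which is consistent with the card's caveat that error rates are not representative of average production traffic.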
Safety testing
Disallowed content evaluations span 13 categories including harassment, hate, sexual content, self-harm, and two new categories: mental health (0.684 for gpt-5.1-thinking, 0.883 for gpt-5.1-instant) and emotional reliance (0.785 for gpt-5.1-thinking, 0.945 for gpt-5.1-instant). gpt-5.1-thinking shows "light regressions" relative to gpt-5-thinking on harassment (0.747 vs. 0.815), hate (0.839 vs. 0.883), and sexual content (0.895 vs. 0.906). On jailbreak resistance via StrongReject, gpt-5.1-instant scores 0.976 not_unsafe, outperforming gpt-5-instant-aug15 (0.683) and gpt-5-instant-oct3 (0.850); gpt-5.1-thinking scores 0.967, on par with gpt-5-thinking (0.974). Under the Preparedness Framework, GPT-5.1 is treated as High risk in the Biological and Chemical domain; cybersecurity and AI self-improvement evaluations indicate the models "do not have a plausible chance of reaching a High threshold."
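To make the size of these differences explicit, the short Python sketch below computes the score deltas from the figures quoted above. The numbers are copied from this summary; the dictionary layout and the `delta` helper are illustrative assumptions, and the category scores are assumed to be not_unsafe values (higher is better).

```python
# Scores as quoted in the summary above; assumed to be not_unsafe (higher is better).
THINKING_SCORES = {
    "harassment": {"gpt-5-thinking": 0.815, "gpt-5.1-thinking": 0.747},
    "hate": {"gpt-5-thinking": 0.883, "gpt-5.1-thinking": 0.839},
    "sexual content": {"gpt-5-thinking": 0.906, "gpt-5.1-thinking": 0.895},
    "StrongReject": {"gpt-5-thinking": 0.974, "gpt-5.1-thinking": 0.967},
}

INSTANT_STRONGREJECT = {
    "gpt-5-instant-aug15": 0.683,
    "gpt-5-instant-oct3": 0.850,
    "gpt-5.1-instant": 0.976,
}


def delta(scores: dict, baseline: str, candidate: str) -> float:
    """Candidate minus baseline, rounded to the three decimals reported."""
    return round(scores[candidate] - scores[baseline], 3)


# Negative values are regressions of gpt-5.1-thinking against gpt-5-thinking.
thinking_deltas = {
    name: delta(scores, "gpt-5-thinking", "gpt-5.1-thinking")
    for name, scores in THINKING_SCORES.items()
}

# Positive values are improvements of gpt-5.1-instant on StrongReject.
instant_gains = {
    baseline: delta(INSTANT_STRONGREJECT, baseline, "gpt-5.1-instant")
    for baseline in ("gpt-5-instant-aug15", "gpt-5-instant-oct3")
}

for name, value in thinking_deltas.items():
    print(f"{name}: {value:+.3f}")
for baseline, value in instant_gains.items():
    print(f"gpt-5.1-instant vs {baseline}: {value:+.3f}")
```

On these figures, the largest thinking-model regression is on harassment (about 0.068), while the StrongReject gap between the two thinking models is under 0.01, which matches the summary's "on par" characterization.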
Mitigations
Safety mitigations are stated to be "largely the same as described in the GPT-5 System Card." GPT-5.1 is treated as High risk in the Biological and Chemical domain with corresponding Preparedness Framework safeguards applied. Routing to specific safer models is named as a potential post-launch mitigation if online measurements signal further issues, particularly for mental health and emotional reliance categories.
Deployment and access
GPT-5.1 Instant and GPT-5.1 Thinking are accessible through ChatGPT; GPT-5.1 Auto routes queries automatically so users typically need not select a model. The card does not specify API availability, license terms, pricing, or access tiers beyond what was described in the original GPT-5 System Card.
Limitations
gpt-5.1-thinking shows acknowledged regressions on harassment, hate, and sexual content relative to gpt-5-thinking, with improvements described as in progress. gpt-5.1-instant shows slight regressions on sexual content, violent content, mental health, and emotional reliance relative to gpt-5-instant-oct3. A regression in gpt-5.1-thinking on self-harm prompts with image inputs is noted, with further improvements described as ongoing. Online measurements for mental health, emotional reliance, and self-harm carry low statistical confidence due to the low prevalence of relevant events and small A/B test sizes; OpenAI states it will continue post-launch investigation.
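The link between low prevalence, small A/B test sizes, and low statistical confidence can be illustrated with a confidence interval for a rare-event rate. The Python sketch below uses the Wilson score interval, a standard choice for proportions near zero; the card does not say which statistical method OpenAI uses, and the event counts and sample sizes here are invented purely for illustration.

```python
import math
from typing import Tuple


def wilson_interval(events: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    if n <= 0 or not 0 <= events <= n:
        raise ValueError("require n > 0 and 0 <= events <= n")
    p = events / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# Hypothetical counts: the same observed rate (0.15%) at two sample sizes.
small_arm = wilson_interval(events=3, n=2_000)
large_arm = wilson_interval(events=30, n=20_000)

for label, (low, high) in (("n=2,000", small_arm), ("n=20,000", large_arm)):
    print(f"{label}: [{low:.4%}, {high:.4%}] width={high - low:.4%}")
```

With the same observed rate, the smaller arm yields a much wider interval, so a modest true difference between two models could easily be hidden by noise. This is one way to read the card's statement that online measurements for these categories carry low confidence.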
What's new
This addendum introduces updated baseline safety metrics for GPT-5.1 models compared to their GPT-5 predecessors. Two evaluation categories are newly reported: mental health, covering signs of isolated delusions, psychosis, or mania, and emotional reliance, covering unhealthy emotional dependence or attachment to ChatGPT; both were first introduced in a prior GPT-5 system card update on sensitive conversations. Production Benchmarks are described as a new, harder evaluation set introduced to address saturation of earlier standard evaluations.