GPT-5.1 System Card

Model card · 1,347 words · 6 min read · Mar 31, 2026

Summary

A 645-word brief of a 1,347-word document. Published by OpenAI. Version dated Mar 31, 2026.

What this is

GPT-5.1 Instant and GPT-5.1 Thinking are the next iteration of OpenAI's GPT-5 models, with a system card addendum dated November 12, 2025. GPT-5.1 Instant is described as more conversational than the earlier chat model, with improved instruction following and an adaptive reasoning capability that decides when to think before responding. GPT-5.1 Thinking adapts thinking time more precisely to each question. This document is an addendum to the GPT-5 System Card providing updated baseline safety metrics; comprehensive safety mitigations are stated to be "largely the same" as those described in that prior card.

Capabilities

The card does not report capability benchmarks. GPT-5.1 Instant is described as having "improved instruction following and an adaptive reasoning capability that lets it decide when to think before responding." GPT-5.1 Auto routes each query to whichever model is best suited, so that "in most cases, the user does not need to choose a model at all."

Evaluation methodology

Safety evaluations use Production Benchmarks, described as a new, more challenging evaluation set drawn from production data and built around cases where existing models were not giving ideal responses. These examples are "deliberately created to be difficult", and the resulting error rates are stated to be "not representative of average production traffic." The primary offline metric is not_unsafe, which checks that model output is not disallowed under the relevant OpenAI policy. Jailbreak robustness is measured via an adaptation of the academic StrongReject eval, which inserts known jailbreak prompts into disallowed-content examples and grades the output against OpenAI policy. Online A/B testing provides an early signal on the prevalence of undesired responses in deployment, though with "wide error bars" because such responses are rare and test sizes are small.
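To make the not_unsafe metric concrete, here is a minimal sketch, assuming a grader has already labeled each model output as disallowed or not. The `GradedOutput` schema, the `not_unsafe` helper, and the sample data are hypothetical illustrations, not OpenAI's evaluation harness.

```python
from dataclasses import dataclass


@dataclass
class GradedOutput:
    """One model response plus a grader verdict (hypothetical schema)."""
    category: str     # e.g. "harassment", "mental_health"
    disallowed: bool  # True if the grader judged the output to violate policy


def not_unsafe(outputs: list[GradedOutput]) -> dict[str, float]:
    """Return, per category, the fraction of outputs NOT judged disallowed."""
    totals: dict[str, int] = {}
    safe: dict[str, int] = {}
    for o in outputs:
        totals[o.category] = totals.get(o.category, 0) + 1
        if not o.disallowed:
            safe[o.category] = safe.get(o.category, 0) + 1
    return {cat: safe.get(cat, 0) / n for cat, n in totals.items()}


# Illustrative data only: three of four harassment outputs pass the grader.
sample = [
    GradedOutput("harassment", False),
    GradedOutput("harassment", True),
    GradedOutput("harassment", False),
    GradedOutput("harassment", False),
    GradedOutput("hate", False),
]
scores = not_unsafe(sample)
print(scores)  # {'harassment': 0.75, 'hate': 1.0}
```

A higher score means fewer policy-violating outputs. Because the benchmark prompts are chosen to be difficult, a score well below 1.0 on this kind of set does not translate directly into a production error rate.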

Safety testing

Disallowed content evaluations span 13 categories, including harassment, hate, sexual content, self-harm, and two new categories: mental health (0.684 for gpt-5.1-thinking, 0.883 for gpt-5.1-instant) and emotional reliance (0.785 for gpt-5.1-thinking, 0.945 for gpt-5.1-instant). gpt-5.1-thinking shows "light regressions" relative to gpt-5-thinking on harassment (0.747 vs. 0.815), hate (0.839 vs. 0.883), and sexual content (0.895 vs. 0.906). On jailbreak resistance via StrongReject, gpt-5.1-instant scores 0.976 not_unsafe, outperforming gpt-5-instant-aug15 (0.683) and gpt-5-instant-oct3 (0.850); gpt-5.1-thinking scores 0.967, on par with gpt-5-thinking (0.974). Under the Preparedness Framework, GPT-5.1 is treated as High risk in the Biological and Chemical domain; cybersecurity and AI self-improvement evaluations indicate the models "do not have a plausible chance of reaching a High threshold."
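The size of the "light regressions" can be read off directly from the quoted scores. The short sketch below only subtracts the numbers given in this summary; the dictionary layout and variable names are illustrative assumptions.

```python
# (gpt-5.1-thinking, gpt-5-thinking) not_unsafe scores quoted in this summary
reported = {
    "harassment": (0.747, 0.815),
    "hate": (0.839, 0.883),
    "sexual content": (0.895, 0.906),
}

# Negative delta = the newer model scored lower (a regression).
deltas = {cat: round(new - old, 3) for cat, (new, old) in reported.items()}
for cat, d in deltas.items():
    print(f"{cat}: {d:+.3f}")
# harassment: -0.068
# hate: -0.044
# sexual content: -0.011
```

On these figures, harassment shows the largest drop and sexual content the smallest.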

Mitigations

Safety mitigations are stated to be "largely the same as described in the GPT-5 System Card." GPT-5.1 is treated as High risk in the Biological and Chemical domain with corresponding Preparedness Framework safeguards applied. Routing to specific safer models is named as a potential post-launch mitigation if online measurements signal further issues, particularly for mental health and emotional reliance categories.

Deployment and access

GPT-5.1 Instant and GPT-5.1 Thinking are accessible through ChatGPT; GPT-5.1 Auto routes queries automatically so users typically need not select a model. The card does not specify API availability, license terms, pricing, or access tiers beyond what was described in the original GPT-5 System Card.

Limitations

gpt-5.1-thinking shows acknowledged regressions on harassment, hate, and sexual content relative to gpt-5-thinking, with improvements described as in progress. gpt-5.1-instant shows slight regressions on sexual content, violent content, mental health, and emotional reliance relative to gpt-5-instant-oct3. A regression in gpt-5.1-thinking on self-harm prompts with image inputs is noted, with further improvements described as ongoing. Online measurements for mental health, emotional reliance, and self-harm carry low statistical confidence due to the low prevalence of relevant events and small A/B test sizes; OpenAI states it will continue post-launch investigation.
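To see why rare events and small A/B tests produce low statistical confidence, consider a confidence interval on an observed prevalence. The sketch below uses the Wilson score interval, which is a choice made here for illustration; the card does not say how its error bars were computed, and the counts are hypothetical.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# Hypothetical counts, not from the system card: the same 0.03% prevalence
# observed in a small test (3 of 10,000) and a large one (300 of 1,000,000).
small = wilson_interval(3, 10_000)
large = wilson_interval(300, 1_000_000)

# Relative width = interval width divided by the point estimate.
rel_small = (small[1] - small[0]) / (3 / 10_000)
rel_large = (large[1] - large[0]) / (300 / 1_000_000)
print(f"small test: {small[0]:.5f} to {small[1]:.5f} (relative width {rel_small:.2f})")
print(f"large test: {large[0]:.5f} to {large[1]:.5f} (relative width {rel_large:.2f})")
```

With the same underlying rate, the small test yields an interval several times wider than the estimate itself, so a modest regression or improvement cannot be distinguished from noise.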

What's new

This addendum introduces updated baseline safety metrics for GPT-5.1 models compared to their GPT-5 predecessors. Two evaluation categories are newly reported: mental health, covering signs of isolated delusions, psychosis, or mania, and emotional reliance, covering unhealthy emotional dependence or attachment to ChatGPT; both were first introduced in a prior GPT-5 system card update on sensitive conversations. Production Benchmarks are described as a new, harder evaluation set introduced to address saturation of earlier standard evaluations.

Extracted Evaluations (100 results)

0/100 rows fully reproducible (0%)

Benchmark	Category	State	Score	Setup	Source
Image Input Evaluations/ attack_planning	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
Image Input Evaluations/ attack_planning	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ extremism	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ extremism	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
Image Input Evaluations/ attack_planning	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
Image Input Evaluations/ attack_planning	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
Image Input Evaluations/ attack_planning	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
Image Input Evaluations/ illicit	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ personal_data	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ personal_data	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ personal_data	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
Image Input Evaluations/ harms_erotic	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
Image Input Evaluations/ harms_erotic	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
Image Input Evaluations/ extremism	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
Image Input Evaluations/ harms_erotic	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
Image Input Evaluations/ illicit	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
Image Input Evaluations/ extremism	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
Image Input Evaluations/ hate	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
Image Input Evaluations/ illicit	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
Image Input Evaluations/ extremism	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
Image Input Evaluations/ harms_erotic	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
Image Input Evaluations/ hate	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
Image Input Evaluations/ harms_erotic	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ extremism	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ emotional_reliance	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
Image Input Evaluations/ illicit	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
Image Input Evaluations/ extremism	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
Image Input Evaluations/ extremism	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
Image Input Evaluations/ hate	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
Image Input Evaluations/ self_harm	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
Image Input Evaluations/ hate	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
Image Input Evaluations/ illicit	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
Image Input Evaluations/ hate	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ self_harm_instructions	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ extremism	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
Image Input Evaluations/ self_harm	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
Image Input Evaluations/ self_harm	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ personal_data	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ personal_data	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ sexual_minors	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
Image Input Evaluations/ self_harm	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ self_harm_intent	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ self_harm_intent	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ sexual_minors	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ illicit_violent	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ violence	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ sexual_minors	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ sexual	other	scored	1.0 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ self_harm_instructions	other	scored	0.9 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ self_harm_instructions	other	scored	0.9 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ violence	other	scored	0.9 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ emotional_reliance	other	scored	0.9 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ mental_health	other	scored	0.9 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ self_harm_instructions	other	scored	0.9 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ violence	other	scored	0.9 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
Image Input Evaluations/ self_harm	other	scored	0.9 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ illicit_violent	other	scored	0.9 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ violence	other	scored	0.9 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ extremism	other	scored	0.9 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ illicit_violent	other	scored	0.9 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ sexual	other	scored	0.9 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ self_harm_intent	other	scored	0.9 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ sexual	other	scored	0.9 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ sexual_minors	other	scored	0.9 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ hate	other	scored	0.9 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ sexual	other	scored	0.9 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ self_harm_intent	other	scored	0.9 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ self_harm_intent	other	scored	0.9 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ mental_health	other	scored	0.9 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ hate	other	scored	0.9 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ illicit_non_violent	other	scored	0.9 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ sexual_minors	other	scored	0.9 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ illicit_violent	other	scored	0.9 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ illicit_non_violent	other	scored	0.9 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ self_harm_instructions	other	scored	0.9 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ illicit_non_violent	other	scored	0.9 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
	other	scored	0.8 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ hate	other	scored	0.8 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ harassment	other	scored	0.8 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ violence	other	scored	0.8 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ harassment	other	scored	0.8 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ emotional_reliance	other	scored	0.8 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ illicit_non_violent	other	scored	0.8 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ hate	other	scored	0.8 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ emotional_reliance	other	scored	0.8 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ illicit_violent	other	scored	0.8 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ sexual	other	scored	0.8 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ harassment	other	scored	0.7 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ harassment	other	scored	0.7 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ hate	other	scored	0.7 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ illicit_non_violent	other	scored	0.7 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ emotional_reliance	other	scored	0.7 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ mental_health	other	scored	0.7 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
	other	scored	0.7 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ harassment	other	scored	0.7 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ mental_health	other	scored	0.5 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
/ mental_health	other	scored	0.3 not unsafe	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
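For readers who want to tally these rows themselves, the following is a minimal sketch of a row parser. It assumes the six tab-separated columns named in the table header and assumes the Setup cell contains only "missing:" notes, with or without comma separators. The `parse_row` helper and the notion that a row is "reproducible" when nothing is missing are illustrative assumptions, not the extraction tool's actual logic.

```python
def parse_row(line: str) -> dict:
    """Split one tab-separated table row into named fields."""
    benchmark, category, state, score, setup, source = line.split("\t")
    value = float(score.split()[0])  # "0.9 not unsafe" -> 0.9
    # Collect every field listed after a "missing:" marker.
    missing = [
        field.strip()
        for chunk in setup.split("missing:")
        for field in chunk.split(",")
        if field.strip()
    ]
    return {
        "benchmark": benchmark.strip(),
        "score": value,
        "missing": missing,
        "reproducible": not missing,  # assumption: no missing setup fields
    }


# Two sample rows, one per Setup-cell style the parser tolerates.
rows = [
    "/ harassment\tother\tscored\t0.7 not unsafe\t"
    "missing: shot countmissing: methodmissing: languagemissing: training state\t"
    "self-reported",
    "/ mental_health\tother\tscored\t0.3 not unsafe\t"
    "missing: shot count, method, language, training state\tself-reported",
]
parsed = [parse_row(r) for r in rows]
reproducible = sum(r["reproducible"] for r in parsed)
print(f"{reproducible}/{len(parsed)} rows fully reproducible")  # 0/2 rows fully reproducible
```

Note that the scores in this table appear rounded to one decimal place, so they are coarser than the three-decimal figures quoted in the Safety testing section.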