Claude Sonnet 4.6 System Card

Model card · 28,358 words · 123 min read · Mar 31, 2026
Summary

A 742-word brief of a 28,358-word document. Published by Anthropic. Version dated Mar 31, 2026.
01

What this is

Claude Sonnet 4.6 is a large language model from Anthropic, released February 17, 2026, succeeding Claude Sonnet 4.5. It is designed as a capable mid-tier model covering coding, agentic tasks, reasoning, multimodal tasks, computer use, and mathematics. On several evaluations it approaches or matches Claude Opus 4.6, Anthropic's frontier model. The card was completed and published under Anthropic's Responsible Scaling Policy prior to public deployment.

02

Capabilities

Claude Sonnet 4.6 achieves 79.6% on SWE-bench Verified, 72.5% on OSWorld-Verified (within 0.2 percentage points of Opus 4.6's state-of-the-art 72.7%), 89.9% on GPQA Diamond, 60.42% on ARC-AGI-2 at high effort, and 95.6% on AIME 2025. On BrowseComp it scores 74.72% single-agent and 82.62% in a multi-agent configuration. The model processes text and vision inputs, supports computer use via mouse and keyboard interaction, and accepts context windows up to 1M tokens. Two thinking modes are offered: extended thinking and adaptive thinking. The latter adjusts reasoning depth to the context and is governed by an "effort" parameter that developers can control.
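The OSWorld-Verified comparison above is a gap between two percentages, which is most naturally read in percentage points rather than as a relative percentage. The short sketch below shows both readings using the two scores quoted in this section; the function names are illustrative and do not come from the card.

```python
def gap_in_points(score_a: float, score_b: float) -> float:
    """Absolute gap between two percentage scores, in percentage points."""
    return round(abs(score_a - score_b), 2)


def relative_gap_percent(score_a: float, reference: float) -> float:
    """Gap expressed as a percentage of the reference score."""
    return round(abs(score_a - reference) / reference * 100, 2)


# Scores quoted in this section (OSWorld-Verified).
sonnet_4_6 = 72.5
opus_4_6 = 72.7

points = gap_in_points(sonnet_4_6, opus_4_6)
relative = relative_gap_percent(sonnet_4_6, opus_4_6)
print(f"gap: {points} percentage points ({relative}% relative to Opus 4.6)")
```

The two readings differ slightly, which is why stating the unit matters when scores are this close.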

03

Evaluation methodology

All evaluations were conducted on the final deployed model snapshot unless otherwise noted, with results averaged over 10 trials (25 for SWE-bench) using adaptive thinking and max effort. Multiple intermediate training snapshots—including a helpful-only model with safeguards removed—were tested; the highest score across any snapshot was used for dangerous capability assessments. The card notes that many benchmarks may be contaminated by training data and references decontamination methods described in the Claude Opus 4.5 System Card. Some evaluations were conducted by named external third parties.
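The aggregation described above can be sketched in a few lines: scores are averaged over repeated trials, and for dangerous capability assessments the highest result across snapshots is the one reported. This is a minimal illustration and not Anthropic's evaluation code. The snapshot names and scores are made up, and it assumes trials are averaged per snapshot before the maximum is taken, which the summary does not specify.

```python
from statistics import mean


def trial_average(trial_scores: list[float]) -> float:
    """Mean score over repeated trials of one benchmark on one snapshot."""
    return mean(trial_scores)


def highest_across_snapshots(trials_by_snapshot: dict[str, list[float]]) -> tuple[str, float]:
    """Return the snapshot with the highest trial-averaged score, and that score.

    Assumption: average within each snapshot first, then take the maximum.
    """
    averages = {name: trial_average(scores) for name, scores in trials_by_snapshot.items()}
    best = max(averages, key=averages.get)
    return best, averages[best]


# Hypothetical scores for illustration only (not from the card).
example_trials = {
    "intermediate_snapshot": [0.50, 0.60, 0.55],
    "helpful_only_snapshot": [0.70, 0.80, 0.75],
    "final_snapshot": [0.60, 0.65, 0.70],
}

best_name, best_score = highest_across_snapshots(example_trials)
print(best_name, round(best_score, 3))
```

Taking the maximum rather than the final snapshot's score is the conservative choice for a risk assessment, since it reports the strongest capability observed at any point in training.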

04

Safety testing

Anthropic evaluated the model against its Responsible Scaling Policy preliminary assessment protocol, testing CBRN, autonomy (AI R&D), and cyber risk domains across multiple training snapshots. On biological risk, Sonnet 4.6 did not cross the ASL-4 rule-out threshold on short-horizon computational biology tasks. On autonomy, the model "crossed most of the rule-out thresholds we use as early proxies for AI R&D-4 capability," though Anthropic states "we still do not believe that our models fully qualify for AI R&D-4." On cyber risk, the model is close to saturating current evaluations, and the card repeats that "the saturation of our evaluation infrastructure means we can no longer use current benchmarks to track capability progression." Single-turn violative request evaluations across seven languages returned a 99.38% harmless response rate, and alignment audits found "no signs of major concerns around high-stakes forms of misalignment."

05

Mitigations

Sonnet 4.6 is deployed under the ASL-3 Standard and ASL-3 Security Standard for model weights, the same tier as Claude Opus 4.6. Anthropic proactively implemented AI R&D-4 safety measures—publishing a risk case and applying ASL-3 security protections—despite concluding the threshold has not been crossed. System prompt mitigations for claude.ai address suicide and self-harm interactions, directing the model to offer crisis resources without delay and avoid language that validates reluctance to seek professional help. Localized crisis resource banners are surfaced when suicide or self-harm is detected on claude.ai, and developers are encouraged to adopt recommended system prompt language for API deployments serving vulnerable populations.

06

Deployment and access

Claude Sonnet 4.6 is available via the Anthropic API and as part of the claude.ai consumer product, which is restricted to users aged 18 or above. Anthropic Ireland, Limited is the designated provider in the European Economic Area. Enterprise customers deploying the model to minors must comply with additional safeguards under Anthropic's Usage Policy, which governs prohibited uses and requirements for high-risk scenarios.

07

Limitations

The card states that "confidently ruling out" the AI R&D-4 and CBRN-4 thresholds "is becoming increasingly difficult" due to model performance approaching rule-out proxies and fundamental epistemic uncertainty in measurement. Current cyber benchmarks are near-saturated, providing no meaningful capability-progression signal for future models. The internal Real-World Finance evaluation lacks independent third-party validation and does not cover all finance domains; AIME 2025 scores may be inflated by training data contamination. GraphWalks and some long-context MRCR results are not reproducible via the public API because problems exceed its 1M token limit. Performance on low-resource African languages degrades by up to 16.2 percentage points from the model's English baseline.

08

What's new

Relative to Claude Sonnet 4.5, this card introduces the adaptive thinking mode (effort parameter), previously documented only in the Opus 4.6 system card. For the first time in a Sonnet system card, multilingual performance evaluations covering 42 languages (GMMLU) and 10 Indic languages (MILU) are included. New alignment-focused evaluations include external testing with Andon Labs via Vending-Bench 2 and a cross-developer safety comparison using the Petri framework against Gemini 3 Pro, GPT-5.2, Grok 4.1 Fast, and Kimi K2.5. Expanded finance and life sciences capability sections are also new to this Sonnet card.

Generated by Claude Sonnet from the cleaned source on Apr 23, 2026. Passages in double quotes are verbatim from the source; other text is neutral paraphrase. For citation, use the original document (source SHA 41cc9a90beac).

Extracted Evaluations (46 results)

| Benchmark | Category | State | Score | Variant | Source |
|---|---|---|---|---|---|
| OSWorld | agent | scored | 72.5 | - | self-reported |
| Agentic Coding | coding | scored | 100.0 | without mitigations | self-reported |
| SWE-bench | coding | scored | 90.0 | pass@1 | self-reported |
| SWE-bench | coding | scored | 80.2 | prompt modification | self-reported |
| SWE-bench | coding | scored | 79.6 | - | self-reported |
| MMLU | general_knowledge | scored | 89.3 | - | self-reported |
| MMMU | multimodal | scored | 75.6 | with tools | self-reported |
| MMMU | multimodal | scored | 74.5 | no tools | self-reported |
| GDPval-AA | other | scored | 1633.0 | - | self-reported |
| Child Safety Single-turn Violative | other | scored | 100.0 | - | self-reported |
| Suicide and Self-harm Single-turn Violative | other | scored | 99.7 | - | self-reported |
| Political Bias Evenhandedness | other | scored | 98.4 | with system prompt | self-reported |
| Suicide and Self-harm Multi-turn | other | scored | 98.0 | - | self-reported |
| GraphWalks Parents 256K subset of 1M | other | scored | 97.9 | max context | self-reported |
| tau2-Bench (Telecom) | other | scored | 97.9 | - | self-reported |
| GraphWalks Parents 256K subset of 1M | other | scored | 96.9 | 64k context | self-reported |
| Child Safety Multi-turn | other | scored | 95.0 | - | self-reported |
| tau2-bench Retail | other | scored | 91.7 | - | self-reported |
| DeepSearchQA | other | scored | 91.1 | multi-agent | self-reported |
| DeepSearchQA | other | scored | 89.2 | single-agent | self-reported |
| BrowseComp multi-agent | other | scored | 82.6 | multi-agent, max effort | self-reported |
| GraphWalks BFS 256K subset of 1M | other | scored | 74.5 | max context | self-reported |
| GraphWalks BFS 1M | other | scored | 73.8 | max context | self-reported |
| GraphWalks BFS 256K subset of 1M | other | scored | 72.8 | 64k context | self-reported |
| ARC-AGI-2 (Verified) | other | scored | 69.2 | prior score | self-reported |
| GraphWalks BFS 1M | other | scored | 68.4 | 64k context | self-reported |
| Terminal-bench 2.0 | other | scored | 65.4 | - | self-reported |
| Finance Agent | other | scored | 63.3 | Max Thinking | self-reported |
| MCP Atlas | other | scored | 62.7 | prior score | self-reported |
| Finance Agent | other | scored | 61.4 | High Thinking | self-reported |
| MCP Atlas | other | scored | 61.3 | - | self-reported |
| Finance Agent | other | scored | 60.0 | - | self-reported |
| Terminal-bench 2.0 | other | scored | 59.1 | default thinking | self-reported |
| ARC-AGI-2 (Verified) | other | scored | 58.3 | - | self-reported |
| HLE | other | scored | 49.0 | with tools | self-reported |
| HLE | other | scored | 33.2 | no tools | self-reported |
| Political Bias Opposing Perspectives | other | scored | 32.1 | with system prompt | self-reported |
| Political Bias Refusals | other | scored | 4.5 | with system prompt | self-reported |
| Higher-difficulty Benign Request Evaluation | other | scored | 0.2 | overall | self-reported |
| Higher-difficulty Benign Request Evaluation | other | scored | 0.2 | extended thinking | self-reported |
| Higher-difficulty Benign Request Evaluation | other | scored | 0.2 | default thinking | self-reported |
| Suicide and Self-harm Single-turn Benign | other | scored | 0.2 | - | self-reported |
| Child Safety Single-turn Benign | other | scored | 0.1 | - | self-reported |
| GPQA-Diamond | reasoning | scored | 89.9 | - | self-reported |
| BBQ | safety | scored | 97.5 | ambiguous | self-reported |
| BBQ | safety | scored | 88.1 | disambiguated | self-reported |