Claude 3.5 Sonnet Model Card Addendum
What this is
Claude 3.5 Sonnet is a large language model released by Anthropic, documented in an addendum to the Claude 3 Model Card rather than a new standalone card. It supersedes Claude 3 Opus as Anthropic's most capable publicly released model while operating faster and at lower cost. The document version date is 2026-03-31.
Capabilities
Claude 3.5 Sonnet scores 59.4% on GPQA Diamond (0-shot CoT), 92.0% on HumanEval (0-shot), and 90.4% on MMLU (5-shot CoT), outperforming Claude 3 Opus on all benchmarks tested. Across the five vision benchmarks tested, its reported scores include 95.2% on DocVQA, 90.8% on ChartQA, and 94.7% on AI2D. On an internal agentic coding evaluation it solves 64% of problems versus 38% for Claude 3 Opus; the tasks are drawn from real pull requests to open source codebases. The model supports a 200k token context window and achieves 99.7% average recall on the Needle in a Haystack evaluation at that length.
Evaluation methodology
Anthropic tested the model on industry-standard benchmarks covering reasoning, math, coding, reading comprehension, and vision, including GPQA, MMLU, HumanEval, MATH, MathVista, ChartQA, DocVQA, and BIG-Bench Hard. Human preference win rates were gathered by asking raters to chat with models and score them on task-specific criteria, using Claude 3 Opus as the 50% baseline. Refusal calibration was measured using the Wildchat and XSTest datasets. The agentic coding evaluation ran in a secure sandboxed environment without internet access, with acceptance tests not visible to the model.
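The 50% baseline follows from how a pairwise win rate behaves when a model is compared against itself. A minimal sketch, assuming each rater gives one numeric score per model on the same task and that ties count as half a win; the tie rule, the score scale, and the function name `win_rate` are illustrative assumptions, not details given in the card:

```python
from typing import Iterable, Tuple


def win_rate(pairs: Iterable[Tuple[float, float]]) -> float:
    """Fraction of comparisons won by the candidate model.

    Each pair is (candidate_score, baseline_score) for one rated task.
    Ties count as half a win (an assumption, not stated in the card).
    """
    wins = 0.0
    total = 0
    for candidate, baseline in pairs:
        total += 1
        if candidate > baseline:
            wins += 1.0
        elif candidate == baseline:
            wins += 0.5
    if total == 0:
        raise ValueError("need at least one comparison")
    return wins / total


# Made-up scores for illustration only; not data from the card.
baseline_vs_itself = [(3, 3), (4, 4), (2, 2)]
candidate_vs_baseline = [(5, 3), (4, 4), (2, 3), (5, 2)]

print(win_rate(baseline_vs_itself))     # 0.5
print(win_rate(candidate_vs_baseline))  # 0.625
```

Under this convention a model compared with itself lands at exactly 0.5, so any reported figure above 50% reads as a preference over Claude 3 Opus.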
Safety testing
Anthropic conducted evaluations in three domains: CBRN, cybersecurity, and autonomous capabilities. For each domain it defined quantitative thresholds of concern in advance; exceeding one would have triggered a council review and a potential ASL escalation. The UK AISI conducted pre-deployment testing on a near-final model and shared its results with the US AI Safety Institute under a bilateral Memorandum of Understanding. METR performed "an initial exploration of the model's autonomy-relevant capabilities." The card states that "we observed an increase in capabilities in risk-relevant areas compared to Claude 3 Opus" but that Claude 3.5 Sonnet did not exceed any preset threshold. The model does not trigger the 4x effective compute threshold that would require Anthropic's full RSP evaluation protocol, though Anthropic states that it conducts safety testing voluntarily regardless.
Mitigations
Claude 3.5 Sonnet is trained with Helpful, Honest, and Harmless (HHH) alignment. The card reports a 96.4% correct refusal rate on toxic Wildchat prompts and a 1.7% incorrect refusal rate on XSTest. The card states the model "refused ASL-3 harmful queries adequately to substantially reduce its usefulness for clearly harmful queries." Anthropic classifies the model at ASL-2 under its Responsible Scaling Policy, indicating that it does not pose a risk of catastrophic harm. No additional deployment-time classifiers or tiered access controls are described in this document.
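The two refusal figures are rates over different subsets of prompts: correct refusals are counted among harmful prompts, incorrect refusals among benign ones. A minimal sketch of that bookkeeping, assuming each evaluation record carries a harmful/benign label and a refused/answered outcome; the `Record` shape, the function names, and the sample data are hypothetical and do not come from the card or from the Wildchat and XSTest datasets:

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Record:
    harmful: bool  # hypothetical label: the prompt is toxic or harmful
    refused: bool  # hypothetical outcome: the model declined to answer


def correct_refusal_rate(records: List[Record]) -> float:
    """Share of harmful prompts that the model refused."""
    harmful = [r for r in records if r.harmful]
    if not harmful:
        raise ValueError("no harmful prompts in the sample")
    return sum(r.refused for r in harmful) / len(harmful)


def incorrect_refusal_rate(records: List[Record]) -> float:
    """Share of benign prompts that the model refused anyway."""
    benign = [r for r in records if not r.harmful]
    if not benign:
        raise ValueError("no benign prompts in the sample")
    return sum(r.refused for r in benign) / len(benign)


# Toy sample: 4 harmful prompts (3 refused), 5 benign prompts (1 refused).
sample = (
    [Record(harmful=True, refused=True)] * 3
    + [Record(harmful=True, refused=False)]
    + [Record(harmful=False, refused=True)]
    + [Record(harmful=False, refused=False)] * 4
)

print(correct_refusal_rate(sample))    # 0.75
print(incorrect_refusal_rate(sample))  # 0.2
```

A well-calibrated model pushes the first rate up and the second rate down at the same time, which is why the two numbers are reported together.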
Deployment and access
The card does not disclose licensing terms, API pricing structure, or specific access restrictions. The only deployment detail it gives is that the model is faster and cheaper to run than Claude 3 Opus.
Limitations
The card acknowledges that HHH alignment training "may cause it to hedge concerning answers," meaning capability evaluations could underestimate an equivalent helpful-only model's performance. Anthropic states "we still have a lot to learn about the science of evaluations and ideal cadences for performing these tests." No other performance limitations or failure modes are flagged by the lab in this document.
What's new
This document is an addendum to the Claude 3 Model Card and does not contain a versioned changelog. Relative to Claude 3 Opus, the key reported deltas are a 26-percentage-point gain on the internal agentic coding evaluation (64% vs. 38%), improved scores across all five vision benchmarks tested, and substantially better refusal calibration on XSTest (1.7% incorrect refusals vs. 8.3% for Opus). Human raters preferred Claude 3.5 Sonnet over Opus across all evaluated task categories, with domain-expert win rates reaching 82% in Law and 73% in both Finance and Philosophy.
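The deltas above are stated in percentage points, which differ from relative change. A short sketch of both calculations using the reported figures; the function names are illustrative, and the relative-change values are derived here by arithmetic rather than stated in the card:

```python
def point_delta(new_pct: float, old_pct: float) -> float:
    """Absolute difference in percentage points."""
    return new_pct - old_pct


def relative_change(new_pct: float, old_pct: float) -> float:
    """Change as a fraction of the old value."""
    if old_pct == 0:
        raise ValueError("old value must be non-zero")
    return (new_pct - old_pct) / old_pct


# (Claude 3.5 Sonnet, Claude 3 Opus), as reported in the text above.
agentic_coding = (64.0, 38.0)
xstest_incorrect_refusals = (1.7, 8.3)

print(round(point_delta(*agentic_coding), 1))                        # 26.0
print(round(100 * relative_change(*agentic_coding), 1))              # 68.4
print(round(point_delta(*xstest_incorrect_refusals), 1))             # -6.6
print(round(100 * relative_change(*xstest_incorrect_refusals), 1))   # -79.5
```

The same 26-point gain is therefore roughly a 68% relative improvement in problems solved, and the XSTest change is roughly an 80% relative reduction in incorrect refusals.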