Model Card Explorer

Summary

Gemini 3 Pro Model Card

A 622-word brief of a 2,179-word document. Published by Google DeepMind. Version dated Mar 31, 2026.

What this is

Gemini 3 Pro is Google DeepMind's most advanced model for complex tasks, released November 2025, superseding Gemini 2.5 Pro. It is a sparse mixture-of-experts (MoE) transformer with native multimodal support for text, audio, images, video, and entire code repositories. It is not a modification or fine-tune of a prior model. An optional Deep Think mode is available at inference time to enhance complex problem-solving performance.

Capabilities

The model accepts text, images, audio, and video inputs with a 1M token context window and produces up to 64K tokens of text output. It is designed for agentic performance, advanced coding, long-context understanding, algorithmic development, and multilingual tasks. Benchmark tables are referenced in the card but numeric scores are not reproduced in the source text; the lab states Gemini 3 Pro "significantly outperforms Gemini 2.5 Pro across a range of benchmarks requiring enhanced reasoning and multimodal capabilities."

Evaluation methodology

The model was evaluated across reasoning, multimodal, agentic tool use, multilingual, and long-context benchmarks, with full methodology published separately at deepmind.com/models/evals-methodology/gemini-3-pro. Safety evaluations combined continuous automated pipelines during and after training with human red teaming conducted by specialist teams external to the model development team. The lab notes that updated evaluation methods mean results are not directly comparable to scores reported in previous Gemini model cards.

Safety testing

Testing included human red teaming by external specialist teams, automated red teaming at scale, continuous training-phase evaluations, and pre-release Ethics and Safety Reviews aligned with Google's AI Principles and Frontier Safety Framework (FSF, September 2025). FSF evaluations found no critical capability levels (CCLs) reached across CBRN, cybersecurity, harmful manipulation, ML R&D, and misalignment domains. On cybersecurity, the model solved 11/12 v1 hard challenges but 0/13 v2 challenges end-to-end; the alert threshold was met but the CCL was not reached. On CBRN, the model "provides accurate and occasionally actionable information but generally fails to offer novel or sufficiently complete and detailed instructions to significantly enhance the capabilities of low to medium resourced threat actors." Deep Think mode evaluations yielded results consistent with the base model assessment across both safety and FSF domains.

Mitigations

Mitigations span the full lifecycle: dataset filtering, conditional pre-training, supervised fine-tuning, reinforcement learning from human and critic feedback, safety policies and desiderata, and product-level safety filtering. Automated safety evaluations show a -10.4% regression in text-to-text safety versus Gemini 2.5 Pro, though manual review confirmed losses were "overwhelmingly either a) false positives or b) not egregious." Child safety evaluations satisfied required launch thresholds developed by expert teams. The main acknowledged residual risks are jailbreak vulnerability ("improved compared to Gemini 2.5 Pro but still an open research problem") and possible degradation in multi-turn conversations.

Deployment and access

Gemini 3 Pro is distributed via the Gemini App, Google Cloud/Vertex AI, Google AI Studio, the Gemini API, Google AI Mode, and Google Antigravity. No specific hardware or software is required to use the model. Use is governed by the Gemini API Additional Terms of Service, the Google Cloud Platform Terms of Service, and Google's Generative AI Prohibited Use Policy, which bars integration into systems engaging in dangerous, illicit, harmful, or misinformation activities.

Limitations

The lab flags hallucinations and occasional slowness or timeout issues as known limitations of the model. The knowledge cutoff date is January 2025. Jailbreak vulnerability and possible degradation in multi-turn conversations are identified as the main open risks.

What's new

Gemini 3 Pro (released November 2025, model card updated December 2025) supersedes Gemini 2.5 Pro and is not a modification or fine-tune of a prior model. The release introduces Deep Think mode, an optional inference-time setting for enhanced complex problem-solving. Compared to Gemini 2.5 Pro, the scope of red teaming was expanded to cover more potential issues outside strict policies, and this card provides more detail on training dataset composition and distribution than prior cards in the Gemini series.

Benchmark	Category	State	Score	Setup	Source
key skills benchmark/ v1_hard	other	scored	11.0	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
Misalignment/ situational_awareness	other	scored	3.0	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
Misalignment/ stealth	other	scored	1.0	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
	other	scored	0.2	Averagemissing: shot countmissing: methodmissing: training state	self-reported
key skills benchmark/ v2	other	scored	0.0	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
	other	mentioned	—	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported

Gemini 3 Pro Model Card

Gemini 3 Pro Model Card

What this is

Capabilities

Evaluation methodology

Safety testing

Mitigations

Deployment and access

Limitations

What's new

Extracted Evaluations(6 results)