Gemini 2.0 Flash Model Card
What this is
Gemini 2.0 Flash is a multimodal language model released by Google DeepMind, with its model card published April 15, 2025. It is a member of the Gemini 2.0 series, designed to power agentic systems, and improves upon Gemini 1.5 Flash with enhanced quality at comparable speeds. It is positioned as an upgrade path for Gemini 1.5 Flash users seeking better quality and for Gemini 1.5 Pro users who require lower latency.
Capabilities
Gemini 2.0 Flash accepts text, image, audio, and video inputs within a 1,048,576-token context window and produces text outputs of up to 8,192 tokens; image output is experimental as of the card's publication date. It scores 77.6% on MMLU-Pro, 90.9% on MATH, 60.1% on GPQA Diamond, 71.7% on MMMU, and 29.9% on SimpleQA, outperforming Gemini 1.5 Pro on most of these benchmarks. The model also supports a Multimodal Live API that enables low-latency bidirectional voice and video interaction, and it shows improvements in coding, complex instruction following, and function calling.
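To make the two token limits concrete, here is a minimal Python sketch of a pre-flight budget check. The limits are the figures quoted above (1,048,576 is 2^20). Everything else is hypothetical and not from the card: the names `BudgetCheck` and `check_budget`, and the assumption that the context window bounds the input independently of the output cap. Token counts are taken as given; the sketch does not tokenize anything.

```python
from dataclasses import dataclass

# Limits quoted in the card summary.
MAX_INPUT_TOKENS = 1_048_576   # context window (2**20 tokens)
MAX_OUTPUT_TOKENS = 8_192      # maximum text output


@dataclass(frozen=True)
class BudgetCheck:
    """Result of a hypothetical pre-flight check (not part of any Google API)."""
    fits: bool            # True if the prompt fits in the context window
    input_headroom: int   # tokens left in the window (negative if over)
    capped_output: int    # requested output, clamped to the output limit


def check_budget(prompt_tokens: int, requested_output_tokens: int) -> BudgetCheck:
    """Compare caller-supplied token counts against the quoted limits.

    Assumption: the input window and the output cap are checked separately.
    """
    if prompt_tokens < 0 or requested_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    headroom = MAX_INPUT_TOKENS - prompt_tokens
    return BudgetCheck(
        fits=headroom >= 0,
        input_headroom=headroom,
        capped_output=min(requested_output_tokens, MAX_OUTPUT_TOKENS),
    )


result = check_budget(prompt_tokens=1_000_000, requested_output_tokens=10_000)
```

In this example the prompt fits with 48,576 tokens of headroom, and the 10,000-token output request is clamped to 8,192.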
Evaluation methodology
Gemini 2.0 Flash was evaluated against a suite of public performance benchmarks, with results compared directly to Gemini 1.5 Flash, Gemini 1.5 Pro, and Gemini 2.0 Flash-Lite. Internal safety evaluations run during training report scores as absolute percentage change relative to Gemini 1.5 Pro 002: for violation metrics, a negative change means fewer violations, whereas for tone, a positive change means improvement. Assurance evaluations use held-out prompt sets, kept separate from the model team, to prevent overfitting and preserve their value for release decision-making.
Safety testing
Safety evaluation included human red teaming by specialist teams, automated red teaming at scale, assurance evaluations conducted by teams outside the model development group, and evaluations under Google DeepMind's Frontier Safety Framework (FSF). Google DeepMind's Responsibility and Safety Council (RSC) reviewed ethics and safety assessments and made release decisions. Automated safety results relative to Gemini 1.5 Pro 002 show text-to-text safety at -1.0% (fewer violations), multilingual safety at -1.0%, and image-to-text safety at +1.50%, a small regression in that modality, though overall violation rates remained low. The card does not report specific CBRN, cyber, or autonomy-risk evaluation results.
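Because the sign convention differs by metric (lower is better for violation rates, higher is better for tone), the reported deltas are easy to misread. The following Python sketch applies the convention to the three figures quoted above. The function name `interpret` and its `higher_is_better` flag are illustrative assumptions, not terminology from the card, and no tone value is included because none is quoted here.

```python
# Deltas versus Gemini 1.5 Pro 002, as quoted in this summary (violation metrics).
REPORTED_DELTAS = {
    "text-to-text safety": -1.0,
    "multilingual safety": -1.0,
    "image-to-text safety": 1.50,
}


def interpret(delta_pct: float, higher_is_better: bool = False) -> str:
    """Classify a delta under the card's sign convention.

    For violation metrics (the default), a negative change is an improvement.
    For metrics such as tone, pass higher_is_better=True.
    """
    if delta_pct == 0:
        return "unchanged"
    improved = (delta_pct > 0) == higher_is_better
    return "improvement" if improved else "regression"


summary = {name: interpret(delta) for name, delta in REPORTED_DELTAS.items()}

for name, verdict in summary.items():
    print(f"{name}: {REPORTED_DELTAS[name]:+.2f}% -> {verdict}")
```

Under this reading, the two text metrics count as improvements and image-to-text as a regression, matching the interpretation given in the paragraph above.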
Mitigations
Safety and responsibility mitigations were applied across the full training and deployment lifecycle. These include dataset filtering, conditional pre-training, supervised fine-tuning, reinforcement learning from human and critic feedback, safety policies and desiderata, and product-level safety filtering. The Gemini 2.0 family displays lower violation rates across most modalities than Gemini 1.5 Pro, which was itself described as a significant improvement over Gemini 1.0.
Deployment and access
Gemini 2.0 Flash is generally available (GA) as of the card's publication date. It is accessible via Google's Gemini API and is intended for real-time streaming and daily task use cases. The card does not specify a license type or explicit access restrictions beyond Google's standard content policies.
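As a rough illustration of access through the Gemini API, the sketch below builds (but does not send) a text-generation request using only the Python standard library. None of the request details come from the card: the endpoint path, the `x-goog-api-key` header, the `gemini-2.0-flash` model identifier, and the JSON field names are assumptions based on Google's public Gemini API documentation and should be checked against the current reference before use. The helper name `build_request` is hypothetical.

```python
import json
import urllib.request

# Assumed values; verify against the current Gemini API reference.
API_ROOT = "https://generativelanguage.googleapis.com/v1beta"
MODEL_ID = "gemini-2.0-flash"


def build_request(prompt: str, api_key: str,
                  max_output_tokens: int = 8192) -> urllib.request.Request:
    """Construct a generateContent POST request without sending it."""
    body = {
        "contents": [{"parts": [{"text": prompt}]}],
        # 8,192 is the output limit quoted in the card summary.
        "generationConfig": {"maxOutputTokens": max_output_tokens},
    }
    return urllib.request.Request(
        url=f"{API_ROOT}/models/{MODEL_ID}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


# "YOUR_API_KEY" is a placeholder, not a real credential.
request = build_request("Summarize this model card.", api_key="YOUR_API_KEY")
```

Sending the request would be a matter of passing it to `urllib.request.urlopen`; that step is left out so the sketch runs without network access or credentials.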
Limitations
The card flags hallucinations, limited causal understanding, and difficulty with complex logical deduction and counterfactual reasoning as known general limitations of the model. The knowledge cutoff date is June 2024. The main identified safety limitations are over-refusals, where the model refuses to answer benign prompts, and a refusal tone that can still come across as "preachy," though tone has improved relative to Gemini 1.5.
What's new
Gemini 2.0 Flash introduces a refined architectural design and novel optimization methods on top of the sparse Mixture-of-Experts Transformer used in Gemini 1.5, yielding improvements in training stability and computational efficiency. The Multimodal Live API, which enables low-latency bidirectional voice and video interaction, is new to this generation. Experimental image output, which was not available in Gemini 1.5 Flash, is also introduced.