Model Card Explorer

Summary

Mistral Large 2 Release

A 499-word brief of a 979-word document. Published by Mistral AI. Version dated Apr 17, 2026.

What this is

Mistral Large 2 is Mistral AI's second-generation flagship language model, announced July 24, 2024. It supersedes Mistral Large and carries 123 billion parameters, designed for single-node inference with long-context applications. Its stated purpose is to advance cost-efficient performance in code generation, mathematics, reasoning, and multilingual use cases.

Capabilities

The pretrained model achieves 84.0% accuracy on MMLU, which Mistral claims sets a new point on the performance/cost Pareto front among open models. It supports a 128k context window, dozens of natural languages including French, German, Arabic, Chinese, Japanese, and Korean, and 80+ coding languages including Python, Java, C++, and Bash. On code and math benchmarks (MultiPL-E, GSM8K 8-shot, MATH 0-shot no CoT), Mistral reports performance on par with GPT-4o, Claude 3 Opus, and Llama 3 405B. The model also supports parallel and sequential function calling for use in complex agentic pipelines.

Evaluation methodology

Mistral evaluated the model across general, code, math, alignment, and multilingual benchmarks using a shared internal evaluation pipeline; comparisons against external models were run through the same pipeline except where the source notes a "paper" row in MultiPL-E results. Alignment was measured on MT-Bench, Wild Bench, and Arena Hard. Math reasoning was assessed on GSM8K (8-shot) and MATH (0-shot, no chain-of-thought). Multilingual performance was measured on multilingual MMLU against the base pretrained model. No details on contamination controls or external auditors are disclosed.

Safety testing

The card does not discuss red-teaming, catastrophic-risk evaluations, or CBRN/cyber/autonomy assessments. The document's only safety-adjacent claim is that training was designed to reduce hallucinations and that the model was fine-tuned to acknowledge when it lacks sufficient information.

Mitigations

Mistral states the model was fine-tuned to be "more cautious and discerning in its responses" to reduce hallucination. The model is trained to surface uncertainty rather than generate confident but incorrect outputs. No classifier thresholds, content filters, ASL/FSF tiers, or refusal-training details are disclosed.

Deployment and access

Mistral Large 2 is available immediately on la Plateforme under the API identifier mistral-large-2407 and on the le Chat interface. Instruct-model weights are available for download and hosted on HuggingFace. The model is released under the Mistral Research License for non-commercial use; commercial self-deployment requires a separate Mistral Commercial License. Cloud access is available through Google Cloud Vertex AI (Managed API), Azure AI Studio, Amazon Bedrock, and IBM watsonx.ai. Fine-tuning on la Plateforme is extended to Mistral Large, Mistral Nemo, and Codestral starting on the release date.

Limitations

The card does not explicitly enumerate unresolved limitations or failure modes. The only acknowledged gap is a prior tendency to hallucinate, which training partially addressed; whether the mitigation is complete is not quantified.

What's new

Mistral Large 2 is versioned 24.07 under Mistral's YY.MM scheme, replacing the prior Mistral Large. Key deltas over its predecessor include substantially higher code benchmark scores, improved multilingual MMLU results, enhanced instruction-following in long multi-turn conversations, and added parallel and sequential function-calling support. Mistral also announces a platform consolidation to two general-purpose models (Mistral Nemo and Mistral Large) and two specialist models (Codestral and Embed), with older Apache models remaining available for self-hosted deployment and fine-tuning.

Benchmark	Category	State	Score	Setup	Source
	knowledge	scored	84.0% accuracy	pretrainedmissing: shot countmissing: methodmissing: language	self-reported
/ multilingual	knowledge	mentioned	— accuracy	Averagepretrainedmissing: shot countmissing: method	self-reported
	math	mentioned	— accuracy	8-shotmissing: methodmissing: languagemissing: training state	self-reported
	math	mentioned	— accuracy	0-shotmissing: methodmissing: languagemissing: training state	self-reported
	other	mentioned	—	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
	other	mentioned	— accuracy	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
	other	mentioned	—	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported
	other	mentioned	—	missing: shot countmissing: methodmissing: languagemissing: training state	self-reported

Mistral Large 2 Release

Mistral Large 2 Release

What this is

Capabilities

Evaluation methodology

Safety testing

Mitigations

Deployment and access

Limitations

What's new

Extracted Evaluations(8 results)