Claude's Character

constitution · 1,731 words · 8 min read · May 3, 2026 · Source
Summary
1,731-word document condensed to 71 words. Anthropic · May 3, 2026
TL;DR

Companies developing AI models generally train them to avoid saying harmful things and to avoid assisting with harmful tasks. The goal of this is to train models to behave in ways that are "harmless".

Capability claim
  • We trained these traits into Claude using a "character" variant of our [Constitutional AI](https://arxiv.org/abs/2212.08073) training.
Deployment scope
  • available to us. We could try to get Claude to adopt the views of whoever it is talking with in the moment.

Every italicized passage is a verbatim substring of the source document (checked deterministically after extraction). Field selection is heuristic — some quotes may lack surrounding context and some claims may be absent if no matching pattern appeared. For citation, open the source: original constitution · source SHA dbd50d107e11 · version dated May 3, 2026.