Anthropic shipped Claude Sonnet 5 (model id: claude-sonnet-5) on June 30, 2026 — and it is not an incremental update. It ships with a 1M-token context window (5x the 200k window of the previous Sonnet generation), adaptive thinking on by default, and agentic coding performance that Anthropic says previously required larger, more expensive models — all at Sonnet's speed and price. Here is a factual breakdown.
What Is Claude Sonnet 5 and How It Fits in the Lineup
The current Claude lineup has four tiers: Claude Fable 5 (Anthropic's most capable widely released model), Claude Opus 4.8 (complex agentic coding and enterprise work), Claude Sonnet 5 (the balance of speed and intelligence), and Claude Haiku 4.5 (fastest, near-frontier intelligence). Sonnet 5 replaces the previous Sonnet generation as the default choice for most production AI applications (chatbots, document processing, code assistants, RAG pipelines) because, according to Anthropic, it now matches reasoning quality that used to require Opus-class models, at Sonnet-level pricing.
Key Improvements Over the Previous Sonnet Generation
1M-token context window. This is the headline change. Claude Sonnet 5 ships with a 1,000,000-token context window — up from 200k tokens on the previous Sonnet generation — meaning it can hold roughly 750,000 words (about 3.4 million characters) in context at once. For RAG pipelines and document-processing workloads, this changes the calculus: far more source material can be passed directly in-context instead of relying entirely on retrieval and chunking.
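To make the sizing concrete, here is a rough sketch for checking whether a document is likely to fit in-context. It assumes the ratio implied by the figures above (about 3.4 characters per token). Real token counts depend on the tokenizer and the text, so treat the result as an estimate only. The function names and the 16,000-token output reserve are illustrative choices, not part of any API.

```python
import math

CONTEXT_WINDOW_TOKENS = 1_000_000  # Sonnet 5 context window, per the figures above
CHARS_PER_TOKEN = 3.4              # assumption: ~3.4M characters per 1M tokens


def estimate_tokens(text: str) -> int:
    """Rough token estimate from character count (not a real tokenizer)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_in_context(text: str, reserved_for_output: int = 16_000) -> bool:
    """Return True if the estimated prompt size still leaves room for the reply."""
    return estimate_tokens(text) + reserved_for_output <= CONTEXT_WINDOW_TOKENS
```

A check like this is only a pre-filter: anything close to the limit should be measured with a real token count before it is sent.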
Adaptive thinking. Sonnet 5 uses adaptive thinking by default (thinking: {type: "adaptive"}) instead of the older manual extended-thinking mode with a fixed budget_tokens. Claude itself decides whether and how much to reason before responding, based on the complexity of the request, and you can steer that with an effort parameter (low, medium, high, xhigh, max) rather than guessing a token budget. Manual thinking: {type: "enabled", budget_tokens: N} is rejected on Sonnet 5 with a 400 error.
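As a minimal sketch of how a request with an explicit effort level might be assembled, the helper below builds the keyword arguments as a plain dictionary and validates the level before anything is sent. The helper name build_request is hypothetical, and placing effort under an output_config field is an assumption on our part: this article does not say where the parameter lives, so check the API reference before relying on it.

```python
from typing import Optional

# Effort levels named in the text above
VALID_EFFORT_LEVELS = ("low", "medium", "high", "xhigh", "max")


def build_request(prompt: str, effort: Optional[str] = None, max_tokens: int = 16000) -> dict:
    """Build keyword arguments intended for client.messages.create(**kwargs)."""
    if effort is not None and effort not in VALID_EFFORT_LEVELS:
        raise ValueError(f"unknown effort level: {effort!r}")
    request = {
        "model": "claude-sonnet-5",
        "max_tokens": max_tokens,
        "thinking": {"type": "adaptive"},
        "messages": [{"role": "user", "content": prompt}],
    }
    if effort is not None:
        # ASSUMPTION: effort is nested under "output_config"; verify the exact
        # field name and location against the API reference.
        request["output_config"] = {"effort": effort}
    return request
```

Omitting effort leaves the decision entirely to the model, which matches the default behaviour described above.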
Coding and agentic workflows. This is where Anthropic is positioning Sonnet 5 most aggressively — as its most agentic Sonnet yet. It handles sustained, multi-step coding tasks (multi-file refactors, debugging across a real codebase, tool-use chains) with less need for human intervention mid-task, and reasons through complex, messy technical contexts more reliably than the previous generation.
Browser and computer use. Sonnet 5 builds on Sonnet's earlier lead in computer-use capabilities, handling browser-based workflows (competitive research, procurement, customer onboarding) with better accuracy and reliability.
API Usage: What Changes
The model identifier is claude-sonnet-5. If you are calling the Anthropic API directly:
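import anthropic

# Reads the API key from the ANTHROPIC_API_KEY environment variable
client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-sonnet-5",
    max_tokens=16000,
    thinking={"type": "adaptive"},  # adaptive thinking instead of a fixed budget
    messages=[{"role": "user", "content": "Your prompt here"}],
)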
If you were using manual budget_tokens on a previous Sonnet model, that parameter is rejected on Sonnet 5 — switch to thinking: {"type": "adaptive"} and use the effort parameter to control reasoning depth instead.
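For codebases with many call sites, the switch can be sketched as a small helper that rewrites existing request arguments. This is an illustration only: migrate_thinking_config is a hypothetical name, and it assumes your requests are held as plain dictionaries before being passed to the client.

```python
import copy


def migrate_thinking_config(old_kwargs: dict) -> dict:
    """Return a copy of request kwargs with manual thinking replaced by adaptive."""
    new_kwargs = copy.deepcopy(old_kwargs)  # leave the caller's dict untouched
    thinking = new_kwargs.get("thinking")
    if isinstance(thinking, dict) and thinking.get("type") == "enabled":
        # Drops budget_tokens along with the manual mode
        new_kwargs["thinking"] = {"type": "adaptive"}
    new_kwargs["model"] = "claude-sonnet-5"
    return new_kwargs
```

Requests that never set thinking are left without the key, since adaptive thinking is described above as the default.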
Pricing
Claude Sonnet 5 launched at an introductory price of $2 per million input tokens and $10 per million output tokens through August 31, 2026, moving to standard pricing of $3 per million input tokens and $15 per million output tokens after that, the same standard rate as the previous Sonnet generation. Prompt caching can cut costs by up to 90%, and batch processing by up to 50%. US-only inference (for data residency requirements) is available at 1.1x the standard pricing.
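The arithmetic is easy to sketch. The estimator below uses only the rates quoted above and makes several simplifying assumptions: the batch discount is taken at its full 50%, the 1.1x US-only multiplier is applied to whichever rate is in effect, the two adjustments are assumed to stack, and prompt caching is not modelled. The function name and signature are illustrative.

```python
from datetime import date

INTRO_END = date(2026, 8, 31)    # last day of introductory pricing
INTRO_PRICES = (2.00, 10.00)     # USD per million input / output tokens
STANDARD_PRICES = (3.00, 15.00)


def estimate_cost_usd(input_tokens: int, output_tokens: int, on: date,
                      batch: bool = False, us_only: bool = False) -> float:
    """Estimate the cost of a workload on a given date, in US dollars."""
    in_price, out_price = INTRO_PRICES if on <= INTRO_END else STANDARD_PRICES
    cost = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
    if batch:
        cost *= 0.5  # assumption: the full 50% batch discount applies
    if us_only:
        cost *= 1.1  # assumption: multiplier applies to the rate in effect
    return cost
```

For example, one million input tokens plus 100,000 output tokens comes to $3.00 during the introductory period and $4.50 at standard rates.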
How We Are Using It at DIGIT
We migrated our RAG pipelines and document-processing workflows to Sonnet 5 on launch day. The 1M-token context window is the immediate practical win — for mid-sized knowledge bases, we can now pass significantly more retrieved context directly into a single call before needing more aggressive chunking or re-ranking. For our coding-assistant integrations, Sonnet 5's adaptive thinking means we no longer have to hand-tune a budget_tokens value per task type — the model allocates reasoning depth on its own, and the effort parameter gives us a simple lever when we want to bias it explicitly.
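A simplified illustration of the idea, not our production pipeline: given retrieved chunks already sorted by relevance, keep adding them until an estimated token budget is reached. The pack_chunks helper is hypothetical, and the 3.4 characters-per-token ratio is a rough assumption taken from the context-window figures earlier in this article.

```python
import math
from typing import List


def pack_chunks(ranked_chunks: List[str], token_budget: int,
                chars_per_token: float = 3.4) -> List[str]:
    """Greedily keep chunks, in rank order, until the estimated budget is used up."""
    selected: List[str] = []
    used = 0
    for chunk in ranked_chunks:
        cost = math.ceil(len(chunk) / chars_per_token)  # rough estimate, not a tokenizer
        if used + cost > token_budget:
            break  # stop at the first chunk that does not fit, preserving rank order
        selected.append(chunk)
        used += cost
    return selected
```

With a larger window the budget simply grows, so more of the ranked list survives before any re-ranking or tighter chunking is needed.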
Should You Upgrade?
If you are running Claude Sonnet 4.5 or 4.6 in production: yes, upgrade — the 1M-token context window alone materially changes what's possible for RAG and document-heavy workloads, and adaptive thinking removes a class of manual tuning. If you are on an older or larger model because Sonnet previously wasn't good enough for a task: test Sonnet 5 first, since Anthropic is explicitly positioning it as matching performance that used to require larger models.
If you want help integrating Claude Sonnet 5 into a product, building a RAG pipeline that takes advantage of the 1M-token context window, or migrating an existing LLM workflow, reach out at info@digit.com.pk.