Claude Sonnet Pricing 2026

Claude Sonnet 4.6: $3.00/MTok input — with 90% prompt caching discount (cache reads at $0.30/MTok). The flagship model for production AI apps. Complete pricing, comparison, and savings calculator.

Claude Sonnet 4.6 — Full Pricing Breakdown

Token Type Price (per 1M tokens) Notes
Input (standard) $3.00 Your prompt + context
Output $15.00 Model-generated response
Cache write $3.75 1.25× input price, one-time per TTL window
Cache read 90% off $0.30 10% of standard input price — the key saving

Claude Sonnet vs Flagship Model Alternatives

Model Input Output Cache Read Context
Gemini 2.0 Flash $0.10 $0.40 $0.025 1M tokens
GPT-4o-mini $0.15 $0.60 $0.075 128k tokens
GPT-4o $2.50 $10.00 $1.25 (50% off) 128k tokens
Claude Sonnet 4.6 Best cache savings $3.00 $15.00 $0.30 (90% off) 200k tokens
Gemini 1.5 Pro $3.50 $10.50 2M tokens
Claude Opus 4.7 $15.00 $75.00 $1.50 (90% off) 200k tokens

When Claude Sonnet Beats GPT-4o on Total Cost

Claude Sonnet has a slightly higher sticker price than GPT-4o — but its 90% caching discount (vs GPT-4o's 50%) flips the equation for production apps with repeated context.

Example: A coding assistant with a 5,000-token system prompt, handling 30,000 daily queries (average 1,000 tokens user input, 500 tokens output):

Model System Prompt Cost/day (cached) User input + output/day Monthly Total
GPT-4o (50% cache) $18.75 (cache read) $225 input + $150 output ~$11,800
Claude Sonnet 4.6 (90% cache) Winner $4.50 (cache read) $90 input + $225 output ~$9,600

The 90% caching discount saves ~$2,200/month vs GPT-4o for this scenario — despite Sonnet's slightly higher standard input rate.

Best Use Cases for Claude Sonnet 4.6

Production AI applications

Claude Sonnet is Anthropic's workhorse model — optimized for reliability, quality, and cost. The default choice for any production workload where Haiku underperforms and Opus's cost isn't justified. Most developers start with Sonnet and never need to upgrade.

Complex code generation

Sonnet consistently outperforms GPT-4o on SWE-bench coding benchmarks. Strong at multi-file refactors, API integration, test generation, and debugging. Supports 200k context for large codebase tasks. Pairs with prompt caching for repeated code scaffolding.

RAG and document analysis

200k context + 90% cache discount makes Sonnet ideal for RAG applications. Cache your document corpus once, then answer thousands of questions at $0.30/MTok instead of $3.00/MTok. Sonnet excels at synthesizing information across long documents.

Multi-step agents

Sonnet's instruction-following fidelity makes it reliable for tool-use agent loops. Better than Haiku at resolving ambiguous instructions and handling edge cases. Lower hallucination rate than GPT-4o on structured task completion. Supports Claude's computer-use tools.

Claude Sonnet Cost Calculator — Quick Reference

Scenario Input Tokens Output Tokens Cost/Call (standard) Cost/Call (cached input)
Code review 1,500 600 $0.0135 $0.0045 (cached sys prompt)
Document Q&A 4,000 400 $0.018 $0.0072
Agent tool call 2,000 200 $0.009 $0.003
Long-form writing 500 2,000 $0.0315 $0.0315

Calculate Your Actual Claude Sonnet Cost

Paste your real prompt and see exact token costs across Claude Sonnet, Opus, Haiku, GPT-4o, and Gemini — with cache savings and monthly volume projections.

Open the LLM Pricing Calculator →

Frequently Asked Questions

How much does Claude Sonnet cost per token?

Claude Sonnet 4.6 costs $3.00 per million input tokens and $15.00 per million output tokens. That's $0.003 per 1,000 input tokens and $0.015 per 1,000 output tokens. With prompt caching enabled, cache reads cost $0.30/MTok — a 90% discount that makes Sonnet highly competitive for repeated-context production workloads.

Is Claude Sonnet the same as Claude 3.5 Sonnet?

Claude Sonnet 4.6 is the current generation, succeeding the Claude 3.5 Sonnet series. The 4.x generation offers improved reasoning, better instruction following, and enhanced computer-use capabilities. Pricing is similar to the 3.5 generation: $3.00/MTok input, $15.00/MTok output. If you're migrating from claude-3-5-sonnet, update your model ID to claude-sonnet-4-6 to access the latest capabilities.

Should I use Claude Sonnet or Claude Haiku?

Start with Claude Haiku ($0.80/MTok) for high-volume tasks where speed and cost matter most. Upgrade to Sonnet ($3.00/MTok) when: output quality isn't meeting requirements, tasks require complex reasoning or nuanced instruction following, or you need better code generation. Many production apps use Haiku for 80% of traffic and Sonnet for complex or high-stakes queries — a routing pattern that cuts costs significantly.

Does Claude Sonnet support extended thinking?

Yes, Claude Sonnet 4.6 supports extended thinking mode — where the model reasons step-by-step before generating its response. Extended thinking tokens are billed as output tokens ($15.00/MTok), so they add cost but improve accuracy on complex reasoning tasks like math, logic puzzles, and multi-step planning. For most production apps, standard mode is sufficient and more cost-effective.

What is Claude Sonnet's context window?

Claude Sonnet 4.6 has a 200,000-token context window — about 150,000 words or roughly 600 pages of text. This is significantly larger than GPT-4o's 128k context. Combined with prompt caching (cache your document corpus at 90% discount), Sonnet handles large-context RAG applications, codebase analysis, and long-document Q&A efficiently.

Also see: Claude Haiku Pricing · Claude Opus Pricing · GPT-4o vs Claude Cost · Gemini API Pricing · LLM Cost Comparison 2026