Complete cost breakdown for Opus 4.7, Sonnet 4.6 & Haiku 4.5 — including prompt caching, batch API discounts, and real-world cost estimates.
All prices per million tokens (MTok). Prompt caching requires adding cache control headers to your API calls.
| Model | Input ($/MTok) | Output ($/MTok) | Cache Write ($/MTok) | Cache Read ($/MTok) | Context Window |
|---|---|---|---|---|---|
| Claude Haiku 4.5BUDGET | $0.80 | $4.00 | $1.00 | $0.08 | 200K tokens |
| Claude Sonnet 4.6BEST VALUE | $3.00 | $15.00 | $3.75 | $0.30 | 200K tokens |
| Claude Opus 4.7FRONTIER | $15.00 | $75.00 | $18.75 | $1.50 | 200K tokens |
Batch API processes requests asynchronously, typically within 24 hours. Ideal for classification, document processing, and any non-real-time workload.
| Model | Batch Input ($/MTok) | Batch Output ($/MTok) | Savings vs Standard |
|---|---|---|---|
| Claude Haiku 4.5 | $0.40 | $2.00 | 50% off |
| Claude Sonnet 4.6 | $1.50 | $7.50 | 50% off |
| Claude Opus 4.7 | $7.50 | $37.50 | 50% off |
Approximate costs for common tasks. Assumes ~200 token output for simple tasks, ~800 tokens for complex ones.
| Task | Approx Tokens | Haiku 4.5 | Sonnet 4.6 | Opus 4.7 |
|---|---|---|---|---|
| Short classification (100 in / 20 out) | 120 tokens | $0.000096 | $0.00036 | $0.0018 |
| Chat turn (300 in / 200 out) | 500 tokens | $0.0010 | $0.0039 | $0.0195 |
| Code review (2K in / 500 out) | 2,500 tokens | $0.0036 | $0.0135 | $0.0675 |
| Document summary (10K in / 1K out) | 11,000 tokens | $0.013 | $0.045 | $0.225 |
| Long analysis (50K in / 2K out) | 52,000 tokens | $0.048 | $0.180 | $0.900 |
A chatbot with a 2,000-token system prompt, 10,000 daily conversations, 300-token average user message, and 200-token average response:
| Model | Cost Without Caching/day | Cost With Caching/day | Savings/day |
|---|---|---|---|
| Haiku 4.5 | $20.80 | $3.68 | $17.12 (82%) |
| Sonnet 4.6 | $78.00 | $13.80 | $64.20 (82%) |
| Opus 4.7 | $390.00 | $69.00 | $321.00 (82%) |
| Use Case | Recommended Model | Why |
|---|---|---|
| Classification / extraction | Haiku 4.5 | Fast, cheap, sufficient quality |
| Customer support chatbot | Haiku 4.5 (cached) | Repeated system prompt = 82% cost savings |
| Production code generation | Sonnet 4.6 | Best quality/cost balance for coding |
| Complex reasoning / research | Sonnet 4.6 or Opus 4.7 | Depends on task complexity |
| Large document analysis | Sonnet 4.6 (batch) | 200K context + 50% batch discount |
| Frontier performance required | Opus 4.7 | Top benchmark scores, 5× Sonnet cost |
| Data pipeline at scale | Haiku 4.5 (batch) | $0.40/MTok — lowest available rate |
Paste any prompt to see real token counts and costs across all Claude 4 models — plus GPT-4o and Gemini for comparison.
Claude Haiku 4.5: $0.80/MTok input, $4.00/MTok output. Claude Sonnet 4.6: $3.00/MTok input, $15.00/MTok output. Claude Opus 4.7: $15.00/MTok input, $75.00/MTok output. With prompt caching, cache reads cost 10% of the standard input rate — a 90% discount for repeated context.
Claude Sonnet 4.6 ($3.00/MTok) is slightly pricier than GPT-4o ($2.50/MTok) at standard rates. However, Claude's prompt caching is generally more powerful for long-context apps — Sonnet cache reads at $0.30/MTok vs GPT-4o's cached rate. Claude Haiku 4.5 is more expensive than GPT-4o-mini without caching, but competitive with caching enabled.
Add "cache_control": {"type": "ephemeral"} to the content blocks you want cached (typically your system prompt and large document context). The cache has a 5-minute TTL — any request within the window reads from cache at 10% of the normal input price. Cache writes cost 125% of the input rate once per TTL window.
Claude 4 models (Haiku 4.5, Sonnet 4.6, Opus 4.7) are available via the API with pay-as-you-go pricing — there is no persistent free tier for API access. Claude.ai (the consumer app) offers a free tier using Claude Sonnet, but with rate limits. Claude Code includes Claude usage in the Max plan ($100/mo) or via API key.
Cheapest option: Claude Haiku 4.5 with the Batch API = $0.40/MTok input, $2.00/MTok output. Add prompt caching on top and cache reads drop to $0.04/MTok — the absolute floor. For real-time applications: Haiku 4.5 with caching at $0.80 standard / $0.08 cached input.