Complete Anthropic vs OpenAI API cost comparison — per token, per call, per month, with caching (May 2025)
| Claude Model | Input | Output | Cache Read | Context | vs GPT Equivalent |
|---|---|---|---|---|---|
| Claude Haiku 3.5 | $0.80/M | $4.00/M | $0.08/M | 200k | GPT-4o-mini ($0.15/M in, $0.60/M out) |
| Claude Sonnet 3.5 | $3.00/M | $15.00/M | $0.30/M | 200k | GPT-4o ($5.00/M in, $15.00/M out) |
| Claude Opus 3.7 | $15.00/M | $75.00/M | $1.50/M | 200k | GPT-4-turbo ($10.00/M in, $30.00/M out) |
| GPT Model | Input | Output | Cache Read | Context | vs Claude Equivalent |
|---|---|---|---|---|---|
| GPT-4o-mini | $0.15/M | $0.60/M | $0.075/M (auto) | 128k | Claude Haiku 3.5 ($0.80/M in, $4.00/M out) |
| GPT-4o | $5.00/M | $15.00/M | $2.50/M (auto) | 128k | Claude Sonnet 3.5 ($3.00/M in, $15.00/M out) |
| GPT-4-turbo | $10.00/M | $30.00/M | N/A | 128k | Claude Opus 3.7 ($15.00/M in, $75.00/M out) |
| Use Case | Winner | Why |
|---|---|---|
| High-volume simple tasks (classification, extraction) | GPT-4o-mini | Cheapest input at $0.15/M; 5× cheaper than Claude Haiku |
| Input-heavy workloads (RAG, doc analysis) | Claude Sonnet | $3/M input vs $5/M for GPT-4o — 40% cheaper on input |
| Multi-turn agents with large system prompts | Claude (any) | 90% cache read discount vs 50% for OpenAI auto-cache |
| Long document processing (>128k tokens) | Claude Sonnet/Opus | 200k context vs GPT-4o's 128k — fits longer docs without chunking |
| Vision/image understanding | Comparable | Both GPT-4o and Claude Sonnet handle images well; GPT-4o has more ecosystem integrations |
| Coding and reasoning tasks | Claude Sonnet/Opus | Leads most coding benchmarks (SWE-bench, HumanEval) at cheaper price than GPT-4o |
| Cheapest flagship model | Claude Sonnet | $3/M input vs $5/M for GPT-4o; equivalent output quality tier |
Stop estimating — paste your real prompt and see the exact per-call and monthly cost across Claude and GPT models simultaneously.
Open the LLM Cost Calculator →Claude Sonnet 3.5 is cheaper on input: $3/M vs $5/M for GPT-4o (40% savings). Output cost is the same at $15/M. For low-complexity tasks, GPT-4o-mini ($0.15/M input) is cheaper than Claude Haiku ($0.80/M) at standard rates, though Claude's prompt caching can close this gap for agent workloads.
Yes, for workloads with large, repeated context. Claude's cache reads cost only 0.1× the input rate (90% off) vs OpenAI's automatic 50% discount. For a 10,000-token system prompt reused across 1,000 calls: Claude cache cost = $0.30 vs OpenAI = $25.00 — an 80× difference.
Claude Sonnet 3.5 offers better price-quality for most use cases: 40% cheaper input, comparable output quality, 200k context vs 128k, and deeper caching discounts. GPT-4o has advantages in ecosystem integrations, plugin ecosystem, and some vision workloads.
GPT-4o-mini wins on raw price ($0.15/M input vs $0.80/M for Haiku). Choose Claude Haiku when: (1) you have large repeated system prompts (cache reads at $0.08/M become cheaper), (2) you need the full 200k context, or (3) you prefer explicit caching control for predictable billing.
Also see: GPT-4o vs Claude Side-by-Side · OpenAI API Cost Calculator · Gemini API Pricing · LLM Cost Comparison 2026 · Claude Haiku Pricing