Side-by-side API pricing per token — GPT-4o, GPT-4o-mini, Claude Sonnet, Claude Haiku, Claude Opus (May 2025)
| Model | Provider | Input (per 1M) | Output (per 1M) | Cache Read |
|---|---|---|---|---|
| GPT-4o | OpenAI | $5.00 | $15.00 | $2.50 (auto) |
| Claude Sonnet 3.5 40% cheaper input | Anthropic | $3.00 | $15.00 | $0.30 (explicit) |
| GPT-4o-mini | OpenAI | $0.15 | $0.60 | $0.075 (auto) |
| Claude Haiku 3.5 | Anthropic | $0.80 | $4.00 | $0.08 (explicit) |
| GPT-4o (128k context) | OpenAI | $5.00 | $15.00 | N/A |
| Claude Opus 3.7 Most capable | Anthropic | $15.00 | $75.00 | $1.50 (explicit) |
Claude Sonnet 3.5 beats GPT-4o on input cost by 40% ($3 vs $5/M tokens). For input-heavy apps like RAG or document analysis, Sonnet saves money at equivalent quality.
GPT-4o-mini wins on sticker price: $0.15/M input vs Claude Haiku's $0.80/M. For simple classification or extraction tasks with no repeated context, GPT-4o-mini is the most affordable.
Claude's explicit prompt caching (90% off cache reads) beats OpenAI's automatic 50% cache discount when you have large, repeated system prompts — e.g. multi-turn agents, RAG with static context.
Assume: 500 input tokens + 200 output tokens per call = 5M input + 2M output tokens/day.
| Model | Daily Cost | Monthly Cost | Annual Cost |
|---|---|---|---|
| GPT-4o | $55.00 | $1,650 | $19,800 |
| Claude Sonnet 3.5 | $45.00 | $1,350 | $16,200 |
| GPT-4o-mini | $1.95 | $58.50 | $702 |
| Claude Haiku 3.5 | $12.00 | $360 | $4,320 |
Paste your real prompt text and see the exact cost across all models — GPT-4o, Claude, and Gemini — with monthly projections.
Open the LLM Cost Calculator →Claude Sonnet 3.5 is cheaper on input tokens ($3/M vs $5/M for GPT-4o). Output prices are the same ($15/M). For input-heavy workloads, Claude Sonnet saves ~40% on input costs. For simple tasks at scale, GPT-4o-mini ($0.15/M input) is significantly cheaper than Claude Haiku ($0.80/M input).
OpenAI applies automatic 50% caching for repeated context. Anthropic requires explicit cache control markers but gives a larger 90% discount on cache reads (0.1× the input price). For agents and chatbots reusing large system prompts, Claude's explicit caching is more cost-effective when you can control the cache keys.
For maximum cost efficiency: GPT-4o-mini or Claude Haiku with caching enabled. For production apps where quality matters: Claude Sonnet 3.5 (cheaper input than GPT-4o, same output quality tier). For long documents with repeated context: Claude with explicit prompt caching dramatically reduces costs.
This page reflects May 2025 pricing from Anthropic and OpenAI public pricing pages. Use the interactive calculator at prompt-pricing.vercel.app for the most current figures — it shows real-time costs with your own prompt text.
Also see: OpenAI API Cost Calculator · Claude vs GPT Pricing Deep Dive · Gemini API Pricing · LLM Cost Comparison 2026 · Claude Haiku Pricing