LLM API Pricing Comparison 2026

Compare Claude, GPT-4o, Gemini, DeepSeek, and Mistral — input/output costs, cached rates, context windows, and best-value picks for every use case.

Full Model Pricing Table

All prices per million tokens (MTok). Cached rates apply after the first request within the cache TTL window.

Model Provider Input ($/MTok) Output ($/MTok) Cached Input Context
Gemini 2.0 Flash Google $0.10 $0.40 $0.025 1M tokens
GPT-4o-mini OpenAI $0.15 $0.60 $0.075 128K tokens
DeepSeek-V3 DeepSeek $0.27 $1.10 $0.07 128K tokens
Mistral Medium Mistral $0.40 $2.00 128K tokens
Claude Haiku 4.5 Anthropic $0.80 $4.00 $0.08 200K tokens
Gemini 2.5 Pro Google $1.25 $5.00 $0.31 1M tokens
GPT-4o OpenAI $2.50 $10.00 $1.25 128K tokens
Claude Sonnet 4.6 Anthropic $3.00 $15.00 $0.30 200K tokens
OpenAI o4-mini OpenAI $1.10 $4.40 $0.275 200K tokens
OpenAI o3 OpenAI $10.00 $40.00 $2.50 200K tokens
Claude Opus 4.7 Anthropic $15.00 $75.00 $1.50 200K tokens

Best-Value Picks by Use Case

Cheapest capable model

Gemini 2.0 Flash — $0.10/MTok input, 1M context. Best raw tokens per dollar for non-critical workloads.

Best for chatbots with large system prompts

Claude Haiku 4.5 with caching — $0.08/MTok cached reads. Repeated context makes it the cheapest option at scale.

Best quality/cost balance

Claude Sonnet 4.6 — $3.00/MTok input. Consistently the top performer on coding and reasoning benchmarks at non-frontier pricing.

Best for long documents (1M+ context)

Gemini 2.5 Pro — $1.25/MTok input, 1M context window. 4× cheaper than Sonnet for long-document work.

Best open-weight / self-hostable

DeepSeek-V3 — $0.27/MTok API or self-host for near-zero marginal cost. Strong benchmark performance relative to price.

Best for complex reasoning / math

OpenAI o4-mini — $1.10/MTok input for reasoning tasks. Better than GPT-4o on hard math/coding at lower price.

Batch API Discounts

Anthropic and OpenAI both offer 50% batch discounts for async workloads (processing within 24h). Google Gemini does not yet have a batch discount tier as of 2026.

Model Standard Input Batch Input Batch Output
Claude Haiku 4.5$0.80/MTok$0.40/MTok$2.00/MTok
Claude Sonnet 4.6$3.00/MTok$1.50/MTok$7.50/MTok
GPT-4o-mini$0.15/MTok$0.075/MTok$0.30/MTok
GPT-4o$2.50/MTok$1.25/MTok$5.00/MTok

Get exact token counts and costs for your prompt

Paste any prompt into the calculator to see what it costs across all major LLM APIs — including cached rates and monthly projections.

Frequently Asked Questions

What is the cheapest LLM API in 2026?

Gemini 2.0 Flash at $0.10/MTok input is the cheapest capable hosted model. DeepSeek-V3 at $0.27/MTok offers stronger benchmark performance at very low cost. For instruction-following with strong quality guarantees, GPT-4o-mini ($0.15/MTok) and Claude Haiku 4.5 with caching ($0.08/MTok cached) are the best budget options.

How does Claude Sonnet compare to GPT-4o on price?

Claude Sonnet 4.6 ($3.00/MTok input, $15.00/MTok output) is slightly more expensive than GPT-4o ($2.50/MTok input, $10.00/MTok output) at standard rates. For long-context apps with caching, Sonnet cache reads ($0.30/MTok) are cheaper than GPT-4o's cached rate. Both models are competitive on quality for most tasks.

Is Gemini 2.5 Pro a good alternative to Claude Sonnet?

Yes — Gemini 2.5 Pro ($1.25/MTok input) is 58% cheaper than Claude Sonnet 4.6 ($3.00/MTok) with a 1M token context window vs Sonnet's 200K. For long-document analysis and cost-sensitive apps, Gemini 2.5 Pro is a strong alternative. Claude Sonnet still leads on instruction-following precision and agentic task quality for many teams.

Which LLM API has the best prompt caching?

Claude (Anthropic) has the most aggressive caching pricing: cache reads at 10% of the standard input rate (e.g., $0.08/MTok for Haiku). OpenAI's cached input rate is 50% of standard. Google Gemini offers context caching with an hourly storage charge model. For apps with large, fixed system prompts, Claude's caching gives the biggest savings.

Are LLM API prices still falling in 2026?

Yes — frontier model prices have dropped 10-100× over the past two years. GPT-4-level capability (as of 2023) now costs under $0.50/MTok via several providers. Competition from Google, Anthropic, DeepSeek, and Mistral continues to push prices down. Budget at today's rates but expect further decreases as scale increases.