Anthropic's Message Batches API cuts all Claude model prices in half — for async workloads like data pipelines, labeling, and evaluation runs.
Every Claude model is 50% cheaper via the Batch API. The trade-off: results within 24 hours rather than real-time.
| Model | Standard Input | Batch Input | Standard Output | Batch Output | Savings |
|---|---|---|---|---|---|
| Claude Haiku 4.5 | $0.25/MTok | $0.125/MTok | $1.25/MTok | $0.625/MTok | 50% off |
| Claude Sonnet 4.6 | $3.00/MTok | $1.50/MTok | $15.00/MTok | $7.50/MTok | 50% off |
| Claude Opus 4.7 | $15.00/MTok | $7.50/MTok | $75.00/MTok | $37.50/MTok | 50% off |
| Use Case | Volume/Month | Standard Cost | Batch Cost | Monthly Saving |
|---|---|---|---|---|
| Document classification (Haiku) | 100M tokens | $25 | $12.50 | $12.50 |
| Data labeling pipeline (Sonnet) | 1B tokens | $18,000 | $9,000 | $9,000 |
| Eval suite runs (Sonnet) | 500M tokens | $9,000 | $4,500 | $4,500 |
| Research analysis (Opus) | 100M tokens | $9,000 | $4,500 | $4,500 |
| Optimization | Discount | Best For | Latency Impact |
|---|---|---|---|
| Prompt caching | 90% on cache reads | Repeated system prompts / context | None (real-time) |
| Batch API | 50% on everything | Any async/offline workload | Up to 24 hours |
| Both combined | Up to 95% on cached input | Large-batch pipelines with shared context | Up to 24 hours |
Each batch can contain up to 100,000 individual requests. Each request supports the full model context window (up to 200K tokens). Typical batch completion time ranges from a few minutes (small batches) to a few hours (large batches close to 100K requests).
The Batch API is available in all regions where the standard Claude API is available. Batch results are stored for 29 days after completion; download results before expiry.
Yes. Use the client.messages.batches.cancel(batch_id) method to cancel an in-progress batch. Requests already processed are billed; pending requests are not.