🦞 OpenClaw

AI Model Pricing Comparison 2026: GPT vs Claude vs Gemini vs DeepSeek

Every major LLM provider, compared by cost. Input & output pricing per 1M tokens, caching discounts, batch rates, and real optimization strategies. Updated for March 2026.

Updated March 21, 2026 · ~15 min read · Prices verified from official sources
📑 Table of Contents
  1. TL;DR – Quick Comparison Table
  2. OpenAI Pricing
  3. Anthropic (Claude) Pricing
  4. Google (Gemini) Pricing
  5. DeepSeek Pricing
  6. Mistral Pricing
  7. Open Source Options
  8. Cost Optimization Tips
  9. Which Model Should You Use?
  10. Interactive Calculator

AI model pricing changes fast. New models launch monthly, prices drop, old models get deprecated. This guide cuts through the noise with verified, side-by-side pricing for every major LLM provider as of March 2026, so you can pick the right model for your budget and use case.

💡 Pricing moves fast. All prices on this page are sourced from official provider pricing pages and verified as of March 21, 2026. We update this guide regularly. Bookmark it.

TL;DR – Quick Comparison Table

All prices per 1 million tokens. Sorted by input cost, cheapest first.

Model Provider Input / 1M Output / 1M Cached Input Context
Mistral Small 3.1 Mistral $0.10 $0.30 – 128K
GPT-5.4 nano OpenAI $0.20 $1.25 $0.02 270K
Llama 4 Maverick Meta / Hosted $0.20 $0.60 varies 1M
DeepSeek V3.2 DeepSeek $0.28 $0.42 $0.028 128K
Gemini 2.5 Flash Google $0.30 $2.50 $0.03 1M
Gemini 3 Flash Google $0.50 $3.00 $0.05 1M
GPT-5.4 mini OpenAI $0.75 $4.50 $0.075 270K
Claude Haiku 4.5 Anthropic $1.00 $5.00 $0.10 200K
Gemini 2.5 Pro Google $1.25 $10.00 $0.13 1M
Gemini 3 Pro Google $2.00 $12.00 $0.20 1M
Magistral Medium Mistral $2.00 $6.00 – 40K
GPT-5.4 OpenAI $2.50 $15.00 $0.25 270K
Claude Sonnet 4.6 Anthropic $3.00 $15.00 $0.30 1M
Claude Opus 4.6 Anthropic $5.00 $25.00 $0.50 1M

๐Ÿ† Cheapest overall: DeepSeek V3.2 at $0.28 input / $0.42 output โ€” nearly 10ร— cheaper than frontier models from OpenAI and Anthropic. But cheapest โ‰  best. Read on.

Don't want to do the math?

Use our free AI Model Cost Calculator to estimate your monthly costs based on actual usage patterns.

Try the Calculator →

OpenAI Pricing Breakdown

OpenAI's current lineup centers on the GPT-5.4 family, which replaced GPT-4o and o-series models. The three tiers cover everything from lightweight classification to frontier-class reasoning.

Model Input / 1M Cached Input Output / 1M Best For
GPT-5.4 $2.50 $0.25 $15.00 Complex reasoning, professional work
GPT-5.4 mini $0.75 $0.075 $4.50 Coding, agents, sub-tasks
GPT-5.4 nano $0.20 $0.02 $1.25 High-volume, simple tasks

Key Details

โš ๏ธ Legacy note: GPT-4o, o1, o3, and o3-mini have been deprecated in favor of the unified GPT-5.4 family. If you're still on older models, migration is recommended โ€” the new models are both cheaper and more capable.

Anthropic (Claude) Pricing Breakdown

Anthropic's current generation is the Claude 4.6 series (Opus and Sonnet) plus Claude Haiku 4.5 as the fast/cheap option. Claude is known for strong coding, long-context, and instruction following.

Model Input / 1M Cached Input Output / 1M Context Max Output
Claude Opus 4.6 $5.00 $0.50 $25.00 1M 128K
Claude Sonnet 4.6 $3.00 $0.30 $15.00 1M 64K
Claude Haiku 4.5 $1.00 $0.10 $5.00 200K 64K

Legacy Models Still Available

Model Input / 1M Output / 1M Status
Claude Sonnet 4.5 $3.00 $15.00 Active (legacy)
Claude Opus 4.5 $5.00 $25.00 Active (legacy)
Claude Sonnet 4 $3.00 $15.00 Active (legacy)
Claude Opus 4 $15.00 $75.00 Active (legacy)

💡 Pro tip: Claude Opus 4 and 4.1 at $15/$75 are 3× more expensive than Opus 4.6 at $5/$25. If you're still on older Opus models, upgrading saves money and gets you a better model.

Google (Gemini) Pricing Breakdown

Google runs two current generations: the new Gemini 3 / 3.1 series (preview) and the stable Gemini 2.5 series. Pricing is through Google AI Studio (consumer) or Vertex AI (enterprise). Prices below are for the Vertex AI standard tier.

Model Input / 1M Cached Input Output / 1M Context
Gemini 3.1 Pro (preview) $2.00 $0.20 $12.00 1M+
Gemini 3 Pro (preview) $2.00 $0.20 $12.00 1M+
Gemini 3 Flash (preview) $0.50 $0.05 $3.00 1M+
Gemini 3.1 Flash-Lite (preview) $0.25 $0.03 $1.50 1M+
Gemini 2.5 Pro $1.25 $0.13 $10.00 1M
Gemini 2.5 Flash $0.30 $0.03 $2.50 1M

Key Details

๐Ÿ” Note: Gemini 2.5 Flash at $0.30/$2.50 is one of the best value-for-money models available. Huge 1M context window, strong reasoning, and output includes thinking tokens at the same rate.

DeepSeek Pricing Breakdown

DeepSeek continues to be the price disruptor of the LLM market. Their V3.2 model unifies both chat and reasoning under a single endpoint, with the cheapest per-token pricing of any frontier-class model.

Model Input / 1M Cached Input Output / 1M Context
DeepSeek V3.2 (deepseek-chat) $0.28 $0.028 $0.42 128K
DeepSeek V3.2 Reasoning (deepseek-reasoner) $0.28 $0.028 $0.42 128K

Key Details

โš ๏ธ Caveat: DeepSeek's pricing is unbeatable, but consider latency, rate limits, and reliability for production workloads. The API can be slow or unavailable during peak hours. Many teams use DeepSeek for batch/offline work and a more reliable provider for real-time.

Mistral Pricing Breakdown

Mistral has rebranded around their Magistral (reasoning), Mistral Small (general), and Devstral (code) families. They offer competitive pricing, especially at the small model tier.

Model Input / 1M Output / 1M Context Best For
Magistral Medium 1.2 $2.00 $6.00 40K Complex reasoning
Magistral Small $0.50 $1.50 40K Lightweight reasoning
Mistral Small 3.1 $0.10 $0.30 128K General purpose, agents
Devstral Small 2 $0.50 $1.50 128K Code generation, agents
Ministral 8B $0.10 $0.10 128K Edge, classification
Ministral 3B $0.04 $0.04 128K Ultra-cheap, simple tasks

Open Source Models (Llama & Others)

Open-source models don't have a single "official" price; it depends on your hosting provider. Here's what major inference providers charge for Meta's Llama 4 and other popular open models:

Model Provider Input / 1M Output / 1M
Llama 4 Maverick (400B MoE) Together $0.20 $0.60
Llama 4 Maverick (400B MoE) Fireworks $0.22 $0.88
Llama 4 Scout (109B MoE) Together $0.10 $0.30
Llama 4 Scout (109B MoE) Fireworks $0.15 $0.60
Llama 3.3 70B Together $0.10 $0.30
DeepSeek R1 (hosted) Together $0.17 $0.51
Qwen 3 235B MoE Fireworks $0.20 $0.60

Self-Hosting vs. Inference APIs
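Whether self-hosting beats an inference API comes down to utilization: a rented GPU bills by the hour whether or not it is serving tokens. A rough break-even sketch; the hourly price and throughput figures below are illustrative assumptions, not quotes from any provider:

```python
# Rough break-even: API per-token spend vs. renting a GPU node.
# Both constants are illustrative assumptions -- substitute your
# own provider's hourly rate and measured throughput.
GPU_COST_PER_HOUR = 10.0   # assumed: multi-GPU node for a large MoE model
TOKENS_PER_SECOND = 1_500  # assumed: aggregate batched throughput

def self_host_cost_per_million() -> float:
    """Effective $/1M tokens if the node runs fully saturated."""
    tokens_per_hour = TOKENS_PER_SECOND * 3600
    return GPU_COST_PER_HOUR / tokens_per_hour * 1_000_000

api_rate = 0.20  # e.g. hosted Llama 4 Maverick input rate, $/1M
print(f"self-host: ${self_host_cost_per_million():.2f} per 1M tokens")
print(f"api:       ${api_rate:.2f} per 1M tokens")
```

At these assumed numbers the API wins even at full saturation; self-hosting only pays off with cheaper hardware, higher throughput, or non-price requirements like data privacy.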

💡 Tip: Providers like Groq offer extremely fast inference (LPU hardware) with competitive pricing. If latency matters more than cost, check their current rates for Llama and Mistral models.

Cost Optimization Tips

The model you choose is only half the equation. How you use it matters just as much. These strategies can cut your LLM costs by 50–90%:

1. Prompt Caching

Every major provider now offers automatic prompt caching. If your requests share a common system prompt or context prefix, cached tokens are 90% cheaper. This is the single biggest cost lever for most applications.

Example: Chatbot with 4K system prompt

Without caching: 4,000 tokens × $3.00/1M = $0.012 per request

With caching: 4,000 tokens × $0.30/1M = $0.0012 per request

Savings: 90% on system-prompt tokens, which adds up fast at scale.
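The same arithmetic scales to any request volume. A quick sketch, assuming the Claude Sonnet 4.6 rates used in the example above ($3.00 fresh / $0.30 cached input):

```python
def prompt_cost(tokens: int, rate_per_million: float) -> float:
    """Dollar cost of a token span at a given $/1M rate."""
    return tokens * rate_per_million / 1_000_000

# 4K-token system prompt at Claude Sonnet 4.6 input rates.
uncached = prompt_cost(4_000, 3.00)  # fresh input
cached   = prompt_cost(4_000, 0.30)  # cache hit (90% off)

# What the cached prefix saves at 100K requests/day.
daily_savings = (uncached - cached) * 100_000
print(f"${daily_savings:,.0f}/day")
```

At 100K requests a day, caching just the system prompt recovers about $1,080 daily, before counting any shared few-shot examples or document context.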

2. Model Routing

Don't use a $15/1M-output model for tasks a $0.30/1M model can handle. Route each request to the cheapest model capable of the task.
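A common starting point is a static routing table keyed on task type. A minimal sketch; the task categories, model names, and fallback choice here are illustrative, not a prescription:

```python
# Map task types to the cheapest capable model tier.
# Assignments are illustrative; tune them against your own evals.
ROUTES = {
    "classification": "gemini-2.5-flash",   # $0.30 / $2.50
    "coding":         "claude-sonnet-4.6",  # $3.00 / $15.00
    "reasoning":      "claude-opus-4.6",    # $5.00 / $25.00
}

def pick_model(task_type: str) -> str:
    """Return the routed model, falling back to a cheap generalist."""
    return ROUTES.get(task_type, "gpt-5.4-nano")

print(pick_model("classification"))  # cheap tier handles it
print(pick_model("summarize"))       # unrecognized -> cheap fallback
```

Real routers often add a classifier model or confidence-based escalation, but even a hard-coded table like this captures most of the savings.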

3. Batch Processing

If you don't need real-time responses, use batch APIs. OpenAI, Anthropic, and Google all offer 50% off for async batch jobs.
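The batch math is a straight 50% haircut on real-time cost. A worked example at Claude Sonnet 4.6 rates from the tables above:

```python
BATCH_DISCOUNT = 0.50  # 50% off async batch at the major providers

def batch_cost(realtime_cost: float) -> float:
    """Cost of the same job submitted through a batch API."""
    return realtime_cost * BATCH_DISCOUNT

# Nightly job: 10M input + 2M output tokens at Sonnet 4.6 rates.
realtime = 10 * 3.00 + 2 * 15.00  # $60.00 at real-time rates
print(f"batch: ${batch_cost(realtime):.2f}")
```

The same $60 nightly job drops to $30, with the trade-off that results arrive asynchronously (typically within a provider-defined window rather than per-request).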

4. Prompt Engineering

Leaner prompts mean fewer billable tokens. Trim redundant instructions, compress few-shot examples, and set sensible max-output limits so models don't ramble at output rates.

5. Avoid Over-Thinking

Models with reasoning/thinking (Claude extended thinking, Gemini's thinking, GPT-5.4 reasoning) produce internal reasoning tokens billed at output rates. For simple tasks, disable thinking mode to avoid paying for unnecessary chain-of-thought tokens.
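Because thinking tokens bill at the output rate, even a modest chain of thought can multiply per-response cost. A quick illustration; the token counts are made up for the example:

```python
# Thinking tokens bill at the output rate, so internal reasoning
# can dominate the cost of a short answer. Counts are illustrative.
OUT_RATE = 15.00  # $/1M output, e.g. Claude Sonnet 4.6

def answer_cost(answer_tokens: int, thinking_tokens: int = 0) -> float:
    """Output-side cost of one response, including any thinking tokens."""
    return (answer_tokens + thinking_tokens) * OUT_RATE / 1_000_000

plain    = answer_cost(300)                        # thinking disabled
thinking = answer_cost(300, thinking_tokens=2_000) # thinking enabled
print(f"{thinking / plain:.1f}x cost per response")
```

A 300-token answer dragging 2,000 thinking tokens costs nearly 8× as much as the answer alone, which is why disabling thinking for simple tasks matters.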

See exactly how much you'll spend

Plug in your actual usage numbers and compare costs across all providers side by side.

Open the Cost Calculator →

Which Model Should You Use?

There's no single "best" model. It depends on what you're building, how much you're willing to spend, and what trade-offs matter. Here's a decision framework:

Use Case → Budget Pick · Balanced Pick · Premium Pick

💬 Chatbot / Customer Support → Gemini 2.5 Flash ($0.30/$2.50) · Claude Haiku 4.5 ($1.00/$5.00) · Claude Sonnet 4.6 ($3.00/$15.00)
🧑‍💻 Code Generation → DeepSeek V3.2 ($0.28/$0.42) · Claude Sonnet 4.6 ($3.00/$15.00) · Claude Opus 4.6 ($5.00/$25.00)
📊 Data Extraction / Classification → Mistral Small 3.1 ($0.10/$0.30) · GPT-5.4 nano ($0.20/$1.25) · GPT-5.4 mini ($0.75/$4.50)
🔬 Research / Complex Reasoning → DeepSeek V3.2 Reasoner ($0.28/$0.42) · Gemini 2.5 Pro ($1.25/$10.00) · Claude Opus 4.6 ($5.00/$25.00)
📄 Long Document Processing → Gemini 2.5 Flash (1M context) · Claude Sonnet 4.6 (1M context) · Gemini 3 Pro (1M+ context)
🤖 Agentic / Multi-Step → Llama 4 Scout ($0.10/$0.30) · GPT-5.4 mini ($0.75/$4.50) · GPT-5.4 ($2.50/$15.00)
📱 Edge / On-Device → Ministral 3B ($0.04/$0.04) · Mistral Small 3.1 ($0.10/$0.30) · Llama 4 Scout (self-host)
💰 Maximum Savings (Batch) → DeepSeek V3.2 ($0.28/$0.42) · Gemini 2.5 Flash Batch ($0.15/$1.25) · Claude Sonnet Batch ($1.50/$7.50)

Decision Cheat Sheet

💸 "I want the cheapest possible" → DeepSeek V3.2 or Mistral Small 3.1

⚡ "I need the fastest" → Gemini 2.5 Flash or Groq-hosted Llama

🧠 "I need the smartest" → Claude Opus 4.6 or GPT-5.4

📄 "I have huge documents" → Gemini (1M native) or Claude Sonnet/Opus (1M)

🔒 "I need data privacy" → Self-host Llama 4 or Mistral Small (open weights)

⚖️ "Best bang for buck" → Claude Sonnet 4.6 or Gemini 2.5 Pro: strong quality at mid-range prices

Interactive AI Cost Calculator

Comparing prices in a table only gets you so far. Real costs depend on your specific usage patterns: how many requests per day, average prompt size, output length, caching hit rate, and batch vs. real-time split.
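If you'd rather script it than click, the core math reduces to a few lines. A sketch of a monthly estimator covering those same inputs (parameter names are mine, not the calculator's):

```python
# Monthly cost estimate from a usage pattern. All rates in $/1M tokens;
# cache_hit is the fraction of input tokens served from the prompt cache.
def monthly_cost(req_per_day: int, in_tok: int, out_tok: int,
                 in_rate: float, out_rate: float,
                 cache_hit: float = 0.0, cache_rate: float = 0.0) -> float:
    fresh  = in_tok * (1 - cache_hit) * in_rate
    cached = in_tok * cache_hit * cache_rate
    per_req = (fresh + cached + out_tok * out_rate) / 1_000_000
    return per_req * req_per_day * 30  # 30-day month

# 5K requests/day, 3K in / 800 out, Claude Sonnet 4.6, 80% cache hits.
print(f"${monthly_cost(5_000, 3_000, 800, 3.00, 15.00, 0.8, 0.30):,.2f}")
```

Note how output tokens dominate this example even with an 80% cache-hit rate; trimming output length is often a bigger lever than switching providers.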

🦞 AI Model Cost Calculator

Enter your usage. Get instant cost estimates across every provider.

Free. No signup. No tracking.

Use the Calculator →

Methodology & Sources

All pricing data is sourced directly from official provider pricing pages and API documentation. Prices are for standard (pay-as-you-go) tier unless otherwise noted. Prices may vary by region, commitment level, or cloud marketplace.

Last verified: March 21, 2026. Prices can change without notice. Always check the official pricing page before making decisions. Open-source model hosting prices are approximate and vary by provider, plan, and GPU availability.