Compare AI Model Costs: Comprehensive LLM API Pricing Breakdown (2025)

ScriptByAI

The latest API pricing for popular AI models like GPT-4.1, o4-mini, o3, Claude 4, Gemini 2.5, DeepSeek v3/R1, and more.

The rise of powerful AI models like GPT, Gemini, Claude, Mistral, Llama, and others has opened doors for AI developers, entrepreneurs, and startups. But with so many options available, choosing the right model for your project can feel overwhelming, especially when considering cost.

This post lists the pricing of popular AI models, comparing their sizes, context windows, and costs per million tokens to help you make informed decisions about your AI projects. We take the prices from each provider's official website and update them frequently to keep them accurate.
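As a rough illustration of how per-million-token prices translate into the cost of a single request, here is a minimal Python sketch. The function name `request_cost` and the token counts are hypothetical; the prices are the gpt-4.1 figures listed in this post.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the USD cost of one request, given prices per 1M tokens."""
    input_cost = input_tokens / 1_000_000 * input_price_per_m
    output_cost = output_tokens / 1_000_000 * output_price_per_m
    return input_cost + output_cost


# Hypothetical request: 10,000 input tokens and 2,000 output tokens on gpt-4.1
# ($2.00 input / $8.00 output per 1M tokens, as listed in this post).
cost = request_cost(10_000, 2_000, 2.00, 8.00)
print(f"${cost:.4f}")
```

Output tokens are usually priced several times higher than input tokens, so the expected length of the response often matters more than the length of the prompt.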

TL;DR

Model | Context Window | Input/1M Tokens | Output/1M Tokens
GPT-4.1 | 1M | $2.00 | $8.00
o4-mini | 200K | $1.10 | $4.40
o3 | 200K | $2.00 | $8.00
Claude 4 Sonnet | 200K | $3.00 | $15.00
Gemini 2.5 Pro | 200K | $1.25 | $10.00
Gemini 2.5 Flash | 1M | $0.15 (text/image/video); $1.00 (audio) | Non-thinking: $0.60; Thinking: $3.50
Grok 4 | 256,000 | $3.00 | $15.00
Llama 4 Maverick | 10M | $0.20 | $0.60
Mistral Medium 3 | 128K | $0.40 | $2.00
DeepSeek-V3 | 64K | $0.27 | $1.10
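To compare these models for a specific workload rather than by list price alone, the sketch below ranks them by estimated monthly spend. The monthly token volumes and the helper name `monthly_cost` are hypothetical; the prices are copied from the TL;DR table. Gemini 2.5 Flash is left out because its price depends on modality and thinking mode.

```python
# USD per 1M tokens as (input, output), copied from the TL;DR table.
PRICES = {
    "GPT-4.1": (2.00, 8.00),
    "o4-mini": (1.10, 4.40),
    "o3": (2.00, 8.00),
    "Claude 4 Sonnet": (3.00, 15.00),
    "Gemini 2.5 Pro": (1.25, 10.00),
    "Grok 4": (3.00, 15.00),
    "Llama 4 Maverick": (0.20, 0.60),
    "Mistral Medium 3": (0.40, 2.00),
    "DeepSeek-V3": (0.27, 1.10),
}


def monthly_cost(prices, input_tokens, output_tokens):
    """Return the USD cost of a token volume at the given (input, output) prices."""
    input_price, output_price = prices
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 50M input tokens and 10M output tokens per month.
INPUT_TOKENS = 50_000_000
OUTPUT_TOKENS = 10_000_000

# Sort model names from cheapest to most expensive for this workload.
ranked = sorted(
    PRICES,
    key=lambda model: monthly_cost(PRICES[model], INPUT_TOKENS, OUTPUT_TOKENS),
)

for model in ranked:
    cost = monthly_cost(PRICES[model], INPUT_TOKENS, OUTPUT_TOKENS)
    print(f"{model:<18} ${cost:,.2f}")
```

The ranking can change with the input-to-output ratio, so it is worth rerunning the comparison with token counts measured from your own traffic.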

Table Of Contents

  • OpenAI GPT Models
  • OpenAI Reasoning Models
  • OpenAI Image Generation
  • Claude 4
  • Gemini
  • Gemini 2.5 Flash Native Audio
  • Gemini Embedding
  • Google Veo 2
  • DeepSeek
  • Qwen
  • Grok
  • Mistral (Premier Models)
  • Mistral (Open Models)
  • PPLX
  • Cohere

OpenAI GPT Models

Model | Context Window | Input/1M Tokens | Output/1M Tokens
gpt-4.1 | 1M | $2.00 | $8.00
gpt-4.1-mini | 1M | $0.40 | $1.60
gpt-4.1-nano | 1M | $0.10 | $0.40
gpt-4.5 | 200K | $75.00 | $150.00
gpt-4o | 128K | $2.50 | $10.00
gpt-4o-mini | 128K | $0.15 | $0.60
gpt-4o-realtime-preview (Text) | 128K | $5.00 | $20.00
gpt-4o-audio-preview (Text) | 128K | $2.50 | $10.00
gpt-4o-realtime-preview (Audio) | 128K | $100.00 | $200.00
gpt-4o-audio-preview (Audio) | 128K | $100.00 | $200.00
gpt-4o-mini-audio-preview (Text) | 128K | $0.15 | $0.60
gpt-4o-mini-audio-preview (Audio) | 128K | $10.00 | $20.00
gpt-4-turbo | 128K | $10.00 | $30.00
gpt-3.5-turbo | 16,385 | $3.00 | $6.00

OpenAI Reasoning Models

Model | Context Window | Input/1M Tokens | Output/1M Tokens
o4-mini | 200K | $1.10 | $4.40
o4-mini-deep-research | 200K | $2.00 | $8.00
o3-pro | 200K | $20.00 | $80.00
o3 | 200K | $2.00 | $8.00
o3-deep-research | 200K | $10.00 | $40.00
o3-mini | 200K | $1.10 | $4.40
o1-pro | 200K | $150.00 | $600.00
o1 | 200K | $15.00 | $60.00
o1-mini | 128K | $1.10 | $4.40

OpenAI Image Generation

Model | 1024×1024 | 1024×1536 | 1536×1024
GPT Image 1 (low quality) | $0.011 | $0.016 | $0.016
GPT Image 1 (medium quality) | $0.042 | $0.063 | $0.063
GPT Image 1 (high quality) | $0.167 | $0.25 | $0.25
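Image generation is priced per image rather than per token, so estimating a batch is a lookup and a multiplication. The sketch below uses the GPT Image 1 prices from the table above; the dictionary layout, the function name `image_batch_cost`, and the batch size are hypothetical.

```python
# USD per generated image, copied from the GPT Image 1 table.
IMAGE_PRICES = {
    "low": {"1024x1024": 0.011, "1024x1536": 0.016, "1536x1024": 0.016},
    "medium": {"1024x1024": 0.042, "1024x1536": 0.063, "1536x1024": 0.063},
    "high": {"1024x1024": 0.167, "1024x1536": 0.25, "1536x1024": 0.25},
}


def image_batch_cost(quality: str, size: str, count: int) -> float:
    """Return the USD cost of generating `count` images at one quality and size."""
    return IMAGE_PRICES[quality][size] * count


# Hypothetical batch: 500 medium-quality square images.
total = image_batch_cost("medium", "1024x1024", 500)
print(f"${total:.2f}")
```

An unknown quality or size raises a `KeyError`, which keeps a typo from silently producing a zero-cost estimate.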

Claude 4

Model | Context Window | Input/1M Tokens | Output/1M Tokens
Claude 4 Opus | 200K | $15.00 | $75.00
Claude 4 Sonnet | 200K | $3.00 | $15.00
Claude 3.5 Haiku | 200K | $0.80 | $4.00

Gemini

Model | Context Window | Input/1M Tokens | Output/1M Tokens
Gemini 2.5 Pro | >200K | $2.50 | $15.00
Gemini 2.5 Pro | 200K | $1.25 | $10.00
Gemini 2.5 Flash | 1M | $0.30 (text/image/video); $1.00 (audio) | $2.50
Gemini 2.5 Flash-Lite | 1M | $0.10 (text/image/video); $0.50 (audio) | $0.40
Gemini 2.0 Flash | 1M | $0.10; $0.70 (audio) | $0.40
Gemini 2.0 Flash-Lite | 1M | $0.075 | $0.30
Gemini 1.5 Pro | >128K | $2.50 | $10.00
Gemini 1.5 Pro | 128K | $1.25 | $5.00
Gemini 1.5 Flash | >128K | $0.15 | $0.60
Gemini 1.5 Flash | 128K | $0.075 | $0.30
Gemini 1.5 Flash-8B | >128K | $0.075 | $0.30
Gemini 1.5 Flash-8B | 128K | $0.0375 | $0.15
Gemini 1.0 Pro | 32K | $0.50 | $1.50
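Several Gemini models have two price rows, one for prompts up to a length threshold and one for longer prompts. The sketch below shows one way to model this for Gemini 2.5 Pro. It assumes that the higher rates apply to the whole request once the prompt exceeds the threshold, and that 200K means 200,000 tokens; both are assumptions to check against Google's pricing page. The function name is hypothetical.

```python
# Assumed threshold: "200K" in the table is taken to mean 200,000 prompt tokens.
LONG_PROMPT_THRESHOLD = 200_000


def gemini_25_pro_cost(prompt_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one Gemini 2.5 Pro request with tiered pricing."""
    if prompt_tokens > LONG_PROMPT_THRESHOLD:
        # Rates from the ">200K" row (assumed to cover the whole request).
        input_price, output_price = 2.50, 15.00
    else:
        # Rates from the "200K" row.
        input_price, output_price = 1.25, 10.00
    return (prompt_tokens * input_price + output_tokens * output_price) / 1_000_000


short_prompt_cost = gemini_25_pro_cost(100_000, 5_000)
long_prompt_cost = gemini_25_pro_cost(300_000, 5_000)
print(f"100K-token prompt: ${short_prompt_cost:.3f}")
print(f"300K-token prompt: ${long_prompt_cost:.3f}")
```

Under these assumptions, tripling the prompt from 100K to 300K tokens raises the request cost by more than a factor of three, because the longer prompt crosses into the higher price tier.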

Gemini 2.5 Flash Native Audio

Model | Free Tier | Input/1M Tokens | Output/1M Tokens
Gemini 2.5 Flash Native Audio | Not available | $0.50 (text); $3.00 (audio / video) | $2.00 (text); $12.00 (audio)

Google Imagen 4 & 3

Model | Paid Tier, per Image in USD
Imagen 4 Standard | $0.04
Imagen 4 Ultra | $0.06
Imagen 3 | $0.03

Gemini Embedding

Model | Paid Tier, per 1M Tokens in USD
gemini-embedding-001 | $0.15

Google Veo 2

Model | Paid Tier, per Second in USD
Veo 2 | $0.35

DeepSeek

Model | Context Window | Input/1M Tokens | Output/1M Tokens
DeepSeek-V3 | 64K | $0.07 (cached); $0.27 (uncached); $0.035 (discount + cached); $0.135 (discount) | $1.10; $0.55 (discount)
DeepSeek-R1 | 64K | $0.14 (cached); $0.55 (uncached); $0.035 (discount + cached); $0.135 (discount) | $2.19; $0.55 (discount)
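Because DeepSeek bills cached input tokens at a lower rate, the effective input price depends on how much of each prompt is served from cache. The sketch below blends the two rates for DeepSeek-V3 at standard (non-discount) prices. It assumes the $0.07 figure applies to cached input tokens and the $0.27 figure to uncached ones, in line with the TL;DR table listing $0.27 as the DeepSeek-V3 input price. The function name and the cache hit ratio are hypothetical.

```python
# Standard DeepSeek-V3 prices in USD per 1M tokens, taken from the table above.
CACHED_INPUT_PRICE = 0.07    # assumed: input tokens served from cache
UNCACHED_INPUT_PRICE = 0.27  # assumed: input tokens not in cache
OUTPUT_PRICE = 1.10


def deepseek_v3_cost(input_tokens: int, output_tokens: int,
                     cache_hit_ratio: float) -> float:
    """Estimate USD cost when a fraction of input tokens is served from cache."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    cached_tokens = input_tokens * cache_hit_ratio
    uncached_tokens = input_tokens - cached_tokens
    total = (cached_tokens * CACHED_INPUT_PRICE
             + uncached_tokens * UNCACHED_INPUT_PRICE
             + output_tokens * OUTPUT_PRICE)
    return total / 1_000_000


# Hypothetical workload: 1M input tokens, 200K output tokens, half of input cached.
half_cached = deepseek_v3_cost(1_000_000, 200_000, cache_hit_ratio=0.5)
no_cache = deepseek_v3_cost(1_000_000, 200_000, cache_hit_ratio=0.0)
print(f"50% cache hits: ${half_cached:.2f}")
print(f"No cache hits:  ${no_cache:.2f}")
```

The same function could be reused for the discount rates by swapping in the corresponding prices from the table.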

Qwen

Model | Context Window | Input/1M Tokens | Output/1M Tokens
Qwen-Max | 32,768 | $1.60 | $6.40
Qwen-Plus | 131,072 | $0.40 | $0.12
Qwen-Turbo | 1,008,192 | $0.05 | $0.20

Grok

Model | Context Window | Input/1M Tokens | Output/1M Tokens
Grok 4 | 256,000 | $3.00 | $15.00
Grok 3 | 131,072 | $3.00 | $15.00
Grok 3 Fast | 131,072 | $5.00 | $25.00
Grok 3 Mini | 131,072 | $0.30 | $0.50
Grok 3 Mini Fast | 131,072 | $0.60 | $4.00
Grok 2 Vision | 32,768 | $2.00 | $10.00
Grok 2 Image | 131,072 | $0.07/image | $0.07/image

Mistral (Premier Models)

Model | Context Window | Input/1M Tokens | Output/1M Tokens
Mistral Large | 128K | $2.00 | $6.00
Pixtral Large | 128K | $2.00 | $6.00
Mistral Saba | 128K | $0.20 | $0.60
Mistral Medium 3 | 128K | $0.40 | $2.00
Magistral Medium | 128K | $2.00 | $5.00
Devstral Medium | 128K | $0.40 | $2.00
Codestral | 32K | $0.20 | $0.60
Document AI & OCR | - | OCR: $1/1000 pages; Annotations: $3/1000 pages | -
Voxtral Mini Transcribe | - | Audio input: $0.002/min | -
Mistral Embed | 32K | $0.10 | -
Mistral Moderation 24.11 | 32K | $0.10 | -
Ministral 8B 24.10 | 128K | $0.10 | $0.10
Ministral 3B 24.10 | 128K | $0.04 | $0.04

Mistral (Open Models)

Model | Context Window | Input/1M Tokens | Output/1M Tokens
Pixtral 12B | 128K | $0.15 | $0.15
Mistral Nemo | 128K | $0.15 | $0.15
Mistral Small 3.1 | 128K | $0.10 | $0.30
Magistral Small | 128K | $0.50 | $1.50
Devstral Small | 128K | $0.10 | $0.30
Voxtral Mini | - | $0.001 (audio); $0.04 (text) | $0.04
Voxtral Small | - | $0.001 (audio); $0.04 (text) | $0.04
Mistral 7B | 32K | $0.25 | $0.25
Mixtral 8x7B | 32K | $0.70 | $0.70
Mixtral 8x22B | 64K | $2.00 | $6.00

Llama 4 & 3

Model | Context Window | Input/1M Tokens | Output/1M Tokens
Llama 4 Scout | 10M | $0.11 | $0.34
Llama 4 Maverick | 10M | $0.20 | $0.60
Llama 3.3 70B Versatile | 128K | $0.59 | $0.79
Llama 3.3 70B SpecDec | 8192 | $0.59 | $0.99
Llama 3.3 70b Instruct | 128K | $0.23 | $0.40
Llama 3.3 70b Instruct-Turbo | 128K | $0.13 | $0.40
Llama 3.2 90b Vision-Instruct | 128K | $0.35 | $0.40
Llama 3.2 11b Vision-Instruct | 128K | $0.055 | $0.055
Llama 3.1 405B | 128K | $1.79 | $1.79
Llama 3.1 70B | 128K | $0.35 | $0.40
Llama 3.1 8B | 128K | $0.09 | $0.09

PPLX

Model | Context Window | Input/1M Tokens | Output/1M Tokens
pplx-70b-online | 4K | $1.00 | $1.00
pplx-7b-online | 4K | $0.20 | $0.20

Cohere

Model | Context Window | Input/1M Tokens | Output/1M Tokens
Command A | 256K | $2.50 | $10.00
Command R+ | 128K | $2.50 | $10.00
Command R | 128K IN/4K OUT | $0.15 | $0.60
Command R7B | 128K | $0.0375 | $0.15

See Also:

  • What Is The Max Token Limit In OpenAI ChatGPT
  • What Are The Rate Limits For OpenAI API?
  • Compare AI Costs: Free LLM API Price Calculator

Changelog:

07/15/2025

  • Added Voxtral model family

07/14/2025

  • Added Gemini Embedding

07/11/2025

  • Added Grok 4

06/26/2025

  • Added o3-deep-research and o4-mini-deep-research

06/25/2025

  • Added Imagen 4 Ultra and Imagen 4 Standard

06/20/2025

  • Added Gemini 2.5 Flash-Lite

06/10/2025

  • Added o3-pro
  • Added Mistral Magistral models.

06/10/2025

  • OpenAI dropped the price of o3 by 80%

05/22/2025

  • Updated for Claude 4

05/21/2025

  • Updated Mistral models

05/21/2025

  • Added Gemini 2.5 Flash Native Audio

05/08/2025

  • Added Mistral Medium 3

04/23/2025

  • Added OpenAI Image Generation

04/18/2025

  • Added Gemini 2.5 Flash
  • Updated Cohere models

04/16/2025

  • Added o4-mini

04/14/2025

  • Added gpt-4.1 family

04/11/2025

  • Updated DeepSeek

04/11/2025

  • Added Google Imagen 3 and Veo 2.

04/10/2025

  • Added Qwen models
  • Added Grok 3

04/05/2025

  • Added Gemini 2.5 Pro

03/19/2025

  • Added o1-pro

02/28/2025

  • Added GPT-4.5

02/24/2025

  • Added Claude 3.7

02/21/2025

  • Added Grok

02/06/2025

  • Added Gemini 2.0 Flash and Gemini 2.0 Flash-Lite

02/02/2025

  • Added o3-mini

01/31/2025

  • Added DeepSeek v3 and DeepSeek R1.

12/18/2024

  • o1 in the API comes with support for function calling, developer messages, Structured Outputs, and vision capabilities.

12/07/2024

  • Added Llama 3.3

11/05/2024

  • Added Claude Haiku 3.5

11/03/2024

  • Added Gemini 1.5 Flash-8B

10/04/2024

  • Added Llama 3.2
  • Added gpt-4o-realtime-preview

09/25/2024

  • Updated Google Gemini

09/13/2024

  • Added OpenAI’s latest model: o1.

08/07/2024

  • Added gpt-4o-2024-08-06, the latest gpt-4o snapshot that supports Structured Outputs

07/25/2024

  • Updated Mistral Large 2

07/24/2024

  • Added Llama 3.1 405B

07/20/2024

  • Added GPT-4o-mini
  • Updated prices

07/13/2024

  • Updated
  • Added Cohere’s Command API

Article information

Author: Merrill Bechtelar CPA
