🤖

AI Model Comparison Table

Compare proprietary and open-weight AI models in one up-to-date table covering API pricing, context, best-fit use cases, and official docs.

Loading interactive tool...

Share this tool

How to Use the AI Model Comparison Table

1
Enter your values
Open the AI Model Comparison Table and fill in the required input fields with your numbers or selections.
2
Review the calculation
The tool automatically computes the result as you type. Double-check your inputs to ensure accuracy.
3
Interpret your results
Review the calculated output along with any breakdowns, charts, or explanations provided to understand what the numbers mean for your situation.

Frequently Asked Questions

What is the AI Model Comparison Table?

The AI Model Comparison Table is a free browser-based tool on ConvertCrunch. Compare proprietary and open-weight AI models in one up-to-date table covering API pricing, context, best-fit use cases, and official docs. It requires no downloads, installations, or account signups. Simply enter your values and get instant, accurate results. The tool is designed for quick calculations and informed decision-making, whether you are a professional, student, or everyday user looking for reliable answers.

How do I use the AI Model Comparison Table?

Using the AI Model Comparison Table is straightforward. Enter your values into the input fields provided on the page. Results update instantly as you type or adjust parameters, so there is no need to press a calculate button. You can modify any input to explore different scenarios and compare outcomes. All processing happens in your browser, so your data stays private and is never sent to a server. Most users get the answers they need in under a minute.

Is the AI Model Comparison Table free to use?

Yes, the AI Model Comparison Table is completely free to use with no limitations. There is no account signup, no email required, and no premium tier. The tool runs entirely in your browser using JavaScript, which means it works offline after the page loads and your data never leaves your device. You can use it as many times as you need for personal, educational, or professional purposes.

How accurate is the AI Model Comparison Table?

The AI Model Comparison Table uses current publicly available pricing data from major AI providers and industry benchmarks. AI pricing changes frequently as providers update their models and pricing tiers, so costs shown reflect rates at the time of calculation. Actual costs may vary based on usage volume, negotiated enterprise agreements, and provider-specific discounts. Check provider pricing pages for the most up-to-date rates.

Explore Related Resources

Go deeper with workflow guides, side-by-side comparisons, and reusable embeds connected to this tool.

Guides

Embed This Tool

Add this calculator to your website with a simple iframe.

Continue Through Topic Hubs

This tool is part of larger workflows. Open a hub to continue with the next relevant tools.

Business and AI Economics Hub

Cost, ROI, and automation decision workflows for AI adoption and growth planning.

ai economybusiness calculatorspricing calculators

Recommended Next Steps

Continue your workflow with these tools from the same playbook.

🤖

AI vs Human Cost Calculator

Compare the cost of completing a task with AI tools versus hiring a human. Includes API costs, time savings, and ROI.

🤖

AI Automation Savings Calculator

Estimate time and cost savings from automating recurring business tasks with AI and workflow tools.

📈

Customer Lifetime Value Calculator

Estimate customer lifetime value using monthly revenue, gross margin, and churn. Includes LTV:CAC ratio.

Related Tools

🤖

AI Agent Cost Estimator

Estimate monthly AI agent cost from model usage, tools, orchestration, and support overhead.

🤖

AI Automation Savings Calculator

Estimate time and cost savings from automating recurring business tasks with AI and workflow tools.

🤖

AI Business Case Generator

Generate a structured AI implementation business case with ROI, payback, risks, and timeline.

🤖

AI Carbon Footprint Calculator

Estimate annual carbon footprint of model usage based on query volume and usage patterns.

🤖

AI Chatbot ROI Calculator

Calculate expected ROI from deploying an AI chatbot for support, lead capture, and customer engagement.

🤖

AI Compliance Cost Calculator

Estimate AI compliance and governance costs by use case risk level and organizational scope.

🤖

AI Content Cost Calculator

Calculate the cost of producing content with AI versus hiring writers. Compare per-article costs and quality trade-offs.

🤖

AI Job Impact Analyzer

Analyze how AI will affect your job across 27 roles and 15 industries. See task-level automation risk, AI tools to learn, emerging roles, and a personalized action plan.

Next Step

Continue with AI vs Human Cost Calculator

Open →

AI Model Comparison

27 models from 8 providers — 16 proprietary, 11 open weight. Prices are API costs per 1M tokens from official sources. Last verified: 2026-06-20.

Lowest API Cost

Mistral Mistral Small 4

$0.40

1M input + 1M output tokens

Highest API Cost

OpenAI GPT-5.5 Pro

$210.00

1M input + 1M output tokens

Open-Weight Options

11 listed models

Self-hosting costs vary; hosted API prices appear where official

Provider	Model	Type	Best For	Input	Output	Total	Context	Provider Notes	Links
Mistral	Mistral Small 4	Open Weight	Low-cost open-weight routing, extraction, and multimodal tasks	$0.10	$0.30	$0.40 per 2M tokens	Not listed	Mistral's Apache 2.0 open model for lightweight multimodal and multilingual agentic workloads. Listed at $0.10 input / $0.30 output per 1M tokens on Mistral's pricing page.	Docs Pricing
DeepSeek	DeepSeek V4 Flash	Open Weight	Ultra-low-cost production traffic and large-context experimentation	$0.14	$0.28	$0.42 per 2M tokens	1M	DeepSeek's cost-focused V4 model with thinking and non-thinking modes plus MIT-licensed weights. Hosted API cache-miss input price shown. Cache-hit input is $0.0028 per 1M tokens.	Docs Pricing
DeepSeek	DeepSeek V4 Pro	Open Weight	Very low-cost open-weight reasoning and long-context API workloads	$0.43	$0.87	$1.30 per 2M tokens	1M	DeepSeek's higher-concurrency V4 model with thinking mode, JSON output, tool calls, and MIT-licensed weights. Hosted API cache-miss input price shown. Cache-hit input is $0.003625 per 1M tokens.	Docs Pricing
Google	Gemini 3.1 Flash-Lite	Proprietary	Budget Gemini routing, translation, and lightweight extraction	$0.25	$1.50	$1.75 per 2M tokens	Gemini 3 family	Google's most cost-efficient Gemini 3 model for high-volume agentic tasks, translation, and simple data processing. Standard paid-tier text/image/video rate. Audio input is $0.50 per 1M tokens.	Docs Pricing
Mistral	Mistral Large 3	Open Weight	Open-weight production use with low hosted API cost	$0.50	$1.50	$2.00 per 2M tokens	Not listed	Mistral describes Large 3 as an open-weight, general-purpose, flagship multimodal and multilingual model. Hosted API rate for mistral-large-latest; self-hosting costs depend on infrastructure.	Docs Pricing
xAI	Grok Build 0.1	Proprietary	Software-building workflows and lower-cost xAI usage	$1.00	$2.00	$3.00 per 2M tokens	256K	xAI's build-focused model trained specifically for agentic coding workflows. Includes $0.20 cached input pricing.	Docs Pricing
Google	Gemini 3 Flash Preview	Proprietary	Cost-efficient Gemini 3 apps that need speed and multimodal support	$0.50	$3.00	$3.50 per 2M tokens	Gemini 3 family	Google's speed-focused Gemini 3 preview model with text, image, video, and audio input pricing. Standard paid-tier rate for text/image/video input. Audio input is $1 per 1M tokens.	Docs Pricing
xAI	Grok 4.3	Proprietary	Low-cost frontier-style reasoning with a large context window	$1.25	$2.50	$3.75 per 2M tokens	1M	xAI's current general model with agentic tool calling, minimal hallucinations, and configurable reasoning. Includes $0.20 cached input pricing. Same listed rate as Grok 4.20 reasoning and non-reasoning variants.	Docs Pricing
OpenAI	GPT-5.4 Mini	Proprietary	High-volume OpenAI workloads that still need strong general quality	$0.75	$4.50	$5.25 per 2M tokens	400K	OpenAI's strongest mini model for lower-latency, lower-cost workloads. Standard API rate. Batch and Flex reduce this to $0.375 input / $2.25 output per 1M tokens.	Docs Pricing
Anthropic	Claude Haiku 4.5	Proprietary	Fast, high-volume Claude tasks, routing, chat, and extraction	$1.00	$5.00	$6.00 per 2M tokens	200K	Anthropic lists Haiku 4.5 as the fastest Claude model with near-frontier intelligence. Lowest current Claude text model rate in the latest comparison table.	Docs Pricing
Mistral	Magistral Medium	Proprietary	Reasoning-heavy Mistral workloads with lower output cost than many frontier models	$2.00	$5.00	$7.00 per 2M tokens	Not listed	Mistral's premier thinking model for domain-specific, transparent, and multilingual reasoning. Listed as magistral-medium-latest on Mistral's pricing page.	Docs Pricing
Mistral	Mistral Medium 3.5	Open Weight	European open-weight option for reasoning, coding, and agent workflows	$1.50	$7.50	$9.00 per 2M tokens	Not listed	Mistral's current open model for state-of-the-art performance, enterprise deployment, reasoning, coding, and agents. Listed at $1.50 input / $7.50 output per 1M tokens on Mistral's pricing page.	Docs Pricing
Google	Gemini 3.5 Flash	Proprietary	Fast multimodal apps, search-grounded work, and high-throughput reasoning	$1.50	$9.00	$10.50 per 2M tokens	Gemini 3 family	Google's most intelligent speed-focused stable Gemini 3 model with frontier intelligence and grounding. Standard paid-tier rate. Priority is $2.70 input / $16.20 output per 1M tokens.	Docs Pricing
Google	Gemini 2.5 Pro	Proprietary	Stable long-context Gemini workloads and complex reasoning	$1.25	$10.00	$11.25 per 2M tokens	1M	Google still lists Gemini 2.5 Pro as an advanced model for complex tasks, deep reasoning, and coding. Rate for prompts <= 200K tokens. Prompts > 200K are $2.50 input / $15 output per 1M tokens.	Docs Pricing
Google	Gemini 3.1 Pro Preview	Proprietary	Long-context multimodal reasoning and Google ecosystem integrations	$2.00	$12.00	$14.00 per 2M tokens	1M	Google's Gemini 3.1 Pro preview for multimodal understanding, agentic capability, and coding. Standard rate for prompts <= 200K tokens. Prompts > 200K are $4 input / $18 output per 1M tokens.	Docs Pricing
OpenAI	GPT-5.4	Proprietary	Lower-cost OpenAI frontier workloads and production agents	$2.50	$15.00	$17.50 per 2M tokens	1M	OpenAI's more affordable frontier model for coding and professional work. Standard short-context rate. Long-context rate is $5 input / $22.50 output per 1M tokens.	Docs Pricing
Anthropic	Claude Sonnet 4.6	Proprietary	Production agents, coding assistants, analysis, and quality writing	$3.00	$15.00	$18.00 per 2M tokens	1M	Anthropic describes Sonnet 4.6 as the best combination of speed and intelligence. Current Claude mid-tier flagship rate. Supports extended thinking and priority tier.	Docs Pricing
Anthropic	Claude Opus 4.8	Proprietary	Premium coding, deep reasoning, and enterprise knowledge work	$5.00	$25.00	$30.00 per 2M tokens	1M	Anthropic's most capable Opus-tier model for complex reasoning and agentic coding. Model table lists $5 input / $25 output per million tokens. Microsoft Foundry context may be 200K.	Docs Pricing
OpenAI	GPT-5.5	Proprietary	Frontier OpenAI reasoning, agent workflows, and high-quality generation	$5.00	$30.00	$35.00 per 2M tokens	1M	OpenAI's flagship model for complex reasoning, coding, and professional work. Standard short-context rate. Long-context rate is $10 input / $45 output per 1M tokens.	Docs Pricing
Anthropic	Claude Fable 5	Proprietary	Highest-end Anthropic reasoning, coding, and long-running agent tasks	$10.00	$50.00	$60.00 per 2M tokens	1M	Anthropic's most capable widely released Claude model for demanding reasoning and long-horizon agentic work. Generally available on the Claude API as of June 9, 2026. Mythos 5 has the same listed rate but limited availability.	Docs Pricing
OpenAI	GPT-5.5 Pro	Proprietary	Most demanding OpenAI tasks where quality matters more than cost	$30.00	$180.00	$210.00 per 2M tokens	Short / long	Highest-priced OpenAI Pro-tier frontier model on the official API pricing page. Standard short-context rate. Long-context rate is $60 input / $270 output per 1M tokens.	Docs Pricing
OpenAI	gpt-oss-120b	Open Weight	Customizable open-weight reasoning on H100-class infrastructure	Self-host	Self-host	Self-host	131K	OpenAI's most powerful open-weight reasoning model; 117B parameters with 5.1B active parameters and Apache 2.0 licensing. Self-hosted or third-party-hosted cost depends on infrastructure. Official OpenAI pricing page does not list a token API rate for this model.	Docs
OpenAI	gpt-oss-20b	Open Weight	Local and lower-infrastructure open-weight reasoning experiments	Self-host	Self-host	Self-host	131K	Smaller OpenAI open-weight reasoning model released with the gpt-oss family under Apache 2.0 licensing. Self-hosted or third-party-hosted cost depends on infrastructure. Official OpenAI pricing page does not list a token API rate for this model.	Docs
Google	Gemma 4 31B	Open Weight	Open-weight Google model for local or private-cloud reasoning and coding	Self-host	Self-host	Self-host	256K	Google DeepMind's 31B dense Gemma 4 open-weight model for coding, reasoning, and multimodal understanding. Self-hosted or Vertex/third-party deployment cost depends on infrastructure; Gemma 4 is distributed as open weights.	Docs
Meta	Llama 4 Scout	Open Weight	Very long-context open-weight multimodal workloads	Self-host	Self-host	Self-host	10M	Meta's 17B-active, 109B-total open-weight multimodal MoE model with a very long advertised context window. Self-hosted or partner-hosted cost depends on infrastructure. Distributed under the Llama 4 Community License.	Docs
Meta	Llama 4 Maverick	Open Weight	High-quality open-weight multimodal chat, reasoning, and coding	Self-host	Self-host	Self-host	1M	Meta's 17B-active, 400B-total open-weight multimodal MoE model focused on strong quality-to-cost performance. Self-hosted or partner-hosted cost depends on infrastructure. Distributed under the Llama 4 Community License.	Docs
Qwen	Qwen3-235B-A22B-Instruct-2507	Open Weight	Open-weight multilingual instruction following, coding, and tool-use workloads	Self-host	Self-host	Self-host	256K	Updated Qwen3 235B MoE instruct model with 22B active parameters, Apache 2.0 licensing, and improved long-context understanding. Self-hosted or third-party-hosted cost depends on infrastructure.	Docs

Provider

Model

Type

Best For

Input

Output

Total

Context

Provider Notes

Links

Mistral

Mistral Small 4

Open Weight

Low-cost open-weight routing, extraction, and multimodal tasks

$0.10

$0.30

$0.40

per 2M tokens

Not listed

Mistral's Apache 2.0 open model for lightweight multimodal and multilingual agentic workloads.

Listed at $0.10 input / $0.30 output per 1M tokens on Mistral's pricing page.

Docs Pricing

DeepSeek

DeepSeek V4 Flash

Open Weight

Ultra-low-cost production traffic and large-context experimentation

$0.14

$0.28

$0.42

per 2M tokens

DeepSeek's cost-focused V4 model with thinking and non-thinking modes plus MIT-licensed weights.

Hosted API cache-miss input price shown. Cache-hit input is $0.0028 per 1M tokens.

Docs Pricing

DeepSeek

DeepSeek V4 Pro

Open Weight

Very low-cost open-weight reasoning and long-context API workloads

$0.43

$0.87

$1.30

per 2M tokens

DeepSeek's higher-concurrency V4 model with thinking mode, JSON output, tool calls, and MIT-licensed weights.

Hosted API cache-miss input price shown. Cache-hit input is $0.003625 per 1M tokens.

Docs Pricing

Google

Gemini 3.1 Flash-Lite

Proprietary

Budget Gemini routing, translation, and lightweight extraction

$0.25

$1.50

$1.75

per 2M tokens

Gemini 3 family

Google's most cost-efficient Gemini 3 model for high-volume agentic tasks, translation, and simple data processing.

Standard paid-tier text/image/video rate. Audio input is $0.50 per 1M tokens.

Docs Pricing

Mistral

Mistral Large 3

Open Weight

Open-weight production use with low hosted API cost

$0.50

$1.50

$2.00

per 2M tokens

Not listed

Mistral describes Large 3 as an open-weight, general-purpose, flagship multimodal and multilingual model.

Hosted API rate for mistral-large-latest; self-hosting costs depend on infrastructure.

Docs Pricing

xAI

Grok Build 0.1

Proprietary

Software-building workflows and lower-cost xAI usage

$1.00

$2.00

$3.00

per 2M tokens

256K

xAI's build-focused model trained specifically for agentic coding workflows.

Includes $0.20 cached input pricing.

Docs Pricing

Google

Gemini 3 Flash Preview

Proprietary

Cost-efficient Gemini 3 apps that need speed and multimodal support

$0.50

$3.00

$3.50

per 2M tokens

Gemini 3 family

Google's speed-focused Gemini 3 preview model with text, image, video, and audio input pricing.

Standard paid-tier rate for text/image/video input. Audio input is $1 per 1M tokens.

Docs Pricing

xAI

Grok 4.3

Proprietary

Low-cost frontier-style reasoning with a large context window

$1.25

$2.50

$3.75

per 2M tokens

xAI's current general model with agentic tool calling, minimal hallucinations, and configurable reasoning.

Includes $0.20 cached input pricing. Same listed rate as Grok 4.20 reasoning and non-reasoning variants.

Docs Pricing

OpenAI

GPT-5.4 Mini

Proprietary

High-volume OpenAI workloads that still need strong general quality

$0.75

$4.50

$5.25

per 2M tokens

400K

OpenAI's strongest mini model for lower-latency, lower-cost workloads.

Standard API rate. Batch and Flex reduce this to $0.375 input / $2.25 output per 1M tokens.

Docs Pricing

Anthropic

Claude Haiku 4.5

Proprietary

Fast, high-volume Claude tasks, routing, chat, and extraction

$1.00

$5.00

$6.00

per 2M tokens

200K

Anthropic lists Haiku 4.5 as the fastest Claude model with near-frontier intelligence.

Lowest current Claude text model rate in the latest comparison table.

Docs Pricing

Mistral

Magistral Medium

Proprietary

Reasoning-heavy Mistral workloads with lower output cost than many frontier models

$2.00

$5.00

$7.00

per 2M tokens

Not listed

Mistral's premier thinking model for domain-specific, transparent, and multilingual reasoning.

Listed as magistral-medium-latest on Mistral's pricing page.

Docs Pricing

Mistral

Mistral Medium 3.5

Open Weight

European open-weight option for reasoning, coding, and agent workflows

$1.50

$7.50

$9.00

per 2M tokens

Not listed

Mistral's current open model for state-of-the-art performance, enterprise deployment, reasoning, coding, and agents.

Listed at $1.50 input / $7.50 output per 1M tokens on Mistral's pricing page.

Docs Pricing

Google

Gemini 3.5 Flash

Proprietary

Fast multimodal apps, search-grounded work, and high-throughput reasoning

$1.50

$9.00

$10.50

per 2M tokens

Gemini 3 family

Google's most intelligent speed-focused stable Gemini 3 model with frontier intelligence and grounding.

Standard paid-tier rate. Priority is $2.70 input / $16.20 output per 1M tokens.

Docs Pricing

Google

Gemini 2.5 Pro

Proprietary

Stable long-context Gemini workloads and complex reasoning

$1.25

$10.00

$11.25

per 2M tokens

Google still lists Gemini 2.5 Pro as an advanced model for complex tasks, deep reasoning, and coding.

Rate for prompts <= 200K tokens. Prompts > 200K are $2.50 input / $15 output per 1M tokens.

Docs Pricing

Google

Gemini 3.1 Pro Preview

Proprietary

Long-context multimodal reasoning and Google ecosystem integrations

$2.00

$12.00

$14.00

per 2M tokens

Google's Gemini 3.1 Pro preview for multimodal understanding, agentic capability, and coding.

Standard rate for prompts <= 200K tokens. Prompts > 200K are $4 input / $18 output per 1M tokens.

Docs Pricing

OpenAI

GPT-5.4

Proprietary

Lower-cost OpenAI frontier workloads and production agents

$2.50

$15.00

$17.50

per 2M tokens

OpenAI's more affordable frontier model for coding and professional work.

Standard short-context rate. Long-context rate is $5 input / $22.50 output per 1M tokens.

Docs Pricing

Anthropic

Claude Sonnet 4.6

Proprietary

Production agents, coding assistants, analysis, and quality writing

$3.00

$15.00

$18.00

per 2M tokens

Anthropic describes Sonnet 4.6 as the best combination of speed and intelligence.

Current Claude mid-tier flagship rate. Supports extended thinking and priority tier.

Docs Pricing

Anthropic

Claude Opus 4.8

Proprietary

Premium coding, deep reasoning, and enterprise knowledge work

$5.00

$25.00

$30.00

per 2M tokens

Anthropic's most capable Opus-tier model for complex reasoning and agentic coding.

Model table lists $5 input / $25 output per million tokens. Microsoft Foundry context may be 200K.

Docs Pricing

OpenAI

GPT-5.5

Proprietary

Frontier OpenAI reasoning, agent workflows, and high-quality generation

$5.00

$30.00

$35.00

per 2M tokens

OpenAI's flagship model for complex reasoning, coding, and professional work.

Standard short-context rate. Long-context rate is $10 input / $45 output per 1M tokens.

Docs Pricing

Anthropic

Claude Fable 5

Proprietary

Highest-end Anthropic reasoning, coding, and long-running agent tasks

$10.00

$50.00

$60.00

per 2M tokens

Anthropic's most capable widely released Claude model for demanding reasoning and long-horizon agentic work.

Generally available on the Claude API as of June 9, 2026. Mythos 5 has the same listed rate but limited availability.

Docs Pricing

OpenAI

GPT-5.5 Pro

Proprietary

Most demanding OpenAI tasks where quality matters more than cost

$30.00

$180.00

$210.00

per 2M tokens

Short / long

Highest-priced OpenAI Pro-tier frontier model on the official API pricing page.

Standard short-context rate. Long-context rate is $60 input / $270 output per 1M tokens.

Docs Pricing

OpenAI

gpt-oss-120b

Open Weight

Customizable open-weight reasoning on H100-class infrastructure

Self-host

131K

OpenAI's most powerful open-weight reasoning model; 117B parameters with 5.1B active parameters and Apache 2.0 licensing.

Self-hosted or third-party-hosted cost depends on infrastructure. Official OpenAI pricing page does not list a token API rate for this model.

Docs

OpenAI

gpt-oss-20b

Open Weight

Local and lower-infrastructure open-weight reasoning experiments

Self-host

131K

Smaller OpenAI open-weight reasoning model released with the gpt-oss family under Apache 2.0 licensing.

Self-hosted or third-party-hosted cost depends on infrastructure. Official OpenAI pricing page does not list a token API rate for this model.

Docs

Google

Gemma 4 31B

Open Weight

Open-weight Google model for local or private-cloud reasoning and coding

Self-host

256K

Google DeepMind's 31B dense Gemma 4 open-weight model for coding, reasoning, and multimodal understanding.

Self-hosted or Vertex/third-party deployment cost depends on infrastructure; Gemma 4 is distributed as open weights.

Docs

Meta

Llama 4 Scout

Open Weight

Very long-context open-weight multimodal workloads

Self-host

10M

Meta's 17B-active, 109B-total open-weight multimodal MoE model with a very long advertised context window.

Self-hosted or partner-hosted cost depends on infrastructure. Distributed under the Llama 4 Community License.

Docs

Meta

Llama 4 Maverick

Open Weight

High-quality open-weight multimodal chat, reasoning, and coding

Self-host

Meta's 17B-active, 400B-total open-weight multimodal MoE model focused on strong quality-to-cost performance.

Self-hosted or partner-hosted cost depends on infrastructure. Distributed under the Llama 4 Community License.

Docs

Qwen

Qwen3-235B-A22B-Instruct-2507

Open Weight

Open-weight multilingual instruction following, coding, and tool-use workloads

Self-host

256K

Updated Qwen3 235B MoE instruct model with 22B active parameters, Apache 2.0 licensing, and improved long-context understanding.

Self-hosted or third-party-hosted cost depends on infrastructure.

Docs

How to Use the AI Model Comparison Table

Enter your values

Open the AI Model Comparison Table and fill in the required input fields with your numbers or selections.

Review the calculation

The tool automatically computes the result as you type. Double-check your inputs to ensure accuracy.

Interpret your results

Review the calculated output along with any breakdowns, charts, or explanations provided to understand what the numbers mean for your situation.

Frequently Asked Questions

What is the AI Model Comparison Table?

How do I use the AI Model Comparison Table?

Is the AI Model Comparison Table free to use?

How accurate is the AI Model Comparison Table?

AI Model Comparison

27 models from 8 providers — 16 proprietary, 11 open weight. Prices are API costs per 1M tokens from official sources. Last verified: 2026-06-20.

Lowest API Cost

Mistral Mistral Small 4

$0.40

1M input + 1M output tokens

Highest API Cost

OpenAI GPT-5.5 Pro

$210.00

1M input + 1M output tokens

Open-Weight Options

11 listed models

Self-hosting costs vary; hosted API prices appear where official

Provider	Model	Type	Best For	Input	Output	Total	Context	Provider Notes	Links
Mistral	Mistral Small 4	Open Weight	Low-cost open-weight routing, extraction, and multimodal tasks	$0.10	$0.30	$0.40 per 2M tokens	Not listed	Mistral's Apache 2.0 open model for lightweight multimodal and multilingual agentic workloads. Listed at $0.10 input / $0.30 output per 1M tokens on Mistral's pricing page.	Docs Pricing
DeepSeek	DeepSeek V4 Flash	Open Weight	Ultra-low-cost production traffic and large-context experimentation	$0.14	$0.28	$0.42 per 2M tokens	1M	DeepSeek's cost-focused V4 model with thinking and non-thinking modes plus MIT-licensed weights. Hosted API cache-miss input price shown. Cache-hit input is $0.0028 per 1M tokens.	Docs Pricing
DeepSeek	DeepSeek V4 Pro	Open Weight	Very low-cost open-weight reasoning and long-context API workloads	$0.43	$0.87	$1.30 per 2M tokens	1M	DeepSeek's higher-concurrency V4 model with thinking mode, JSON output, tool calls, and MIT-licensed weights. Hosted API cache-miss input price shown. Cache-hit input is $0.003625 per 1M tokens.	Docs Pricing
Google	Gemini 3.1 Flash-Lite	Proprietary	Budget Gemini routing, translation, and lightweight extraction	$0.25	$1.50	$1.75 per 2M tokens	Gemini 3 family	Google's most cost-efficient Gemini 3 model for high-volume agentic tasks, translation, and simple data processing. Standard paid-tier text/image/video rate. Audio input is $0.50 per 1M tokens.	Docs Pricing
Mistral	Mistral Large 3	Open Weight	Open-weight production use with low hosted API cost	$0.50	$1.50	$2.00 per 2M tokens	Not listed	Mistral describes Large 3 as an open-weight, general-purpose, flagship multimodal and multilingual model. Hosted API rate for mistral-large-latest; self-hosting costs depend on infrastructure.	Docs Pricing
xAI	Grok Build 0.1	Proprietary	Software-building workflows and lower-cost xAI usage	$1.00	$2.00	$3.00 per 2M tokens	256K	xAI's build-focused model trained specifically for agentic coding workflows. Includes $0.20 cached input pricing.	Docs Pricing
Google	Gemini 3 Flash Preview	Proprietary	Cost-efficient Gemini 3 apps that need speed and multimodal support	$0.50	$3.00	$3.50 per 2M tokens	Gemini 3 family	Google's speed-focused Gemini 3 preview model with text, image, video, and audio input pricing. Standard paid-tier rate for text/image/video input. Audio input is $1 per 1M tokens.	Docs Pricing
xAI	Grok 4.3	Proprietary	Low-cost frontier-style reasoning with a large context window	$1.25	$2.50	$3.75 per 2M tokens	1M	xAI's current general model with agentic tool calling, minimal hallucinations, and configurable reasoning. Includes $0.20 cached input pricing. Same listed rate as Grok 4.20 reasoning and non-reasoning variants.	Docs Pricing
OpenAI	GPT-5.4 Mini	Proprietary	High-volume OpenAI workloads that still need strong general quality	$0.75	$4.50	$5.25 per 2M tokens	400K	OpenAI's strongest mini model for lower-latency, lower-cost workloads. Standard API rate. Batch and Flex reduce this to $0.375 input / $2.25 output per 1M tokens.	Docs Pricing
Anthropic	Claude Haiku 4.5	Proprietary	Fast, high-volume Claude tasks, routing, chat, and extraction	$1.00	$5.00	$6.00 per 2M tokens	200K	Anthropic lists Haiku 4.5 as the fastest Claude model with near-frontier intelligence. Lowest current Claude text model rate in the latest comparison table.	Docs Pricing
Mistral	Magistral Medium	Proprietary	Reasoning-heavy Mistral workloads with lower output cost than many frontier models	$2.00	$5.00	$7.00 per 2M tokens	Not listed	Mistral's premier thinking model for domain-specific, transparent, and multilingual reasoning. Listed as magistral-medium-latest on Mistral's pricing page.	Docs Pricing
Mistral	Mistral Medium 3.5	Open Weight	European open-weight option for reasoning, coding, and agent workflows	$1.50	$7.50	$9.00 per 2M tokens	Not listed	Mistral's current open model for state-of-the-art performance, enterprise deployment, reasoning, coding, and agents. Listed at $1.50 input / $7.50 output per 1M tokens on Mistral's pricing page.	Docs Pricing
Google	Gemini 3.5 Flash	Proprietary	Fast multimodal apps, search-grounded work, and high-throughput reasoning	$1.50	$9.00	$10.50 per 2M tokens	Gemini 3 family	Google's most intelligent speed-focused stable Gemini 3 model with frontier intelligence and grounding. Standard paid-tier rate. Priority is $2.70 input / $16.20 output per 1M tokens.	Docs Pricing
Google	Gemini 2.5 Pro	Proprietary	Stable long-context Gemini workloads and complex reasoning	$1.25	$10.00	$11.25 per 2M tokens	1M	Google still lists Gemini 2.5 Pro as an advanced model for complex tasks, deep reasoning, and coding. Rate for prompts <= 200K tokens. Prompts > 200K are $2.50 input / $15 output per 1M tokens.	Docs Pricing
Google	Gemini 3.1 Pro Preview	Proprietary	Long-context multimodal reasoning and Google ecosystem integrations	$2.00	$12.00	$14.00 per 2M tokens	1M	Google's Gemini 3.1 Pro preview for multimodal understanding, agentic capability, and coding. Standard rate for prompts <= 200K tokens. Prompts > 200K are $4 input / $18 output per 1M tokens.	Docs Pricing
OpenAI	GPT-5.4	Proprietary	Lower-cost OpenAI frontier workloads and production agents	$2.50	$15.00	$17.50 per 2M tokens	1M	OpenAI's more affordable frontier model for coding and professional work. Standard short-context rate. Long-context rate is $5 input / $22.50 output per 1M tokens.	Docs Pricing
Anthropic	Claude Sonnet 4.6	Proprietary	Production agents, coding assistants, analysis, and quality writing	$3.00	$15.00	$18.00 per 2M tokens	1M	Anthropic describes Sonnet 4.6 as the best combination of speed and intelligence. Current Claude mid-tier flagship rate. Supports extended thinking and priority tier.	Docs Pricing
Anthropic	Claude Opus 4.8	Proprietary	Premium coding, deep reasoning, and enterprise knowledge work	$5.00	$25.00	$30.00 per 2M tokens	1M	Anthropic's most capable Opus-tier model for complex reasoning and agentic coding. Model table lists $5 input / $25 output per million tokens. Microsoft Foundry context may be 200K.	Docs Pricing
OpenAI	GPT-5.5	Proprietary	Frontier OpenAI reasoning, agent workflows, and high-quality generation	$5.00	$30.00	$35.00 per 2M tokens	1M	OpenAI's flagship model for complex reasoning, coding, and professional work. Standard short-context rate. Long-context rate is $10 input / $45 output per 1M tokens.	Docs Pricing
Anthropic	Claude Fable 5	Proprietary	Highest-end Anthropic reasoning, coding, and long-running agent tasks	$10.00	$50.00	$60.00 per 2M tokens	1M	Anthropic's most capable widely released Claude model for demanding reasoning and long-horizon agentic work. Generally available on the Claude API as of June 9, 2026. Mythos 5 has the same listed rate but limited availability.	Docs Pricing
OpenAI	GPT-5.5 Pro	Proprietary	Most demanding OpenAI tasks where quality matters more than cost	$30.00	$180.00	$210.00 per 2M tokens	Short / long	Highest-priced OpenAI Pro-tier frontier model on the official API pricing page. Standard short-context rate. Long-context rate is $60 input / $270 output per 1M tokens.	Docs Pricing
OpenAI	gpt-oss-120b	Open Weight	Customizable open-weight reasoning on H100-class infrastructure	Self-host	Self-host	Self-host	131K	OpenAI's most powerful open-weight reasoning model; 117B parameters with 5.1B active parameters and Apache 2.0 licensing. Self-hosted or third-party-hosted cost depends on infrastructure. Official OpenAI pricing page does not list a token API rate for this model.	Docs
OpenAI	gpt-oss-20b	Open Weight	Local and lower-infrastructure open-weight reasoning experiments	Self-host	Self-host	Self-host	131K	Smaller OpenAI open-weight reasoning model released with the gpt-oss family under Apache 2.0 licensing. Self-hosted or third-party-hosted cost depends on infrastructure. Official OpenAI pricing page does not list a token API rate for this model.	Docs
Google	Gemma 4 31B	Open Weight	Open-weight Google model for local or private-cloud reasoning and coding	Self-host	Self-host	Self-host	256K	Google DeepMind's 31B dense Gemma 4 open-weight model for coding, reasoning, and multimodal understanding. Self-hosted or Vertex/third-party deployment cost depends on infrastructure; Gemma 4 is distributed as open weights.	Docs
Meta	Llama 4 Scout	Open Weight	Very long-context open-weight multimodal workloads	Self-host	Self-host	Self-host	10M	Meta's 17B-active, 109B-total open-weight multimodal MoE model with a very long advertised context window. Self-hosted or partner-hosted cost depends on infrastructure. Distributed under the Llama 4 Community License.	Docs
Meta	Llama 4 Maverick	Open Weight	High-quality open-weight multimodal chat, reasoning, and coding	Self-host	Self-host	Self-host	1M	Meta's 17B-active, 400B-total open-weight multimodal MoE model focused on strong quality-to-cost performance. Self-hosted or partner-hosted cost depends on infrastructure. Distributed under the Llama 4 Community License.	Docs
Qwen	Qwen3-235B-A22B-Instruct-2507	Open Weight	Open-weight multilingual instruction following, coding, and tool-use workloads	Self-host	Self-host	Self-host	256K	Updated Qwen3 235B MoE instruct model with 22B active parameters, Apache 2.0 licensing, and improved long-context understanding. Self-hosted or third-party-hosted cost depends on infrastructure.	Docs

Provider

Model

Type

Best For

Input

Output

Total

Context

Provider Notes

Links

Mistral

Mistral Small 4

Open Weight

Low-cost open-weight routing, extraction, and multimodal tasks

$0.10

$0.30

$0.40

per 2M tokens

Not listed

Mistral's Apache 2.0 open model for lightweight multimodal and multilingual agentic workloads.

Listed at $0.10 input / $0.30 output per 1M tokens on Mistral's pricing page.

Docs Pricing

DeepSeek

DeepSeek V4 Flash

Open Weight

Ultra-low-cost production traffic and large-context experimentation

$0.14

$0.28

$0.42

per 2M tokens

DeepSeek's cost-focused V4 model with thinking and non-thinking modes plus MIT-licensed weights.

Hosted API cache-miss input price shown. Cache-hit input is $0.0028 per 1M tokens.

Docs Pricing

DeepSeek

DeepSeek V4 Pro

Open Weight

Very low-cost open-weight reasoning and long-context API workloads

$0.43

$0.87

$1.30

per 2M tokens

DeepSeek's higher-concurrency V4 model with thinking mode, JSON output, tool calls, and MIT-licensed weights.

Hosted API cache-miss input price shown. Cache-hit input is $0.003625 per 1M tokens.

Docs Pricing

Google

Gemini 3.1 Flash-Lite

Proprietary

Budget Gemini routing, translation, and lightweight extraction

$0.25

$1.50

$1.75

per 2M tokens

Gemini 3 family

Google's most cost-efficient Gemini 3 model for high-volume agentic tasks, translation, and simple data processing.

Standard paid-tier text/image/video rate. Audio input is $0.50 per 1M tokens.

Docs Pricing

Mistral

Mistral Large 3

Open Weight

Open-weight production use with low hosted API cost

$0.50

$1.50

$2.00

per 2M tokens

Not listed

Mistral describes Large 3 as an open-weight, general-purpose, flagship multimodal and multilingual model.

Hosted API rate for mistral-large-latest; self-hosting costs depend on infrastructure.

Docs Pricing

xAI

Grok Build 0.1

Proprietary

Software-building workflows and lower-cost xAI usage

$1.00

$2.00

$3.00

per 2M tokens

256K

xAI's build-focused model trained specifically for agentic coding workflows.

Includes $0.20 cached input pricing.

Docs Pricing

Google

Gemini 3 Flash Preview

Proprietary

Cost-efficient Gemini 3 apps that need speed and multimodal support

$0.50

$3.00

$3.50

per 2M tokens

Gemini 3 family

Google's speed-focused Gemini 3 preview model with text, image, video, and audio input pricing.

Standard paid-tier rate for text/image/video input. Audio input is $1 per 1M tokens.

Docs Pricing

xAI

Grok 4.3

Proprietary

Low-cost frontier-style reasoning with a large context window

$1.25

$2.50

$3.75

per 2M tokens

xAI's current general model with agentic tool calling, minimal hallucinations, and configurable reasoning.

Includes $0.20 cached input pricing. Same listed rate as Grok 4.20 reasoning and non-reasoning variants.

Docs Pricing

OpenAI

GPT-5.4 Mini

Proprietary

High-volume OpenAI workloads that still need strong general quality

$0.75

$4.50

$5.25

per 2M tokens

400K

OpenAI's strongest mini model for lower-latency, lower-cost workloads.

Standard API rate. Batch and Flex reduce this to $0.375 input / $2.25 output per 1M tokens.

Docs Pricing

Anthropic

Claude Haiku 4.5

Proprietary

Fast, high-volume Claude tasks, routing, chat, and extraction

$1.00

$5.00

$6.00

per 2M tokens

200K

Anthropic lists Haiku 4.5 as the fastest Claude model with near-frontier intelligence.

Lowest current Claude text model rate in the latest comparison table.

Docs Pricing

Mistral

Magistral Medium

Proprietary

Reasoning-heavy Mistral workloads with lower output cost than many frontier models

$2.00

$5.00

$7.00

per 2M tokens

Not listed

Mistral's premier thinking model for domain-specific, transparent, and multilingual reasoning.

Listed as magistral-medium-latest on Mistral's pricing page.

Docs Pricing

Mistral

Mistral Medium 3.5

Open Weight

European open-weight option for reasoning, coding, and agent workflows

$1.50

$7.50

$9.00

per 2M tokens

Not listed

Mistral's current open model for state-of-the-art performance, enterprise deployment, reasoning, coding, and agents.

Listed at $1.50 input / $7.50 output per 1M tokens on Mistral's pricing page.

Docs Pricing

Google

Gemini 3.5 Flash

Proprietary

Fast multimodal apps, search-grounded work, and high-throughput reasoning

$1.50

$9.00

$10.50

per 2M tokens

Gemini 3 family

Google's most intelligent speed-focused stable Gemini 3 model with frontier intelligence and grounding.

Standard paid-tier rate. Priority is $2.70 input / $16.20 output per 1M tokens.

Docs Pricing

Google

Gemini 2.5 Pro

Proprietary

Stable long-context Gemini workloads and complex reasoning

$1.25

$10.00

$11.25

per 2M tokens

Google still lists Gemini 2.5 Pro as an advanced model for complex tasks, deep reasoning, and coding.

Rate for prompts <= 200K tokens. Prompts > 200K are $2.50 input / $15 output per 1M tokens.

Docs Pricing

Google

Gemini 3.1 Pro Preview

Proprietary

Long-context multimodal reasoning and Google ecosystem integrations

$2.00

$12.00

$14.00

per 2M tokens

Google's Gemini 3.1 Pro preview for multimodal understanding, agentic capability, and coding.

Standard rate for prompts <= 200K tokens. Prompts > 200K are $4 input / $18 output per 1M tokens.

Docs Pricing

OpenAI

GPT-5.4

Proprietary

Lower-cost OpenAI frontier workloads and production agents

$2.50

$15.00

$17.50

per 2M tokens

OpenAI's more affordable frontier model for coding and professional work.

Standard short-context rate. Long-context rate is $5 input / $22.50 output per 1M tokens.

Docs Pricing

Anthropic

Claude Sonnet 4.6

Proprietary

Production agents, coding assistants, analysis, and quality writing

$3.00

$15.00

$18.00

per 2M tokens

Anthropic describes Sonnet 4.6 as the best combination of speed and intelligence.

Current Claude mid-tier flagship rate. Supports extended thinking and priority tier.

Docs Pricing

Anthropic

Claude Opus 4.8

Proprietary

Premium coding, deep reasoning, and enterprise knowledge work

$5.00

$25.00

$30.00

per 2M tokens

Anthropic's most capable Opus-tier model for complex reasoning and agentic coding.

Model table lists $5 input / $25 output per million tokens. Microsoft Foundry context may be 200K.

Docs Pricing

OpenAI

GPT-5.5

Proprietary

Frontier OpenAI reasoning, agent workflows, and high-quality generation

$5.00

$30.00

$35.00

per 2M tokens

OpenAI's flagship model for complex reasoning, coding, and professional work.

Standard short-context rate. Long-context rate is $10 input / $45 output per 1M tokens.

Docs Pricing

Anthropic

Claude Fable 5

Proprietary

Highest-end Anthropic reasoning, coding, and long-running agent tasks

$10.00

$50.00

$60.00

per 2M tokens

Anthropic's most capable widely released Claude model for demanding reasoning and long-horizon agentic work.

Generally available on the Claude API as of June 9, 2026. Mythos 5 has the same listed rate but limited availability.

Docs Pricing

OpenAI

GPT-5.5 Pro

Proprietary

Most demanding OpenAI tasks where quality matters more than cost

$30.00

$180.00

$210.00

per 2M tokens

Short / long

Highest-priced OpenAI Pro-tier frontier model on the official API pricing page.

Standard short-context rate. Long-context rate is $60 input / $270 output per 1M tokens.

Docs Pricing

OpenAI

gpt-oss-120b

Open Weight

Customizable open-weight reasoning on H100-class infrastructure

Self-host

131K

OpenAI's most powerful open-weight reasoning model; 117B parameters with 5.1B active parameters and Apache 2.0 licensing.

Self-hosted or third-party-hosted cost depends on infrastructure. Official OpenAI pricing page does not list a token API rate for this model.

Docs

OpenAI

gpt-oss-20b

Open Weight

Local and lower-infrastructure open-weight reasoning experiments

Self-host

131K

Smaller OpenAI open-weight reasoning model released with the gpt-oss family under Apache 2.0 licensing.

Self-hosted or third-party-hosted cost depends on infrastructure. Official OpenAI pricing page does not list a token API rate for this model.

Docs

Google

Gemma 4 31B

Open Weight

Open-weight Google model for local or private-cloud reasoning and coding

Self-host

256K

Google DeepMind's 31B dense Gemma 4 open-weight model for coding, reasoning, and multimodal understanding.

Self-hosted or Vertex/third-party deployment cost depends on infrastructure; Gemma 4 is distributed as open weights.

Docs

Meta

Llama 4 Scout

Open Weight

Very long-context open-weight multimodal workloads

Self-host

10M

Meta's 17B-active, 109B-total open-weight multimodal MoE model with a very long advertised context window.

Self-hosted or partner-hosted cost depends on infrastructure. Distributed under the Llama 4 Community License.

Docs

Meta

Llama 4 Maverick

Open Weight

High-quality open-weight multimodal chat, reasoning, and coding

Self-host

Meta's 17B-active, 400B-total open-weight multimodal MoE model focused on strong quality-to-cost performance.

Self-hosted or partner-hosted cost depends on infrastructure. Distributed under the Llama 4 Community License.

Docs

Qwen

Qwen3-235B-A22B-Instruct-2507

Open Weight

Open-weight multilingual instruction following, coding, and tool-use workloads

Self-host

256K

Updated Qwen3 235B MoE instruct model with 22B active parameters, Apache 2.0 licensing, and improved long-context understanding.

Self-hosted or third-party-hosted cost depends on infrastructure.

Docs

AI Model Comparison Table

How to Use the AI Model Comparison Table

Frequently Asked Questions

Explore Related Resources

Embed This Tool

Continue Through Topic Hubs

Business and AI Economics Hub

Recommended Next Steps

Related Tools

AI Model Comparison Table