Compare proprietary and open-weight AI models in one up-to-date table covering API pricing, context, best-fit use cases, and official docs.
Enter your values
Open the AI Model Comparison Table and fill in the required input fields with your numbers or selections.
Review the calculation
The tool automatically computes the result as you type. Double-check your inputs to ensure accuracy.
Interpret your results
Review the calculated output along with any breakdowns, charts, or explanations provided to understand what the numbers mean for your situation.
Go deeper with workflow guides, side-by-side comparisons, and reusable embeds connected to this tool.
Add this calculator to your website with a simple iframe.
This tool is part of larger workflows. Open a hub to continue with the next relevant tools.
Continue your workflow with these tools from the same playbook.
AI vs Human Cost Calculator
Compare the cost of completing a task with AI tools versus hiring a human. Includes API costs, time savings, and ROI.
AI Automation Savings Calculator
Estimate time and cost savings from automating recurring business tasks with AI and workflow tools.
Customer Lifetime Value Calculator
Estimate customer lifetime value using monthly revenue, gross margin, and churn. Includes LTV:CAC ratio.
AI Agent Cost Estimator
Estimate monthly AI agent cost from model usage, tools, orchestration, and support overhead.
AI Automation Savings Calculator
Estimate time and cost savings from automating recurring business tasks with AI and workflow tools.
AI Business Case Generator
Generate a structured AI implementation business case with ROI, payback, risks, and timeline.
AI Carbon Footprint Calculator
Estimate annual carbon footprint of model usage based on query volume and usage patterns.
AI Chatbot ROI Calculator
Calculate expected ROI from deploying an AI chatbot for support, lead capture, and customer engagement.
AI Compliance Cost Calculator
Estimate AI compliance and governance costs by use case risk level and organizational scope.
AI Content Cost Calculator
Calculate the cost of producing content with AI versus hiring writers. Compare per-article costs and quality trade-offs.
AI Job Impact Analyzer
Analyze how AI will affect your job across 27 roles and 15 industries. See task-level automation risk, AI tools to learn, emerging roles, and a personalized action plan.
Next Step
Continue with AI vs Human Cost Calculator
27 models from 8 providers — 16 proprietary, 11 open weight. Prices are API costs per 1M tokens from official sources. Last verified: 2026-06-20.
Lowest API Cost
Mistral Mistral Small 4
$0.40
1M input + 1M output tokens
Highest API Cost
OpenAI GPT-5.5 Pro
$210.00
1M input + 1M output tokens
Open-Weight Options
11 listed models
11
Self-hosting costs vary; hosted API prices appear where official
| Provider | Model | Type | Best For | Input | Output | Total | Context | Provider Notes | Links |
|---|---|---|---|---|---|---|---|---|---|
| Mistral | Mistral Small 4 | Open Weight | Low-cost open-weight routing, extraction, and multimodal tasks | $0.10 | $0.30 | $0.40 per 2M tokens | Not listed | Mistral's Apache 2.0 open model for lightweight multimodal and multilingual agentic workloads. Listed at $0.10 input / $0.30 output per 1M tokens on Mistral's pricing page. | |
| DeepSeek | DeepSeek V4 Flash | Open Weight | Ultra-low-cost production traffic and large-context experimentation | $0.14 | $0.28 | $0.42 per 2M tokens | 1M | DeepSeek's cost-focused V4 model with thinking and non-thinking modes plus MIT-licensed weights. Hosted API cache-miss input price shown. Cache-hit input is $0.0028 per 1M tokens. | |
| DeepSeek | DeepSeek V4 Pro | Open Weight | Very low-cost open-weight reasoning and long-context API workloads | $0.43 | $0.87 | $1.30 per 2M tokens | 1M | DeepSeek's higher-concurrency V4 model with thinking mode, JSON output, tool calls, and MIT-licensed weights. Hosted API cache-miss input price shown. Cache-hit input is $0.003625 per 1M tokens. | |
| Gemini 3.1 Flash-Lite | Proprietary | Budget Gemini routing, translation, and lightweight extraction | $0.25 | $1.50 | $1.75 per 2M tokens | Gemini 3 family | Google's most cost-efficient Gemini 3 model for high-volume agentic tasks, translation, and simple data processing. Standard paid-tier text/image/video rate. Audio input is $0.50 per 1M tokens. | ||
| Mistral | Mistral Large 3 | Open Weight | Open-weight production use with low hosted API cost | $0.50 | $1.50 | $2.00 per 2M tokens | Not listed | Mistral describes Large 3 as an open-weight, general-purpose, flagship multimodal and multilingual model. Hosted API rate for mistral-large-latest; self-hosting costs depend on infrastructure. | |
| xAI | Grok Build 0.1 | Proprietary | Software-building workflows and lower-cost xAI usage | $1.00 | $2.00 | $3.00 per 2M tokens | 256K | xAI's build-focused model trained specifically for agentic coding workflows. Includes $0.20 cached input pricing. | |
| Gemini 3 Flash Preview | Proprietary | Cost-efficient Gemini 3 apps that need speed and multimodal support | $0.50 | $3.00 | $3.50 per 2M tokens | Gemini 3 family | Google's speed-focused Gemini 3 preview model with text, image, video, and audio input pricing. Standard paid-tier rate for text/image/video input. Audio input is $1 per 1M tokens. | ||
| xAI | Grok 4.3 | Proprietary | Low-cost frontier-style reasoning with a large context window | $1.25 | $2.50 | $3.75 per 2M tokens | 1M | xAI's current general model with agentic tool calling, minimal hallucinations, and configurable reasoning. Includes $0.20 cached input pricing. Same listed rate as Grok 4.20 reasoning and non-reasoning variants. | |
| OpenAI | GPT-5.4 Mini | Proprietary | High-volume OpenAI workloads that still need strong general quality | $0.75 | $4.50 | $5.25 per 2M tokens | 400K | OpenAI's strongest mini model for lower-latency, lower-cost workloads. Standard API rate. Batch and Flex reduce this to $0.375 input / $2.25 output per 1M tokens. | |
| Anthropic | Claude Haiku 4.5 | Proprietary | Fast, high-volume Claude tasks, routing, chat, and extraction | $1.00 | $5.00 | $6.00 per 2M tokens | 200K | Anthropic lists Haiku 4.5 as the fastest Claude model with near-frontier intelligence. Lowest current Claude text model rate in the latest comparison table. | |
| Mistral | Magistral Medium | Proprietary | Reasoning-heavy Mistral workloads with lower output cost than many frontier models | $2.00 | $5.00 | $7.00 per 2M tokens | Not listed | Mistral's premier thinking model for domain-specific, transparent, and multilingual reasoning. Listed as magistral-medium-latest on Mistral's pricing page. | |
| Mistral | Mistral Medium 3.5 | Open Weight | European open-weight option for reasoning, coding, and agent workflows | $1.50 | $7.50 | $9.00 per 2M tokens | Not listed | Mistral's current open model for state-of-the-art performance, enterprise deployment, reasoning, coding, and agents. Listed at $1.50 input / $7.50 output per 1M tokens on Mistral's pricing page. | |
| Gemini 3.5 Flash | Proprietary | Fast multimodal apps, search-grounded work, and high-throughput reasoning | $1.50 | $9.00 | $10.50 per 2M tokens | Gemini 3 family | Google's most intelligent speed-focused stable Gemini 3 model with frontier intelligence and grounding. Standard paid-tier rate. Priority is $2.70 input / $16.20 output per 1M tokens. | ||
| Gemini 2.5 Pro | Proprietary | Stable long-context Gemini workloads and complex reasoning | $1.25 | $10.00 | $11.25 per 2M tokens | 1M | Google still lists Gemini 2.5 Pro as an advanced model for complex tasks, deep reasoning, and coding. Rate for prompts <= 200K tokens. Prompts > 200K are $2.50 input / $15 output per 1M tokens. | ||
| Gemini 3.1 Pro Preview | Proprietary | Long-context multimodal reasoning and Google ecosystem integrations | $2.00 | $12.00 | $14.00 per 2M tokens | 1M | Google's Gemini 3.1 Pro preview for multimodal understanding, agentic capability, and coding. Standard rate for prompts <= 200K tokens. Prompts > 200K are $4 input / $18 output per 1M tokens. | ||
| OpenAI | GPT-5.4 | Proprietary | Lower-cost OpenAI frontier workloads and production agents | $2.50 | $15.00 | $17.50 per 2M tokens | 1M | OpenAI's more affordable frontier model for coding and professional work. Standard short-context rate. Long-context rate is $5 input / $22.50 output per 1M tokens. | |
| Anthropic | Claude Sonnet 4.6 | Proprietary | Production agents, coding assistants, analysis, and quality writing | $3.00 | $15.00 | $18.00 per 2M tokens | 1M | Anthropic describes Sonnet 4.6 as the best combination of speed and intelligence. Current Claude mid-tier flagship rate. Supports extended thinking and priority tier. | |
| Anthropic | Claude Opus 4.8 | Proprietary | Premium coding, deep reasoning, and enterprise knowledge work | $5.00 | $25.00 | $30.00 per 2M tokens | 1M | Anthropic's most capable Opus-tier model for complex reasoning and agentic coding. Model table lists $5 input / $25 output per million tokens. Microsoft Foundry context may be 200K. | |
| OpenAI | GPT-5.5 | Proprietary | Frontier OpenAI reasoning, agent workflows, and high-quality generation | $5.00 | $30.00 | $35.00 per 2M tokens | 1M | OpenAI's flagship model for complex reasoning, coding, and professional work. Standard short-context rate. Long-context rate is $10 input / $45 output per 1M tokens. | |
| Anthropic | Claude Fable 5 | Proprietary | Highest-end Anthropic reasoning, coding, and long-running agent tasks | $10.00 | $50.00 | $60.00 per 2M tokens | 1M | Anthropic's most capable widely released Claude model for demanding reasoning and long-horizon agentic work. Generally available on the Claude API as of June 9, 2026. Mythos 5 has the same listed rate but limited availability. | |
| OpenAI | GPT-5.5 Pro | Proprietary | Most demanding OpenAI tasks where quality matters more than cost | $30.00 | $180.00 | $210.00 per 2M tokens | Short / long | Highest-priced OpenAI Pro-tier frontier model on the official API pricing page. Standard short-context rate. Long-context rate is $60 input / $270 output per 1M tokens. | |
| OpenAI | gpt-oss-120b | Open Weight | Customizable open-weight reasoning on H100-class infrastructure | Self-host | Self-host | Self-host | 131K | OpenAI's most powerful open-weight reasoning model; 117B parameters with 5.1B active parameters and Apache 2.0 licensing. Self-hosted or third-party-hosted cost depends on infrastructure. Official OpenAI pricing page does not list a token API rate for this model. | |
| OpenAI | gpt-oss-20b | Open Weight | Local and lower-infrastructure open-weight reasoning experiments | Self-host | Self-host | Self-host | 131K | Smaller OpenAI open-weight reasoning model released with the gpt-oss family under Apache 2.0 licensing. Self-hosted or third-party-hosted cost depends on infrastructure. Official OpenAI pricing page does not list a token API rate for this model. | |
| Gemma 4 31B | Open Weight | Open-weight Google model for local or private-cloud reasoning and coding | Self-host | Self-host | Self-host | 256K | Google DeepMind's 31B dense Gemma 4 open-weight model for coding, reasoning, and multimodal understanding. Self-hosted or Vertex/third-party deployment cost depends on infrastructure; Gemma 4 is distributed as open weights. | ||
| Meta | Llama 4 Scout | Open Weight | Very long-context open-weight multimodal workloads | Self-host | Self-host | Self-host | 10M | Meta's 17B-active, 109B-total open-weight multimodal MoE model with a very long advertised context window. Self-hosted or partner-hosted cost depends on infrastructure. Distributed under the Llama 4 Community License. | |
| Meta | Llama 4 Maverick | Open Weight | High-quality open-weight multimodal chat, reasoning, and coding | Self-host | Self-host | Self-host | 1M | Meta's 17B-active, 400B-total open-weight multimodal MoE model focused on strong quality-to-cost performance. Self-hosted or partner-hosted cost depends on infrastructure. Distributed under the Llama 4 Community License. | |
| Qwen | Qwen3-235B-A22B-Instruct-2507 | Open Weight | Open-weight multilingual instruction following, coding, and tool-use workloads | Self-host | Self-host | Self-host | 256K | Updated Qwen3 235B MoE instruct model with 22B active parameters, Apache 2.0 licensing, and improved long-context understanding. Self-hosted or third-party-hosted cost depends on infrastructure. |