Cost Breakdown by Model
| Model | Requests/mo | Input Cost | Output Cost | Total |
|---|---|---|---|---|
These are estimates. Want to see your real costs?
Preto analyzes your actual LLM traffic and shows exactly where you're overspending. One URL change. No code rewrite.
See My Real Costs — Free
Works with OpenAI, Anthropic, and Google. No credit card required.
How to Estimate Your LLM API Costs
LLM API pricing is based on tokens — the chunks of text that models process. Every API call has two cost components: input tokens (your prompt) and output tokens (the model's response). Output tokens are typically 2-5x more expensive because they require more computation.
The Cost Formula
Your monthly LLM cost comes down to three numbers: how many requests you send per day, how many tokens each request uses, and which model you're calling.
Monthly cost = (daily requests × 30) × (avg tokens per request ÷ 1,000,000) × price per million tokens
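The formula above can be sketched in a few lines of Python, applying it separately to input and output tokens since they're priced differently. The traffic numbers below are illustrative; the per-million-token prices are GPT-4o's rates from this page.

```python
def monthly_cost(daily_requests: int,
                 avg_input_tokens: int,
                 avg_output_tokens: int,
                 input_price_per_m: float,
                 output_price_per_m: float) -> float:
    """Estimate monthly spend in dollars for one model."""
    monthly_requests = daily_requests * 30
    input_cost = monthly_requests * avg_input_tokens / 1_000_000 * input_price_per_m
    output_cost = monthly_requests * avg_output_tokens / 1_000_000 * output_price_per_m
    return input_cost + output_cost

# Example: 5,000 requests/day, 1,200 input + 300 output tokens, GPT-4o pricing
print(monthly_cost(5_000, 1_200, 300, 2.50, 10.00))  # → 900.0
```

Note that even with 4x more input tokens than output tokens, output accounts for half the bill here because of its 4x higher per-token price.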
For most production apps, the total is higher than expected. The average AI startup spends $5K-$35K per month on LLM APIs, and 40% of that spend is typically waste — frontier models running tasks that cheaper models handle equally well.
Where Teams Overspend
- Model mismatch: Using GPT-4o ($10/M output) for tasks that GPT-4o mini ($0.60/M output) handles with identical quality
- Duplicate prompts: The average production app sends 15% identical prompts that could be cached
- Prompt bloat: System prompts that grow over time, adding tokens without adding value
- No per-feature tracking: Impossible to know which features drive which costs
From Estimates to Real Data
This calculator gives you a directional estimate. For real cost intelligence — per-request tracking, model routing recommendations, and automated savings — try Preto free. One URL change, no code rewrite, and you'll see exactly where every AI dollar goes.
Frequently Asked Questions
How much does it cost to use OpenAI's API?
OpenAI API costs vary by model. GPT-4o costs $2.50 per million input tokens and $10.00 per million output tokens. GPT-4o mini costs $0.15/$0.60. GPT-5 costs $10.00/$30.00. Use the calculator above to estimate your monthly bill based on your actual usage patterns.
How do I estimate my monthly LLM API costs?
Multiply your daily request count by 30, then multiply by your average input and output tokens per request. Divide by 1 million and multiply by the model's per-million-token price. The calculator above does this automatically for any model and supports multiple models at once.
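The steps above, written out as a worked example. The traffic figures are assumed; the prices are GPT-4o mini's rates from this page ($0.15 input / $0.60 output per million tokens).

```python
daily_requests = 2_000
avg_input_tokens, avg_output_tokens = 800, 200

monthly_requests = daily_requests * 30  # 60,000 requests/month
input_cost = monthly_requests * avg_input_tokens / 1_000_000 * 0.15
output_cost = monthly_requests * avg_output_tokens / 1_000_000 * 0.60
print(f"${input_cost + output_cost:.2f}/month")  # → $14.40/month
```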
Which LLM API is cheapest?
For most tasks, Google Gemini 2.0 Flash ($0.10/$0.40 per million tokens) and OpenAI GPT-4o mini ($0.15/$0.60) are the cheapest production-quality options. DeepSeek V3 ($0.27/$1.10) offers strong performance at low cost. The right choice depends on your quality requirements — cheaper models work well for classification, extraction, and summarization.
How can I reduce my LLM API costs?
The top strategies are: route simple tasks to cheaper models (saves 40-70%), cache repeated prompts (saves 10-30%), optimize prompt length (saves 15-25%), and use batch APIs for non-urgent requests (saves 50%). Preto.ai identifies these opportunities automatically by analyzing your actual API traffic.
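Of these strategies, caching is the simplest to sketch: memoize responses keyed by a hash of the prompt, so identical prompts never hit the paid API twice. This is a minimal in-memory sketch, not a production cache; `call_llm` is a hypothetical stand-in for your provider's client call.

```python
import hashlib

_cache: dict[str, str] = {}

def cached_completion(prompt: str, call_llm) -> str:
    """Return a cached response if this exact prompt was seen before."""
    key = hashlib.sha256(prompt.encode()).hexdigest()
    if key not in _cache:
        _cache[key] = call_llm(prompt)  # only pay for novel prompts
    return _cache[key]
```

A real setup would also bound the cache size and expire stale entries, but even this naive version eliminates spend on exact-duplicate prompts.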
What is the difference between input and output token pricing?
Input tokens are what you send to the model (your prompt, system instructions, context). Output tokens are what the model generates (its response). Output tokens are typically 2-5x more expensive because they require more computation. For example, GPT-4o charges $2.50 per million input tokens but $10.00 per million output tokens.