Get instant access to the calculator
Enter your work email and start calculating your real AI unit economics.
- Real gross margin per pricing tier
- Power user break-even threshold
- LLM intensity ratio health check
- Optimization impact (routing + caching)
No spam. Unsubscribe anytime.
Pricing Tiers
Add each tier. Set price to $0 for free tiers.
Model & Tokens
Primary model and average tokens per query.
Other Costs / User
Non-LLM per-user monthly costs.
What-If Scenario
Toggle to see optimized vs. current margins.
Margin by Tier
| Tier | Price | Users | LLM/User | Profit | Margin |
|---|---|---|---|---|---|
Health Check
Power User Break-Even
Max queries/day before margin goes negative
Want your real numbers?
Preto tracks LLM costs per feature, per user, per tier. One URL change.
See Real Unit Economics — Free. Up to 10K requests. No credit card.
How to Calculate AI SaaS Unit Economics
Traditional SaaS gross margin calculations assume infrastructure costs are fixed and scale sublinearly with users. AI SaaS breaks this assumption. LLM API costs are variable COGS that scale directly with user engagement — when a user makes more queries, the cost goes up proportionally.
Why Blended Margins Are Misleading
A single "gross margin" number across all pricing tiers hides the real story. Your free tier may have the same usage patterns as your paid tier but generates zero revenue. Your enterprise tier likely has higher per-user costs but much higher pricing. Calculating each tier separately often reveals that one tier is subsidizing another.
The LLM Intensity Ratio
Track your total LLM API spend as a percentage of MRR every month. Below 20% is healthy. 20-30% needs monitoring. Above 30% requires active cost management. Above 50% is a crisis that will show up in your next board meeting.
The Power User Problem
In most AI SaaS products, the top 10% of users by feature usage account for 40-60% of LLM costs. If those power users are on your lowest pricing tier — which they often are — you have a systematic negative-margin cohort. This calculator helps you find the break-even query volume for each tier.
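The break-even threshold can be sketched like this (function and parameter names are ours; the example numbers are illustrative):

```python
def break_even_queries_per_day(price, other_cost_per_user, cost_per_query):
    """Max daily queries a user can make before their margin goes negative."""
    monthly_llm_budget = price - other_cost_per_user  # what's left for LLM COGS
    return monthly_llm_budget / (cost_per_query * 30)

# e.g. a $29 tier, $1.20/month non-LLM costs, ~$0.01 per query
limit = break_even_queries_per_day(29, 1.20, 0.01)  # ~93 queries/day
```

A power user on this hypothetical tier averaging 150 queries a day is a guaranteed negative-margin account.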
From Estimates to Real Data
This calculator gives you directional unit economics. For real per-user, per-feature cost attribution — the numbers you'd put in a board deck or investor memo — try Preto free. One URL change, and you'll see exactly how LLM costs map to your revenue tiers.
Frequently Asked Questions
What is the real gross margin for AI SaaS products?
Most AI SaaS products show 70-80% gross margins in pitch decks, but after properly classifying LLM API costs as variable COGS, the real number is often 40-55% before optimization. With model routing and caching applied, teams typically reach 60-68%. The gap depends on your model choice, query volume per user, and whether you have agentic workflows multiplying calls.
How do I calculate LLM cost per user?
Multiply daily queries per user by 30 (for monthly), then multiply by average tokens per query. Divide by 1 million and multiply by your model's per-million-token price. For agentic workflows, multiply the query count by calls per action (typically 5-15). The result is your LLM COGS per user per month.
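The steps above, as a short sketch (the function name is ours; the example figures are illustrative):

```python
def llm_cost_per_user_per_month(queries_per_day, avg_tokens_per_query,
                                price_per_million_tokens, calls_per_action=1):
    """LLM COGS per user per month, following the formula above."""
    monthly_queries = queries_per_day * 30 * calls_per_action
    monthly_tokens = monthly_queries * avg_tokens_per_query
    return monthly_tokens / 1_000_000 * price_per_million_tokens

# 10 queries/day at 2,000 tokens on a $3/M-token model -> $1.80/user/month
chat_cost = llm_cost_per_user_per_month(10, 2000, 3.0)
# The same usage driving an agentic workflow at 10 calls per action -> $18.00
agent_cost = llm_cost_per_user_per_month(10, 2000, 3.0, calls_per_action=10)
```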
What is the LLM intensity ratio?
The LLM intensity ratio is your total monthly LLM API spend divided by your MRR. It tells you how much of each revenue dollar goes to AI costs. Below 20% is comfortable, 20-30% needs monitoring, above 30% requires active cost management, and above 50% is a crisis. Track it monthly — when it grows faster than MRR, your margin is compressing in real time.
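A minimal sketch of the ratio and its health bands (the function name and band labels are ours):

```python
def llm_intensity(monthly_llm_spend, mrr):
    """LLM intensity ratio plus the health band it falls into."""
    ratio = monthly_llm_spend / mrr
    if ratio < 0.20:
        band = "healthy"
    elif ratio < 0.30:
        band = "monitor"
    elif ratio < 0.50:
        band = "active cost management"
    else:
        band = "crisis"
    return ratio, band

ratio, band = llm_intensity(12_000, 50_000)  # 0.24 -> "monitor"
```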
Should LLM API costs be classified as COGS?
Yes. LLM API costs should be classified as variable COGS, not fixed infrastructure. Unlike traditional SaaS hosting costs that scale sublinearly with users, LLM costs scale directly and linearly with user engagement. This distinction matters for accurate gross margin calculation and for investor due diligence.
How much can model routing and caching reduce costs?
Model routing typically saves 20-40% by sending simple tasks (classification, extraction, yes/no questions) to cheap models at $0.10-0.60/M tokens instead of frontier models at $2-15/M tokens. Prompt caching saves another 15-25% on duplicate requests. Combined, most teams see 40-60% total reduction without any quality loss on the tasks that matter.
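One subtlety worth making explicit: the two savings compound multiplicatively on the remaining spend rather than simply adding. A sketch using percentages picked from the ranges above (names and figures are illustrative):

```python
def optimized_spend(base_spend, routing_savings, caching_savings):
    """Apply routing savings first, then caching savings to what remains."""
    return base_spend * (1 - routing_savings) * (1 - caching_savings)

# 30% from routing and 20% from caching -> 44% total reduction, not 50%
spend = optimized_spend(100.0, 0.30, 0.20)  # ~$56 of every original $100
```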