Best intelligence per dollar
Models ranked by quality index divided by reuse-adjusted blended cost, assuming 50% reuse. The ranking rewards models that are both capable and cheap to run once caching is working.
| # | Model | Provider | Quality / $ | List blended | Cache discount |
|---|---|---|---|---|---|
| 1 | OpenAI gpt-oss-120b | Fireworks | 372.9 | $0.26 | −90% |
| 2 | GPT-5 Mini | OpenAI | 136.0 | $0.69 | −90% |
| 3 | Grok 4.3 | xAI | 78.7 | $1.56 | −84% |
| 4 | Kimi K2.6 | Fireworks | 60.0 | $1.71 | −83% |
| 5 | DeepSeek-V4-Pro | Fireworks | 55.8 | $2.17 | −92% |
| 6 | GLM 5.1 | Fireworks | 49.9 | $2.15 | −81% |
| 7 | Claude Haiku 4.5 | Anthropic | 48.1 | $2.00 | −90% |
| 8 | Gemini 3.5 Flash | | 30.0 | $3.38 | −90% |
| 9 | Gemini 3.1 Pro Preview | | 24.6 | $4.50 | −90% |
| 10 | Claude Sonnet 4.6 | Anthropic | 18.0 | $6.00 | −90% |
| 11 | Claude Opus 4.8 | Anthropic | 11.7 | $10.00 | −90% |
| 12 | Claude Opus 4.7 | Anthropic | 11.4 | $10.00 | −90% |
| 13 | GPT-5.5 | OpenAI | 10.0 | $11.25 | −90% |
| 14 | Claude Fable 5 | Anthropic | 5.9 | $20.00 | −90% |
| 15 | GPT-5.5 Pro | OpenAI | 1.4 | $67.50 | −0% |
Method
Quality / $ = intelligence ÷ reuseAdjustedBlended(model, 0.5, 0.25), where 0.5 is the 50% reuse rate.
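The page does not show how `reuseAdjustedBlended` is implemented, so the following is only a minimal sketch of one plausible reading. It assumes the second argument is the share of input tokens served from cache and the third is the share of all tokens that are output. The source does not state what 0.25 means. The `ModelPricing` type, its field names, the per-million-token unit, and the example values are all hypothetical.

```typescript
// Hypothetical pricing record; field names and units are assumptions, not from the source.
interface ModelPricing {
  inputPerMTok: number;  // list price for uncached input, $ per million tokens (assumed unit)
  outputPerMTok: number; // list price for output, $ per million tokens (assumed unit)
  cacheDiscount: number; // fractional discount on cached input, e.g. 0.9 for "−90%"
}

// Assumed interpretation of the arguments:
//   reuse       = share of input tokens served from cache
//   outputShare = share of all tokens that are output
function reuseAdjustedBlended(m: ModelPricing, reuse: number, outputShare: number): number {
  const inputShare = 1 - outputShare;
  // Cached input is billed at (1 - discount) of list price; the rest at list price.
  const effectiveInput = m.inputPerMTok * (1 - reuse * m.cacheDiscount);
  return inputShare * effectiveInput + outputShare * m.outputPerMTok;
}

// Quality per dollar as described in the Method line: intelligence over adjusted cost.
function qualityPerDollar(intelligence: number, m: ModelPricing): number {
  return intelligence / reuseAdjustedBlended(m, 0.5, 0.25);
}

// Made-up example values, for illustration only.
const example: ModelPricing = { inputPerMTok: 1.0, outputPerMTok: 5.0, cacheDiscount: 0.9 };
const listBlended = reuseAdjustedBlended(example, 0, 0.25); // blended cost with no reuse
const adjusted = reuseAdjustedBlended(example, 0.5, 0.25);  // blended cost at 50% reuse
const score = qualityPerDollar(80, example);

console.log(listBlended.toFixed(2), adjusted.toFixed(4), score.toFixed(1));
```

Under these assumptions, a larger cache discount lowers only the input part of the cost, so models with expensive output gain less from caching than the discount alone suggests.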
Rank the same models differently.
Cheapest models for cached agent workloads
Which model is cheapest once a typical agent prefix is served from cache?
Fastest time-to-first-token (warm prefix)
Which model responds fastest when the prefix is already cached?
Best models for coding agents
Which models balance code quality with reuse economics?
Highest measured cache capture
Which models convert reuse opportunity into billed savings most reliably?
Rank models for your workload.
A diagnostic measures your real reuse and re-ranks the catalog for the way you actually call models.