Leaderboard

Best models for coding agents

A composite score for repository-scale coding agents: 60% quality index and 40% measured reuse, limited to tool-capable models that teams actually wire into coding loops.


#	Model	Provider	Coding fit	List blended price	Cache discount
1	Claude Fable 5	Anthropic	85.2	$20.00	−90%
2	Claude Opus 4.8	Anthropic	84.2	$10.00	−90%
3	GPT-5.5	OpenAI	81.0	$11.25	−90%
4	Claude Sonnet 4.6	Anthropic	76.0	$6.00	−90%
5	GLM 5.1	Fireworks	73.8	$2.15	−81%
6	Kimi K2.6	Fireworks	72.2	$1.71	−83%
7	DeepSeek-V4-Pro	Fireworks	70.4	$2.17	−92%

Method

Coding fit = 0.6 × intelligence + 0.4 × reuseMedianPct, computed only for tool-capable models tagged for coding or agentic use.
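As a rough illustration of the method, the weighting and filter can be sketched in Python. This is a minimal sketch, not the leaderboard's actual implementation: the `Candidate` fields, the 0-100 scale for both inputs, the tag names, and every input value below are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record; field names and scales are assumed, not from the source."""
    name: str
    intelligence: float       # quality index, assumed to be on a 0-100 scale
    reuse_median_pct: float   # measured reuse (reuseMedianPct), assumed 0-100
    tool_capable: bool
    tags: frozenset


def coding_fit(c: Candidate) -> float:
    # The stated weighting: 60% quality index, 40% measured reuse.
    return 0.6 * c.intelligence + 0.4 * c.reuse_median_pct


def rank(candidates):
    # Keep only tool-capable models tagged for coding or agentic use,
    # then sort by the composite score, best first.
    eligible = [
        c for c in candidates
        if c.tool_capable and c.tags & {"coding", "agentic"}
    ]
    return sorted(eligible, key=coding_fit, reverse=True)


# Made-up inputs, chosen only to exercise the filter and the weighting.
candidates = [
    Candidate("model-a", 80.0, 90.0, True, frozenset({"coding"})),
    Candidate("model-b", 90.0, 60.0, True, frozenset({"agentic"})),
    Candidate("model-c", 95.0, 95.0, False, frozenset({"coding"})),  # not tool-capable
    Candidate("model-d", 85.0, 85.0, True, frozenset({"chat"})),     # wrong tag
]

for position, c in enumerate(rank(candidates), start=1):
    print(position, c.name, round(coding_fit(c), 1))
```

In this made-up data, model-a outranks model-b despite the lower quality index, because its higher measured reuse carries 40% of the score.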

Other lenses

Rank the same models differently.

Cheapest models for cached agent workloads

Which model is cheapest once a typical agent prefix is served from cache?
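One way to reason about this question is to apply each model's cache discount to the share of traffic served from cache. The sketch below takes the list blended prices and cache discounts from the table above for four of the models. The `cached_share` parameter, the 0.8 value, and the simplification of applying the discount directly to the blended price are assumptions for illustration, not the leaderboard's published method.

```python
def cached_price(list_blended: float, cache_discount: float, cached_share: float) -> float:
    """Effective price when `cached_share` of spend is billed at the discounted rate.

    Simplifying assumption: the discount applies to the blended price as a whole.
    """
    if not 0.0 <= cached_share <= 1.0:
        raise ValueError("cached_share must be between 0 and 1")
    return list_blended * (1.0 - cached_share * cache_discount)


# (list blended price, cache discount as a fraction), copied from the table.
rows = {
    "Claude Sonnet 4.6": (6.00, 0.90),
    "GLM 5.1": (2.15, 0.81),
    "Kimi K2.6": (1.71, 0.83),
    "DeepSeek-V4-Pro": (2.17, 0.92),
}


def cheapest(cached_share: float) -> str:
    # Name of the model with the lowest effective price at this cached share.
    return min(rows, key=lambda name: cached_price(*rows[name], cached_share))


# Hypothetical cached share of 0.8; real workloads will differ.
for name, (price, discount) in rows.items():
    print(f"{name}: {cached_price(price, discount, 0.8):.4f}")
```

Under these assumptions the ordering depends on the cached share: with no caching, Kimi K2.6 has the lowest price of the four, while at a cached share of 0.8 DeepSeek-V4-Pro edges ahead because of its deeper discount.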

Fastest time-to-first-token (warm prefix)

Which model responds fastest when the prefix is already cached?

Best intelligence per dollar

Which model gives the most quality per reuse-adjusted dollar?

Highest measured cache capture

Which models convert reuse opportunity into billed savings most reliably?

Rank models for your workload.

A diagnostic measures your real reuse and re-ranks the catalog for the way you actually call models.
