Highest measured cache capture

Median capture rate from the Zumik corpus: realized reused tokens over candidate reusable tokens. High capture means the savings actually land.
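As a rough illustration of the ratio described above, the sketch below computes a capture rate for one request and shows how it scales a nominal cache discount. The function names are hypothetical. The discount arithmetic assumes that reused tokens are billed at the discounted price and that candidate tokens which miss the cache are billed at full price; the source does not state this.

```python
def capture_rate(realized_reused_tokens: int, candidate_reusable_tokens: int) -> float:
    """Realized reused tokens over candidate reusable tokens, as a fraction."""
    if candidate_reusable_tokens <= 0:
        raise ValueError("capture rate is undefined without candidate reusable tokens")
    return realized_reused_tokens / candidate_reusable_tokens


def effective_discount(capture: float, cache_discount: float) -> float:
    """Discount actually realized on candidate reusable tokens.

    Assumption (not from the source): captured tokens get the full cache
    discount and uncaptured candidate tokens get none.
    """
    return capture * cache_discount


rate = capture_rate(930, 1000)          # 930 of 1000 candidate tokens reused
realized = effective_discount(rate, 0.90)
print(f"capture {rate:.0%}, effective discount {realized:.1%}")
```

Under these assumptions, a 90% cache discount with 93% capture is worth about 83.7% on the reusable portion of a prompt, while the same discount at 76% capture is worth about 68.4%.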

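| #  | Model                  | Provider  | Capture rate | List blended | Cache discount |
|----|------------------------|-----------|--------------|--------------|----------------|
| 1  | Claude Fable 5         | Anthropic | 93%          | $20.00       | 90%            |
| 2  | Claude Opus 4.8        | Anthropic | 93%          | $10.00       | 90%            |
| 3  | Claude Opus 4.7        | Anthropic | 92%          | $10.00       | 90%            |
| 4  | Claude Sonnet 4.6      | Anthropic | 91%          | $6.00        | 90%            |
| 5  | Claude Haiku 4.5       | Anthropic | 90%          | $2.00        | 90%            |
| 6  | GPT-5.5 Pro            | OpenAI    | 90%          | $67.50       | 0%             |
| 7  | GPT-5.5                | OpenAI    | 88%          | $11.25       | 90%            |
| 8  | GPT-5 Mini             | OpenAI    | 85%          | $0.69        | 90%            |
| 9  | GLM 5.1                | Fireworks | 82%          | $2.15        | 81%            |
| 10 | Gemini 3.1 Pro Preview | Google    | 82%          | $4.50        | 90%            |
| 11 | Kimi K2.6              | Fireworks | 81%          | $1.71        | 83%            |
| 12 | Gemini 3.5 Flash       | Google    | 81%          | $3.38        | 90%            |
| 13 | DeepSeek-V4-Pro        | Fireworks | 80%          | $2.17        | 92%            |
| 14 | OpenAI gpt-oss-120b    | Fireworks | 79%          | $0.26        | 90%            |
| 15 | Grok 4.3               | xAI       | 76%          | $1.56        | 84%            |

Method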

Median captureRatePct across eligible requests, with provider-reported evidence where available.
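The aggregation described above can be sketched as follows. This is a minimal illustration, not the site's implementation: the `RequestSample` type and its field names are hypothetical, and "eligible" is assumed here to mean a request with at least one candidate reusable token. Provider-reported evidence is not modeled.

```python
from dataclasses import dataclass
from statistics import median
from typing import Iterable, Optional


@dataclass
class RequestSample:
    # Hypothetical per-request record; field names are illustrative only.
    realized_reused_tokens: int
    candidate_reusable_tokens: int


def median_capture_rate_pct(samples: Iterable[RequestSample]) -> Optional[float]:
    """Median per-request capture rate, in percent, over eligible requests.

    Assumption: a request is eligible when it has candidate reusable tokens.
    Returns None when no request is eligible.
    """
    rates = [
        100.0 * s.realized_reused_tokens / s.candidate_reusable_tokens
        for s in samples
        if s.candidate_reusable_tokens > 0
    ]
    return median(rates) if rates else None


samples = [
    RequestSample(90, 100),
    RequestSample(50, 100),
    RequestSample(80, 100),
    RequestSample(0, 0),  # no reusable tokens, so excluded as ineligible
]
print(median_capture_rate_pct(samples))
```

Using the median rather than the mean keeps a handful of requests with near-zero reuse from dragging the reported figure down.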

Rank models for your workload.

A diagnostic measures your real reuse and re-ranks the catalog for the way you actually call models.