The numbers nobody else publishes.
Most benchmarks rank model quality. This one measures prompt reuse: how much providers actually capture, how much latency a warm prefix removes, and which workloads have reusable structure at all.
2.4M eligible requests
Prompt-cache capture by provider
Of the reuse a workload could capture, how much do providers actually deliver?
180k paired requests
TTFT savings from a warm prefix
How much faster is time-to-first-token when the prefix is already cached?
610 workloads
Reuse opportunity by workload type
Which agent workloads actually have reusable structure?
Note
These figures are aggregated and anonymized, and presented as methodology plus results. They illustrate the method and the shape of the findings, not a guaranteed price sheet. Your own numbers come from a diagnostic on your own traffic.
Benchmark your own workload.
Run a diagnostic and get your capture rate, TTFT savings, and reuse score with provider-reported evidence.
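For a rough sense of what two of these numbers mean, here is a minimal Python sketch, not the diagnostic itself. It assumes capture rate is cached tokens divided by reusable prefix tokens, and TTFT savings is the relative drop in median time-to-first-token between cold and warm requests (a simplification of a paired comparison). The record fields and function names are hypothetical, and the reuse score is left out because its definition is not given here.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Optional


@dataclass
class RequestRecord:
    # Hypothetical fields for illustration; not any provider's schema.
    reusable_prefix_tokens: int  # tokens that repeat an earlier prompt prefix
    cached_tokens: int           # tokens the provider reported as cache hits
    ttft_ms: float               # time to first token, in milliseconds
    warm: bool                   # True if the prefix was expected to be cached


def capture_rate(records: List[RequestRecord]) -> float:
    """Assumed definition: captured reusable tokens / eligible reusable tokens."""
    eligible = sum(r.reusable_prefix_tokens for r in records)
    if eligible == 0:
        return 0.0
    # Cap each request's cached count at what was actually reusable.
    captured = sum(min(r.cached_tokens, r.reusable_prefix_tokens) for r in records)
    return captured / eligible


def ttft_savings(records: List[RequestRecord]) -> Optional[float]:
    """Assumed definition: relative drop in median TTFT, cold versus warm."""
    cold = [r.ttft_ms for r in records if not r.warm]
    warm = [r.ttft_ms for r in records if r.warm]
    if not cold or not warm:
        return None  # need both groups to compare
    cold_median = median(cold)
    if cold_median <= 0:
        return None
    return (cold_median - median(warm)) / cold_median


# Made-up sample data, only to show the arithmetic.
sample = [
    RequestRecord(1000, 0, 800.0, False),
    RequestRecord(1000, 900, 400.0, True),
    RequestRecord(0, 0, 600.0, False),
    RequestRecord(2000, 1500, 300.0, True),
]

print(f"capture rate: {capture_rate(sample):.0%}")
print(f"TTFT savings: {ttft_savings(sample):.0%}")
```

With this sample, 2,400 of 4,000 reusable tokens are captured, and median TTFT falls from 700 ms cold to 350 ms warm. Real traffic would need provider-reported cache counters in place of the hand-written values.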