Data · benchmark

The numbers nobody else publishes.

Most benchmarks rank model quality. This one measures prompt reuse: how much of the available reuse providers actually capture, how much latency a warm prefix removes, and which workloads have reusable structure at all.

2.4M eligible requests

Prompt-cache capture by provider

Of the reuse a workload could capture, how much do providers actually deliver?

See results

180k paired requests

TTFT savings from a warm prefix

How much faster is time-to-first-token when the prefix is already cached?

See results

610 workloads

Reuse opportunity by workload type

Which agent workloads actually have reusable structure?

See results

Note

These figures are aggregate and anonymized, and are presented as methodology plus results. They illustrate the method and the shape of the findings, not a guaranteed price sheet. Your own numbers come from a diagnostic on your own traffic.

Benchmark your own workload.

Run a diagnostic and get your capture rate, TTFT savings, and reuse score with provider-reported evidence.

Run a diagnostic

Workload trends