Cache capture rate
Realized reused tokens divided by candidate reusable tokens - how much of the available reuse a provider actually delivered.
Capture rate is the honest metric. You can have huge opportunity and poor capture if breakpoints are misplaced, prefixes are volatile, or retention windows are too short.
Our prompt-cache-capture benchmark reports median capture by provider, with the evidence level attached so you know how trustworthy each number is.
Keep reading.
Reuse opportunity
The maximum share of input tokens that could be served from cache, independent of whether they actually were.
Evidence level
A label for how trustworthy a reuse measurement is, from provider_reported down to trace_estimated and unknown.
Prompt caching
Reusing the computed state of a repeated prompt prefix so it is billed at a reduced cache-read rate instead of being recomputed.
See it in practice.
Definitions are useful; measurement is better. Run a diagnostic on your own workload.