Glossary

Cache capture rate

Realized reused tokens divided by candidate reusable tokens - how much of the available reuse a provider actually delivered.

Prompt-cache capture benchmark

Capture rate is the honest metric. You can have huge opportunity and poor capture if breakpoints are misplaced, prefixes are volatile, or retention windows are too short.

Our prompt-cache-capture benchmark reports median capture by provider, with the evidence level attached so you know how trustworthy each number is.

Related terms

Keep reading.

Reuse opportunity

The maximum share of input tokens that could be served from cache, independent of whether they actually were.

Evidence level

A label for how trustworthy a reuse measurement is, from provider_reported down to trace_estimated and unknown.

Prompt caching

Reusing the computed state of a repeated prompt prefix so it is billed at a reduced cache-read rate instead of being recomputed.

See it in practice.

Definitions are useful; measurement is better. Run a diagnostic on your own workload.

Run a diagnostic Back to glossary