Retention locality
The share of repeated requests that recur within the provider or runtime cache-retention window, so the cache is still warm.
A prefix only helps if it recurs before the cache expires. Retention locality measures how much recurrence lands inside the relevant window, which may be minutes for some schemes and an hour or more for others.
High opportunity with poor locality is the classic trap: the structure repeats, but too slowly for the cache to still be warm when it does.
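As a rough illustration, retention locality can be estimated from a request log. The sketch below is not a provider API. It assumes each request is reduced to a hypothetical `(prefix_key, timestamp_seconds)` pair, and it assumes the retention window is measured from the previous request with the same prefix (as if every request refreshed the cache). Real schemes may count the window differently.

```python
def retention_locality(events, window_seconds):
    """Share of repeated requests whose prefix was last seen within the window.

    events: iterable of (prefix_key, timestamp_seconds) pairs (hypothetical shape).
    Assumes the window restarts at every request for the same prefix.
    """
    last_seen = {}
    repeats = 0
    warm = 0
    for key, ts in sorted(events, key=lambda e: e[1]):
        if key in last_seen:
            repeats += 1  # this prefix has been requested before
            if ts - last_seen[key] <= window_seconds:
                warm += 1  # recurrence landed inside the window
        last_seen[key] = ts
    return warm / repeats if repeats else 0.0


# Illustrative data only: two prefixes, three repeated requests.
events = [
    ("system-prompt-a", 0),
    ("system-prompt-a", 120),   # 2 minutes after the previous request
    ("system-prompt-a", 3000),  # 48 minutes after the previous request
    ("system-prompt-b", 50),
    ("system-prompt-b", 200),   # 2.5 minutes after the previous request
]

short_window = retention_locality(events, 5 * 60)   # 5-minute window
long_window = retention_locality(events, 60 * 60)   # 1-hour window
```

With this sample log, the same traffic scores differently under the two windows: the 48-minute gap is cold for the short window and warm for the long one, which is the trap described above.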
Keep reading.
Workload Reuse Score (WRS)
A 0-100 score of how much a workload can benefit from reuse, built from opportunity, recurrence, locality, latency sensitivity, continuity, and payload redundancy.
Prompt caching
Reusing the computed state of a repeated prompt prefix so it is billed at a reduced cache-read rate instead of being recomputed.
Cache capture rate
Realized reused tokens divided by candidate reusable tokens: how much of the available reuse a provider actually delivered.
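The ratio can be written as a one-line calculation. This is a minimal sketch: the function name and the token counts are hypothetical, and how "candidate reusable tokens" are counted is left to whatever measurement produced them.

```python
def cache_capture_rate(realized_reused_tokens, candidate_reusable_tokens):
    """Realized reused tokens divided by candidate reusable tokens."""
    if candidate_reusable_tokens <= 0:
        return 0.0  # nothing was reusable, so nothing could be captured
    return realized_reused_tokens / candidate_reusable_tokens


# Illustrative numbers only: 60,000 tokens could have been served from cache,
# and 42,000 actually were.
rate = cache_capture_rate(42_000, 60_000)
```

A rate well below 1 means reuse was available but not delivered, for example because the prefix recurred after the cache had expired.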