Workload Reuse Score (WRS)
A 0-100 score of how much a workload can benefit from cache reuse, built from opportunity, recurrence, locality, latency sensitivity, continuity, and payload redundancy.
WRS replaces the crude "long prompts mean BYOC" heuristic. It combines opportunity ratio, recurrence, retention locality, TTFT (time-to-first-token) sensitivity, session continuity, and payload redundancy into one comparable number.
A high WRS means the workload has reuse potential - not that it needs self-hosting. Deployment readiness is scored separately so infrastructure decisions follow evidence, not prompt length.
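As a rough illustration of how six signals could fold into one 0-100 number, here is a minimal sketch. It assumes each signal has already been normalized to the range 0.0-1.0 and that the score is a weighted mean. The equal weights, the ReuseSignals class, and the workload_reuse_score function are hypothetical; the actual WRS weighting is not described here.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ReuseSignals:
    """Six component signals, each assumed normalized to 0.0-1.0."""
    opportunity_ratio: float
    recurrence: float
    retention_locality: float
    ttft_sensitivity: float
    session_continuity: float
    payload_redundancy: float


# Hypothetical equal weights; the real weighting is not given in the text.
EQUAL_WEIGHTS = {f.name: 1.0 for f in fields(ReuseSignals)}


def workload_reuse_score(signals, weights=EQUAL_WEIGHTS):
    """Combine the six signals into one 0-100 score via a weighted mean."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = 0.0
    for name, weight in weights.items():
        value = getattr(signals, name)
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
        weighted += weight * value
    return round(100.0 * weighted / total_weight, 1)


# Made-up signal values for illustration only.
example = ReuseSignals(
    opportunity_ratio=0.8,
    recurrence=0.6,
    retention_locality=0.5,
    ttft_sensitivity=0.9,
    session_continuity=0.4,
    payload_redundancy=0.7,
)
score = workload_reuse_score(example)
print(score)  # 65.0
```

Prompt length appears nowhere in the inputs: a short-prompt workload with high recurrence and locality can outscore a long-prompt workload that never repeats.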
Keep reading.
Reuse opportunity
The maximum share of input tokens that could be served from cache, independent of whether they actually were.
Cache capture rate
Realized reused tokens divided by candidate reusable tokens - how much of the available reuse a provider actually delivered.
Retention locality
The share of repeated requests that recur within the provider or runtime cache-retention window, so the cache is still warm.
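The three definitions above reduce to simple ratios. The sketch below shows one way to compute them from token counts and request timing. The function names, the sample numbers, and the 300-second retention window are assumptions for illustration; actual retention windows vary by provider and runtime.

```python
def reuse_opportunity(candidate_reusable_tokens, total_input_tokens):
    """Share of input tokens that could have been served from cache."""
    if total_input_tokens == 0:
        return 0.0
    return candidate_reusable_tokens / total_input_tokens


def cache_capture_rate(realized_reused_tokens, candidate_reusable_tokens):
    """Realized reused tokens divided by candidate reusable tokens."""
    if candidate_reusable_tokens == 0:
        return 0.0
    return realized_reused_tokens / candidate_reusable_tokens


def retention_locality(repeat_gaps_seconds, retention_window_seconds):
    """Share of repeated requests that recur inside the retention window.

    Each gap is the time since the previous request with the same
    reusable content.
    """
    if not repeat_gaps_seconds:
        return 0.0
    warm = sum(1 for gap in repeat_gaps_seconds
               if gap <= retention_window_seconds)
    return warm / len(repeat_gaps_seconds)


# Made-up numbers for illustration only.
opportunity = reuse_opportunity(6_000, 10_000)           # 0.6
capture = cache_capture_rate(4_500, 6_000)               # 0.75
locality = retention_locality([30, 120, 900, 4000], 300)  # 0.5
print(opportunity, capture, locality)
```

In this example, 60% of input tokens were reusable, the provider delivered 75% of that potential, and half of the repeats arrived while the cache was still warm.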
See it in practice.
Definitions are useful; measurement is better. Run a diagnostic on your own workload.