The most expensive assumption in agent infrastructure is that a repeated prefix means saved compute. It does not. A prefix can repeat all day and still be recomputed every time if it expired from the cache, if a timestamp at the top broke the match, or if the provider never had a chance to warm it.
So we refuse to collapse the two ideas. Zumik reports opportunity and capture separately, and treats the difference as the thing worth fixing.
Opportunity is the ceiling
Opportunity ratio is candidate reusable tokens over total input tokens. It comes from prefix-family analysis: how often do equivalent reusable blocks recur in your traffic? A coding agent with a fixed tool registry and repo policy might show an opportunity ratio above 0.7. That ratio is the largest share of input prefill you could ever avoid recomputing.
It is a ceiling, not a promise. Nothing about opportunity guarantees a single cache hit.
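To make the ratio concrete, here is a minimal sketch of one way to estimate it. It is not Zumik's actual prefix-family analysis. It assumes each request has already been split into blocks with known token counts, and it treats a block as a candidate whenever identical content appeared in an earlier request, regardless of position. The `Block` type and the `opportunity_ratio` function are hypothetical names used only for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """One segment of a prompt (hypothetical type for this sketch)."""
    text: str
    tokens: int


def opportunity_ratio(requests: list[list[Block]]) -> float:
    """Candidate reusable tokens over total input tokens.

    Assumption: a block is a candidate if its exact content was seen
    in any earlier request. Real prefix-family analysis may group
    blocks that are equivalent rather than byte-identical.
    """
    seen: set[str] = set()
    candidate = 0
    total = 0
    for blocks in requests:
        digests = [
            hashlib.sha256(b.text.encode("utf-8")).hexdigest() for b in blocks
        ]
        for block, digest in zip(blocks, digests):
            total += block.tokens
            if digest in seen:
                candidate += block.tokens
        # Register this request's blocks only after scoring it, so a
        # block never counts as a repeat of itself.
        seen.update(digests)
    return candidate / total if total else 0.0
```

Note that the first request in any family contributes nothing to the numerator: something has to be seen once before it can recur.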
Capture is what you actually got
Capture rate is realized reused tokens over candidate reusable tokens. This is the honest number, and it is the one the invoice reflects. In our corpus, capture varies more by provider scheme than by anything else: explicit caching captures the most when breakpoints are placed well, implicit caching is convenient but swings widely, and short retention windows quietly cap everything.
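A rough sketch of the calculation, grouped by provider scheme, might look like the following. The record layout `(scheme, candidate_tokens, reused_tokens)` and the function name `capture_by_scheme` are assumptions made for illustration. In practice the reused count would come from whatever cached-token figure your provider reports in its usage data.

```python
from collections import defaultdict


def capture_by_scheme(
    records: list[tuple[str, int, int]],
) -> dict[str, float]:
    """Realized reused tokens over candidate reusable tokens, per scheme.

    Each record is (scheme, candidate_tokens, reused_tokens); this
    layout is hypothetical, not a provider or Zumik format.
    """
    sums: dict[str, list[int]] = defaultdict(lambda: [0, 0])
    for scheme, candidate, reused in records:
        sums[scheme][0] += candidate
        sums[scheme][1] += reused
    # Aggregate tokens before dividing so large requests weigh more
    # than small ones, instead of averaging per-request rates.
    return {
        scheme: (reused / candidate if candidate else 0.0)
        for scheme, (candidate, reused) in sums.items()
    }
```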
When opportunity is high and capture is low, the fix is almost never new hardware. It is prompt ordering, breakpoint placement, or retention locality.
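Prompt ordering is the easiest of the three to demonstrate. The sketch below assumes the cache matches only on an exact leading prefix, and it models a prompt as a list of blocks. The helper `shared_prefix_blocks` and the sample block contents are made up for illustration.

```python
def shared_prefix_blocks(a: list[str], b: list[str]) -> int:
    """Count the leading blocks two prompts have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


# Volatile content first: the timestamp differs, so nothing after it
# can match even though the tool registry and policy are identical.
volatile_first_1 = ["ts=10:00", "TOOL_REGISTRY", "REPO_POLICY", "question A"]
volatile_first_2 = ["ts=10:01", "TOOL_REGISTRY", "REPO_POLICY", "question B"]

# Stable content first: the same blocks, reordered.
stable_first_1 = ["TOOL_REGISTRY", "REPO_POLICY", "ts=10:00", "question A"]
stable_first_2 = ["TOOL_REGISTRY", "REPO_POLICY", "ts=10:01", "question B"]

print(shared_prefix_blocks(volatile_first_1, volatile_first_2))  # 0
print(shared_prefix_blocks(stable_first_1, stable_first_2))      # 2
```

Both pairs contain the same recurring blocks, so their opportunity is identical. Only the ordering decides whether any of it can be captured.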
The missed-opportunity gap
We name the difference explicitly: missed-opportunity tokens, the candidate reusable tokens that were never actually reused. That count is the single most actionable line in a diagnostic, because it tells you how much money is sitting on the table and, usually, why.
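As a back-of-the-envelope sketch, the gap and its rough dollar value can be computed as follows. The function names and both price parameters are hypothetical; substitute your provider's actual uncached and cached input rates. The sketch also ignores any surcharge a provider may apply to cache writes.

```python
def missed_opportunity_tokens(candidate: int, reused: int) -> int:
    """Candidate reusable tokens that were not actually reused."""
    return max(candidate - reused, 0)


def missed_savings(
    missed_tokens: int,
    uncached_price_per_mtok: float,
    cached_price_per_mtok: float,
) -> float:
    """Approximate money left on the table for the missed tokens.

    Prices are per million input tokens and are assumptions supplied
    by the caller. Cache-write surcharges are not modeled.
    """
    saving_per_token = (uncached_price_per_mtok - cached_price_per_mtok) / 1_000_000
    return missed_tokens * saving_per_token
```

For example, with 700,000 candidate tokens, 300,000 reused, and illustrative prices of 3.00 uncached and 0.30 cached per million tokens, the gap is 400,000 tokens and roughly 1.08 in the same currency.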
Close that gap with prompt construction first. Only after capture plateaus does it make sense to ask whether a dedicated lane or BYOC would move it further.