Prompt caching, captured - not just configured.

Every provider caches prompts, but the discount only lands if you place stable content correctly and respect each scheme's quirks. These guides explain how to capture the savings, and the small mistakes that quietly erase them.

Four ways providers cache.

Automatic

Caches any prefix over a threshold with no markup. Forgiving, capped discount. (OpenAI, Fireworks)

Explicit

You place cache_control breakpoints. Highest discount, write premium. (Anthropic)

Implicit

Discounts apply on recent prefix overlap, no breakpoints. Convenient, variable. (Gemini)

Context

Reuses a cached context across consecutive requests. (xAI)

Measure your real capture rate.

See how much of each scheme's discount your workload actually lands - with provider-reported evidence.