Provider · context caching

xAI

Grok 4 and Grok-3 Mini, context caching, cheap fast frontier.

xAI caching guide Browse all models

−75%

Cache-read discount

No batch tier

Models on Zumik

Yes

BYOK supported

How caching works here

Grok models reuse a cached context prefix when consecutive requests share it. There is no async batch tier today, so cost control depends on cache hits and routing the cheap Grok-3 Mini where quality allows.

What Zumik sees

Without a batch lane, xAI cost discipline lives entirely in alias routing and reuse. Zumik leans on Grok-3 Mini for auto.fast and reserves Grok 4 for auto.best to keep blended cost in range.

Pitfall

Treating xAI like OpenAI for background jobs - there is no 50% batch discount to fall back on, so non-interactive work should usually route elsewhere.

Profile

Min cache size1,024 tok

RetentionShort idle window

Service tiersstandard

BYOCManaged only

Models

xAI models in the catalog.

Model	Context	Input	Output	Cache read	Reuse-adj
Grok Build 0.1	256K	$1.00	$2.00	$0.20 −80%	$0.92
Grok 4.20 (Non-Reasoning)	1M	$1.25	$2.50	$0.20 −84%	$1.13
Grok 4.20 (Reasoning)	1M	$1.25	$2.50	$0.20 −84%	$1.13
Grok 4.20 Multi-Agent	1M	$1.25	$2.50	$0.20 −84%	$1.13
Grok 4.3	1M	$1.25	$2.50	$0.20 −84%	$1.13

Frequently asked

xAI, answered.

How does xAI prompt caching work?

What discount does xAI caching give?

Cache reads on xAI are about 75% cheaper than list input price.

Does xAI support BYOK on Zumik?

Yes. You can bring your own xAI key, and provider-native caching, batch, and service tiers stay active under your account.

What is the common xAI caching mistake?

Treating xAI like OpenAI for background jobs - there is no 50% batch discount to fall back on, so non-interactive work should usually route elsewhere.

Route xAI the smart way.

Capture xAI's 75% cache-read discount automatically through Zumik.

xAI caching guide Compare providers