Provider · automatic caching

OpenAI

GPT-5 family, automatic prefix caching, flex/scale tiers.

OpenAI caching guide Browse all models

−75%

Cache-read discount

50%

Batch discount

Models on Zumik

Yes

BYOK supported

How caching works here

Caching is automatic for prompts at or above 1,024 tokens. The longest matching prefix is reused, billed at the cache-read rate, and reported back as cached tokens in the usage object. Keeping stable content at the front of the request is what makes the prefix match.

What Zumik sees

Across our corpus, OpenAI returns provider-reported cached-token counts on most eligible requests, which gives Zumik the strongest evidence level (provider_reported) for capture without any runtime instrumentation.

Pitfall

Injecting a timestamp, request id, or per-call system note near the top of the prompt resets the prefix and silently drops the hit rate to near zero.

Profile

Min cache size1,024 tok

RetentionMinutes idle, up to ~24h on extended retention

Service tiersflex, default, scale, priority

BYOCManaged only

Models

OpenAI models in the catalog.

Model	Context	Input	Output	Cache read	Reuse-adj
GPT-5 Nano	400K	$0.05	$0.40	$0.01 −90%	$0.12
GPT-5 Nano-2025-08-07	400K	$0.05	$0.40	$0.01 −90%	$0.12
GPT-4.1 Nano	1M	$0.10	$0.40	$0.03 −75%	$0.14
GPT-4.1 Nano-2025-04-14	1M	$0.10	$0.40	$0.03 −75%	$0.14
GPT-4o Mini	128K	$0.15	$0.60	$0.07 −50%	$0.23
GPT-4o Mini-2024-07-18	128K	$0.15	$0.60	$0.07 −50%	$0.23
GPT-5.4 Nano	400K	$0.20	$1.25	$0.02 −90%	$0.39
GPT-5.4 Nano-2026-03-17	400K	$0.20	$1.25	$0.02 −90%	$0.39
GPT-5 Mini	400K	$0.25	$2.00	$0.03 −90%	$0.59
GPT-5 Mini-2025-08-07	400K	$0.25	$2.00	$0.03 −90%	$0.59
GPT-4.1 Mini	1M	$0.40	$1.60	$0.10 −75%	$0.58
GPT-4.1 Mini-2025-04-14	1M	$0.40	$1.60	$0.10 −75%	$0.58
GPT-3.5-turbo	16K	$0.50	$1.50	$0.00 −100%	$0.54
GPT-3.5-turbo-0125	16K	$0.50	$1.50	$0.00 −100%	$0.54
GPT-3.5-turbo-1106	16K	$0.50	$1.50	$0.00 −100%	$0.54
GPT-5.4 Mini	400K	$0.75	$4.50	$0.07 −90%	$1.41
GPT-5.4 Mini-2026-03-17	400K	$0.75	$4.50	$0.07 −90%	$1.41
o3 Mini	200K	$1.10	$4.40	$0.55 −50%	$1.70
o3 Mini-2025-01-31	200K	$1.10	$4.40	$0.55 −50%	$1.70
o4 Mini	200K	$1.10	$4.40	$0.28 −75%	$1.58
o4 Mini-2025-04-16	200K	$1.10	$4.40	$0.28 −75%	$1.58
GPT-5	400K	$1.25	$10.00	$0.13 −90%	$2.97
GPT-5 Chat	400K	$1.25	$10.00	$0.13 −90%	$2.97
GPT-5 Codex	400K	$1.25	$10.00	$0.13 −90%	$2.97
GPT-5-2025-08-07	400K	$1.25	$10.00	$0.13 −90%	$2.97
GPT-5.1	400K	$1.25	$10.00	$0.13 −90%	$2.97
GPT-5.1 Chat	128K	$1.25	$10.00	$0.13 −90%	$2.97
GPT-5.1 Codex	400K	$1.25	$10.00	$0.13 −90%	$2.97
GPT-5.1 Codex Max	400K	$1.25	$10.00	$0.13 −90%	$2.97
GPT-5.1-2025-11-13	400K	$1.25	$10.00	$0.13 −90%	$2.97
GPT-5.2	400K	$1.75	$14.00	$0.17 −90%	$4.16
GPT-5.2 Chat	128K	$1.75	$14.00	$0.17 −90%	$4.16
GPT-5.2 Codex	400K	$1.75	$14.00	$0.17 −90%	$4.16
GPT-5.2-2025-12-11	400K	$1.75	$14.00	$0.17 −90%	$4.16
GPT-5.3 Chat	128K	$1.75	$14.00	$0.17 −90%	$4.16
GPT-5.3 Codex	400K	$1.75	$14.00	$0.17 −90%	$4.16
GPT-4.1	1M	$2.00	$8.00	$0.50 −75%	$2.88
GPT-4.1-2025-04-14	1M	$2.00	$8.00	$0.50 −75%	$2.88
o3	200K	$2.00	$8.00	$0.50 −75%	$2.88
o3-2025-04-16	200K	$2.00	$8.00	$0.50 −75%	$2.88
o4 Mini-deep-research	200K	$2.00	$8.00	$0.50 −75%	$2.88
GPT-4o	128K	$2.50	$10.00	$1.25 −50%	$3.86
GPT-4o-2024-08-06	128K	$2.50	$10.00	$1.25 −50%	$3.86
GPT-4o-2024-11-20	128K	$2.50	$10.00	$1.25 −50%	$3.86
GPT-5.4	1.1M	$2.50	$15.00	$0.25 −90%	$4.70
GPT-5.4-2026-03-05	1.1M	$2.50	$15.00	$0.25 −90%	$4.70
GPT-4o-2024-05-13	128K	$5.00	$15.00	$5.00	$7.50
GPT-5.5	1.1M	$5.00	$30.00	$0.50 −90%	$9.39
GPT-5.5-2026-04-23	1.1M	$5.00	$30.00	$0.50 −90%	$9.39
GPT-4-turbo	128K	$10.00	$30.00	$10.00	$15.00
GPT-4-turbo-2024-04-09	128K	$10.00	$30.00	$10.00	$15.00
GPT-5 Pro	400K	$15.00	$120.00	$15.00	$41.25
GPT-5 Pro-2025-10-06	400K	$15.00	$120.00	$15.00	$41.25
o1	200K	$15.00	$60.00	$7.50 −50%	$23.16
o1-2024-12-17	200K	$15.00	$60.00	$7.50 −50%	$23.16
GPT-5.2 Pro	400K	$21.00	$168.00	$21.00	$57.75
GPT-5.2 Pro-2025-12-11	400K	$21.00	$168.00	$21.00	$57.75
GPT-4	8K	$30.00	$60.00	$30.00	$37.50
GPT-4-0613	8K	$30.00	$60.00	$30.00	$37.50
GPT-5.4 Pro	1.1M	$30.00	$180.00	$30.00	$67.50
GPT-5.4 Pro-2026-03-05	1.1M	$30.00	$180.00	$30.00	$67.50
GPT-5.5 Pro	1.1M	$30.00	$180.00	$30.00	$67.50
GPT-5.5 Pro-2026-04-23	1.1M	$30.00	$180.00	$30.00	$67.50
o1 Pro	200K	$150.00	$600.00	$150.00	$262.50
o1 Pro-2025-03-19	200K	$150.00	$600.00	$150.00	$262.50
GPT-3.5-turbo-16k	16K	—	—	—	—
GPT-3.5-turbo-instruct	4K	—	—	—	—
GPT-3.5-turbo-instruct-0914	4K	—	—	—	—

Frequently asked

OpenAI, answered.

How does OpenAI prompt caching work?

What discount does OpenAI caching give?

Cache reads on OpenAI are about 75% cheaper than list input price.

Does OpenAI support BYOK on Zumik?

Yes. You can bring your own OpenAI key, and provider-native caching, batch, and service tiers stay active under your account.

What is the common OpenAI caching mistake?

Injecting a timestamp, request id, or per-call system note near the top of the prompt resets the prefix and silently drops the hit rate to near zero.

Route OpenAI the smart way.

Capture OpenAI's 75% cache-read discount and batch tier automatically through Zumik.

OpenAI caching guide Compare providers