OpenAI · automatic caching

GPT-5.5

OpenAI flagship with automatic prefix caching above 1,024 tokens. A strong all-rounder and the default for auto.best when latency is not the constraint.

Call it through Zumik About OpenAI

$5.00

Input / 1M tokens

$30.00

Output / 1M tokens

$0.50

Cache read · −90%

1.1M

Context window

Specifications

At a glance.

Provider	OpenAI
Family	gpt-5
Released	2026-04
License	Proprietary
Context window	1.1M tokens
Max output	128K tokens
Modalities	text, image, pdf
Tool calling	Yes
Reasoning mode	Yes
Caching	automatic
Batch discount	50% off

Measured by Zumik

What reuse looks like here.

GPT-5.5 · agent trafficper request

Total input100%

Candidate reuse61%

Realized reuse54%

Capture rate88%

71/100

Speed score · 1424ms round-trip

138

Output tokens / sec

Reuse economics

What you actually pay once caching works.

At a typical 55% prefix reuse, a million input tokens on GPT-5.5 effectively costs $2.52 instead of $5.00 - blending to roughly $9.39 with a 25% output share. Background work drops a further 50% on the batch tier.

Estimate it for your workload

Best for

Routes through these aliases:

Call it

Same OpenAI client, this model.

python

from openai import OpenAI

client = OpenAI(base_url="https://api.zumik.ai/v1", api_key="zk_live_...")

r = client.responses.create(
    model="gpt-5-5",          # or an alias like auto.best
    input="Draft a fix for the failing test.",
)
print(r.usage.input_tokens_cached)   # confirm reuse

Compare

How GPT-5.5 stacks up.

Head-to-head

GPT-5.5 vs Claude Fable 5

The two default flagships for general and coding agents.

Head-to-head

GPT-5.5 vs Grok 4.3

Frontier general models with different cost-control levers.

Similar models

Other options for these workloads.

Frequently asked

GPT-5.5, answered.

How much does GPT-5.5 cost?

GPT-5.5 is $5.00 per million input tokens and $30.00 per million output tokens through Zumik. Cache reads are $0.50 per million, a 90% discount on input.

What is GPT-5.5's context window?

GPT-5.5 supports a 1.1M-token context window with up to 128K output tokens.

Does GPT-5.5 support prompt caching?

Yes. OpenAI uses Automatic prefix caching caching. In the Zumik corpus, GPT-5.5 shows a median cache capture of 88% on agent workloads.

Which Zumik aliases route to GPT-5.5?

GPT-5.5 is a candidate for the auto.best, auto.balanced, code.balanced aliases, selected when it wins under current routing policy.

Run GPT-5.5 with reuse measured.

Point an OpenAI client at Zumik and see exactly how much of this model's input you are reusing.

Migration quickstart Back to catalog