DeepSeek-V4-Pro

Open-weights reasoning model with frontier-adjacent quality at open-weights prices. A common BYOC candidate when volume is concentrated.

Input / 1M tokens: $1.74
Output / 1M tokens: $3.48
Cache read / 1M tokens: $0.14 (−92%)
Context window: 1M tokens

At a glance.

Provider: Fireworks AI
Family: deepseek_v4
Released: 2026-04
License: Open weights
Context window: 1M tokens
Max output: 384K tokens
Parameters: 1600B
Modalities: text
Tool calling: Yes
Reasoning mode: Yes
Caching: automatic
Batch discount: No batch tier

What reuse looks like here.

DeepSeek-V4-Pro · agent traffic (per request)
Total input: 100%
Candidate reuse: 59%
Realized reuse: 47%
Capture rate: 80%
Warm TTFT: 130ms (−63% vs cold)
Output tokens / sec: 270
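
The capture rate above is consistent with realized reuse divided by candidate reuse (47% / 59% ≈ 80%). The page does not state that definition explicitly, so treat the sketch below as an assumption about how the figure is derived; the function name `capture_rate` is hypothetical and not part of any Zumik API.

```python
def capture_rate(candidate_reuse: float, realized_reuse: float) -> float:
    """Share of reusable input that was actually served from cache.

    Assumed definition: realized reuse / candidate reuse, with both
    arguments expressed as fractions of total input tokens.
    """
    if candidate_reuse <= 0:
        raise ValueError("candidate_reuse must be positive")
    return realized_reuse / candidate_reuse


# Figures from the agent-traffic panel: 59% candidate, 47% realized.
rate = capture_rate(candidate_reuse=0.59, realized_reuse=0.47)
print(f"Capture rate: {rate:.0%}")
```

A capture rate below 100% means some input that could have hit the cache was billed at the full input rate instead.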
Reuse economics

What you actually pay once caching works.

At a typical 55% prefix reuse, a million input tokens on DeepSeek-V4-Pro effectively costs $0.86 instead of $1.74: 45% of the tokens bill at the $1.74 input rate and 55% at the $0.14 cache-read rate. With a 25% output share, the blended price comes to roughly $1.52 per million tokens. There is no batch tier, so cost control here leans on caching and routing.
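
The arithmetic behind those figures can be sketched as a weighted average. The prices come from this page; the function names are hypothetical, and the formula assumes every reused token bills at the cache-read rate and every other input token at the standard input rate.

```python
INPUT_PRICE = 1.74       # USD per 1M input tokens (from this page)
OUTPUT_PRICE = 3.48      # USD per 1M output tokens (from this page)
CACHE_READ_PRICE = 0.14  # USD per 1M cached input tokens (from this page)


def effective_input_price(reuse: float) -> float:
    """Average price per 1M input tokens when `reuse` of them hit the cache."""
    return (1 - reuse) * INPUT_PRICE + reuse * CACHE_READ_PRICE


def blended_price(reuse: float, output_share: float) -> float:
    """Average price per 1M tokens across input and output.

    `output_share` is the fraction of all tokens that are output tokens.
    """
    input_part = (1 - output_share) * effective_input_price(reuse)
    output_part = output_share * OUTPUT_PRICE
    return input_part + output_part


print(f"Effective input: ${effective_input_price(0.55):.2f} per 1M tokens")
print(f"Blended:         ${blended_price(0.55, 0.25):.3f} per 1M tokens")
print(f"Cache discount:  {1 - CACHE_READ_PRICE / INPUT_PRICE:.0%}")
```

Swapping in your own reuse fraction and output share gives a rough estimate for a specific workload; it ignores anything not listed on this page, such as minimum cacheable prefix lengths.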

Best for: coding, reasoning, cost-sensitive

Routes through these aliases: code.balanced, reasoning.best, auto.cheapest

Same OpenAI client, this model.

```python
from openai import OpenAI

client = OpenAI(base_url="https://api.zumik.ai/v1", api_key="zk_live_...")

r = client.responses.create(
    model="deepseek-v4-pro",          # or an alias like code.balanced
    input="Draft a fix for the failing test.",
)
print(r.usage.input_tokens_cached)   # confirm reuse
```

DeepSeek-V4-Pro, answered.

How much does DeepSeek-V4-Pro cost?

DeepSeek-V4-Pro is $1.74 per million input tokens and $3.48 per million output tokens through Zumik. Cache reads are $0.14 per million, a 92% discount on input.

What is DeepSeek-V4-Pro's context window?

DeepSeek-V4-Pro supports a 1M-token context window with up to 384K output tokens.

Does DeepSeek-V4-Pro support prompt caching?

Yes. Fireworks AI applies automatic prompt caching on both serverless and dedicated deployments. In the Zumik corpus, DeepSeek-V4-Pro shows a median cache capture rate of 80% on agent workloads.

Which Zumik aliases route to DeepSeek-V4-Pro?

DeepSeek-V4-Pro is a candidate for the code.balanced, reasoning.best, and auto.cheapest aliases. It is selected when it wins under the current routing policy.

Run DeepSeek-V4-Pro with reuse measured.

Point an OpenAI client at Zumik and see exactly how much of this model's input you are reusing.