Drop-in OpenAI compatibility, then more.
The fastest way onto Zumik is the one that changes the least code. Point your existing OpenAI client at /v1, keep every request and response shape identical, and start measuring reuse the same day.
from openai import OpenAI
client = OpenAI(
base_url="https://api.zumik.ai/v1", # the only line that changes
api_key="zk_live_...",
)
r = client.responses.create(
model="code.balanced", # or any OpenAI model id
input="Summarize the failing tests.",
)
print(r.output_text)A conformance-tested subset first, expanding deliberately.
Zumik ships a tested compatibility matrix rather than a long list of half-working routes. Phase 1 covers the surface most agents actually call.
| Method | Path | Purpose | Phase |
|---|---|---|---|
| POST | /v1/responses | Create a response | Phase 1 |
| GET | /v1/responses/{id} | Retrieve a response | Phase 1 |
| DELETE | /v1/responses/{id} | Delete a response | Phase 1 |
| POST | /v1/responses/{id}/cancel | Cancel a response | Phase 1 |
| POST | /v1/responses/input_tokens | Count input tokens | Phase 1 |
| POST | /v1/chat/completions | Chat Completions | Phase 1 |
| POST | /v1/embeddings | Create embeddings | Phase 1 |
| GET | /v1/models | List models | Phase 1 |
| POST | /v1/conversations | Conversation state | Phase 2 |
| POST | /v1/batches | Batch jobs | Phase 2 |
Proprietary behavior rides on headers, never on the body.
This is the contract that keeps compatibility honest: the JSON body stays exactly OpenAI-shaped, and anything Zumik-specific travels as an optional header. Omit them all and you get plain OpenAI behavior.
| Header | Value | Effect |
|---|---|---|
| Agent-Hints-Ref | ah_… | Reference a stored agent-hints contract. |
| Agent-Session-Ref | ses_… | Attach the request to a stateful session. |
| Agent-Trace-Mode | metadata | tokenized | encrypted_full_fidelity | Pick how much the trace retains. |
| Agent-Idempotency-Key | … | Make a mutating request safely retryable. |
On the way back, QoS and trace metadata are returned as response headers (Agent-QoS-Target-Met, Agent-Trace-Id) so your OpenAI response parser never sees an unexpected field.
Same code, new visibility.
The base-URL swap costs nothing and immediately gives you reuse measurement, reproducible alias routing, and provider-native caching captured and reported. When you want more, the native /v2 API adds sessions, branches, replay, and purge receipts.
{
"usage": {
"input_tokens": 18240,
"input_tokens_cached": 11310, // provider_reported
"output_tokens": 642
}
}OpenAI compatibility, answered.
Is the /v1 API really byte-for-byte OpenAI-compatible?
Yes. Request and response JSON shapes match upstream exactly. Zumik never adds proprietary fields, renames fields, or invents response events on /v1. A client sending no Zumik headers receives valid OpenAI behavior.
What happens to OpenAI features Zumik cannot implement?
Rather than silently ignoring a meaningful field, Zumik either rejects the request with an OpenAI-compatible error or routes it to a backend that can implement it correctly.
How do I use Zumik-specific behavior without breaking compatibility?
Through optional HTTP headers (Agent-Hints-Ref, Agent-Session-Ref, Agent-Trace-Mode, Agent-Idempotency-Key) or by moving to the native /v2 API. The /v1 body stays untouched.
Which SDKs work?
Any OpenAI SDK, Python, TypeScript, Go, plus raw curl, works by setting the base URL to https://api.zumik.ai/v1 and using a Zumik key.
Change one line. Start measuring.
Point your OpenAI client at api.zumik.ai/v1 and see what your agents actually reuse. $5/month plus pay-as-you-go credits.