Drop-in OpenAI compatibility, then more.

The fastest way onto Zumik is the one that changes the least code. Point your existing OpenAI client at /v1, keep every request and response shape identical, and start measuring reuse the same day.

from openai import OpenAI

client = OpenAI(
    base_url="https://api.zumik.ai/v1",   # the only line that changes
    api_key="zk_live_...",
)

r = client.responses.create(
    model="code.balanced",                # or any OpenAI model id
    input="Summarize the failing tests.",
)
print(r.output_text)

A conformance-tested subset first, expanding deliberately.

Zumik ships a tested compatibility matrix rather than a long list of half-working routes. Phase 1 covers the surface most agents actually call.

Method  Path                          Purpose              Phase
POST    /v1/responses                 Create a response    Phase 1
GET     /v1/responses/{id}            Retrieve a response  Phase 1
DELETE  /v1/responses/{id}            Delete a response    Phase 1
POST    /v1/responses/{id}/cancel     Cancel a response    Phase 1
POST    /v1/responses/input_tokens    Count input tokens   Phase 1
POST    /v1/chat/completions          Chat Completions     Phase 1
POST    /v1/embeddings                Create embeddings    Phase 1
GET     /v1/models                    List models          Phase 1
POST    /v1/conversations             Conversation state   Phase 2
POST    /v1/batches                   Batch jobs           Phase 2
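
If you want a client-side guard before routing traffic, the matrix is small enough to encode directly. The sketch below is illustrative only: route_phase and the id resp_123 are hypothetical names, and the phase numbers are simply copied from the table above.

```python
import re

# Compatibility matrix copied from the table: (method, path template) -> phase.
MATRIX = {
    ("POST", "/v1/responses"): 1,
    ("GET", "/v1/responses/{id}"): 1,
    ("DELETE", "/v1/responses/{id}"): 1,
    ("POST", "/v1/responses/{id}/cancel"): 1,
    ("POST", "/v1/responses/input_tokens"): 1,
    ("POST", "/v1/chat/completions"): 1,
    ("POST", "/v1/embeddings"): 1,
    ("GET", "/v1/models"): 1,
    ("POST", "/v1/conversations"): 2,
    ("POST", "/v1/batches"): 2,
}


def _template_to_regex(template):
    # Turn each {placeholder} into a matcher for exactly one path segment.
    parts = re.split(r"\{[^}]+\}", template)
    return re.compile("^" + "[^/]+".join(re.escape(p) for p in parts) + "$")


def route_phase(method, path):
    """Return the phase (1 or 2) for a call, or None if the route is not listed."""
    method = method.upper()
    if (method, path) in MATRIX:  # literal routes win over {id} templates
        return MATRIX[(method, path)]
    for (m, template), phase in MATRIX.items():
        if m == method and "{" in template and _template_to_regex(template).match(path):
            return phase
    return None


print(route_phase("GET", "/v1/responses/resp_123"))  # hypothetical response id
```

A caller could refuse, or fall back to another provider, whenever route_phase returns None or a phase that is not live yet.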

Proprietary behavior rides on headers, never on the body.

This is the contract that keeps compatibility honest: the JSON body stays exactly OpenAI-shaped, and anything Zumik-specific travels in optional headers. Omit them all and you get plain OpenAI behavior.

Header                 Value                                            Effect
Agent-Hints-Ref        ah_…                                             Reference a stored agent-hints contract.
Agent-Session-Ref      ses_…                                            Attach the request to a stateful session.
Agent-Trace-Mode       metadata | tokenized | encrypted_full_fidelity   Pick how much the trace retains.
Agent-Idempotency-Key                                                   Make a mutating request safely retryable.
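
A small helper can keep these headers optional in practice. This is a sketch under assumptions: zumik_headers is a hypothetical function, the ses_example value is a placeholder rather than a real reference format, and the only validation shown is the trace-mode list from the table.

```python
# Trace modes exactly as listed in the header table.
TRACE_MODES = {"metadata", "tokenized", "encrypted_full_fidelity"}


def zumik_headers(hints_ref=None, session_ref=None, trace_mode=None, idempotency_key=None):
    """Build the optional Zumik headers; anything left unset is omitted entirely."""
    if trace_mode is not None and trace_mode not in TRACE_MODES:
        raise ValueError(f"unknown trace mode: {trace_mode!r}")
    candidates = {
        "Agent-Hints-Ref": hints_ref,
        "Agent-Session-Ref": session_ref,
        "Agent-Trace-Mode": trace_mode,
        "Agent-Idempotency-Key": idempotency_key,
    }
    return {name: value for name, value in candidates.items() if value is not None}


# "ses_example" is a placeholder, not a real session reference.
print(zumik_headers(session_ref="ses_example", trace_mode="metadata"))
```

With the OpenAI Python SDK, a dict like this can be passed per request through the extra_headers argument, which leaves the JSON body untouched. Calling zumik_headers() with no arguments returns an empty dict, which corresponds to the plain OpenAI path.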
Result

On the way back, QoS and trace metadata are returned as response headers (Agent-QoS-Target-Met, Agent-Trace-Id) so your OpenAI response parser never sees an unexpected field.
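
Reading those response headers needs nothing beyond a case-insensitive lookup. The sketch below assumes only the two header names given above; zumik_response_metadata is a hypothetical helper, the sample values are placeholders, and the values are returned as raw strings because their format is not specified here.

```python
def zumik_response_metadata(headers):
    """Pick the Zumik response headers out of a header mapping, case-insensitively."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "qos_target_met": lowered.get("agent-qos-target-met"),  # raw string, uninterpreted
        "trace_id": lowered.get("agent-trace-id"),
    }


# Placeholder header values for illustration only.
sample = {"content-type": "application/json", "agent-trace-id": "trc_example"}
print(zumik_response_metadata(sample))
```

With the OpenAI Python SDK, one way to reach the headers is the with_raw_response accessor (for example client.responses.with_raw_response.create(...)), which exposes .headers alongside .parse() for the usual typed object.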

Why move

Same code, new visibility.

The base-URL swap is a one-line change and immediately gives you reuse measurement, reproducible alias routing, and provider-native caching that is captured and reported back to you. When you want more, the native /v2 API adds sessions, branches, replay, and purge receipts.

usage object
{
  "usage": {
    "input_tokens": 18240,
    "input_tokens_cached": 11310,   // provider_reported
    "output_tokens": 642
  }
}
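
To turn the numbers above into a reuse figure, divide cached input tokens by total input tokens. This worked example assumes input_tokens_cached is a subset of input_tokens, which the page does not state explicitly; cached_input_ratio is a hypothetical helper.

```python
# Values copied from the usage object above.
usage = {"input_tokens": 18240, "input_tokens_cached": 11310, "output_tokens": 642}


def cached_input_ratio(usage):
    """Fraction of input tokens reported as cached (assumes cached is a subset of input)."""
    total = usage["input_tokens"]
    if total == 0:
        return 0.0
    return usage["input_tokens_cached"] / total


ratio = cached_input_ratio(usage)
uncached = usage["input_tokens"] - usage["input_tokens_cached"]
print(f"{ratio:.1%} of input tokens were cached; {uncached} were not")
# prints: 62.0% of input tokens were cached; 6930 were not
```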

OpenAI compatibility, answered.

Is the /v1 API really byte-for-byte OpenAI-compatible?

Yes, at the level a client can observe: request and response JSON shapes match upstream exactly. Zumik never adds proprietary fields, renames fields, or invents response events on /v1. A client that sends no Zumik headers receives valid OpenAI behavior.

What happens to OpenAI features Zumik cannot implement?

Rather than silently ignoring a meaningful field, Zumik either rejects the request with an OpenAI-compatible error or routes it to a backend that can implement it correctly.
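
Because a rejection arrives in OpenAI's error envelope (a top-level "error" object with message, type, param, and code), existing error handling should keep working. The sketch below parses that envelope; the sample body is hypothetical, and the exact type or message Zumik returns for an unsupported feature is not specified here.

```python
import json


def parse_openai_error(body):
    """Return (type, message, param) from an OpenAI-style error body, or None if it is not one."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return None
    return error.get("type"), error.get("message"), error.get("param")


# Hypothetical rejection body, shaped like an OpenAI error response.
sample_body = json.dumps({
    "error": {
        "message": "Unsupported parameter.",
        "type": "invalid_request_error",
        "param": "example_param",
        "code": None,
    }
})
print(parse_openai_error(sample_body))
```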

How do I use Zumik-specific behavior without breaking compatibility?

Through optional HTTP headers (Agent-Hints-Ref, Agent-Session-Ref, Agent-Trace-Mode, Agent-Idempotency-Key) or by moving to the native /v2 API. The /v1 body stays untouched.

Which SDKs work?

Any OpenAI SDK (Python, TypeScript, Go) and raw curl all work once you set the base URL to https://api.zumik.ai/v1 and use a Zumik key.
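
For the no-SDK path, the request is an ordinary HTTPS POST. The sketch below builds one with the Python standard library and does not send it. It assumes the key travels as a Bearer token, as it does with OpenAI; build_responses_request and YOUR_ZUMIK_KEY are placeholders.

```python
import json
import urllib.request


def build_responses_request(api_key, model, text, base_url="https://api.zumik.ai/v1"):
    """Build (but do not send) a POST to /responses with an OpenAI-shaped body."""
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/responses",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed: same auth scheme as OpenAI
            "Content-Type": "application/json",
        },
    )


req = build_responses_request("YOUR_ZUMIK_KEY", "code.balanced", "Summarize the failing tests.")
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would send it; the body is the same JSON an OpenAI SDK would produce for this call.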

Change one line. Start measuring.

Point your OpenAI client at api.zumik.ai/v1 and see what your agents actually reuse. $5/month plus pay-as-you-go credits.