Send background work to a batch tier
You have evaluations, backfills, or summaries that do not need to be interactive.
01
Tag the request as background QoS
Set the QoS class to background. Zumik routes eligible traffic to a batch tier and reports the outcome.
z.responses.create(
model="batch.standard",
input=long_eval_prompt,
qos={"class": "background"},
)Where to go from here.
Got it running? Measure it.
Once requests flow through Zumik, a diagnostic shows exactly what you are reusing.