Streaming
Set stream: true to receive the response as Server-Sent Events. Tokens arrive chunk by chunk, and the stream ends with a data: [DONE] line. Provider fallback can still occur before the first token.
Stream with the SDK
stream = client.chat.completions.create(
    model="{{default_model}}",
    messages=[{"role": "user", "content": "Stream a haiku."}],
    max_tokens={{max_tokens}},
    stream=True,
)
# Print each content delta as it arrives; a chunk's delta may carry no content.
for chunk in stream:
    delta = chunk.choices[0].delta.content or ""
    print(delta, end="", flush=True)
Wire format
data: {"choices":[{"delta":{"content":"Hel"}}]}
data: {"choices":[{"delta":{"content":"lo"}}]}
data: [DONE]
Billing still finalizes after the stream completes: usage and charge are recorded once the full response is known.
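If you read the stream without the SDK, the wire format above can be reassembled by hand. The following is a minimal sketch, not the SDK's implementation: the helper name iter_sse_content is hypothetical, and it assumes each event arrives as a single data: line holding either a JSON object shaped like the samples above or the [DONE] sentinel.

```python
import json
from typing import Iterable, Iterator


def iter_sse_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield content deltas from SSE lines until the [DONE] sentinel.

    Hypothetical helper: assumes one JSON payload per "data:" line.
    """
    for raw in lines:
        line = raw.strip()
        # Skip blank separator lines and any non-data fields.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # End of stream; nothing after this is read.
        event = json.loads(payload)
        for choice in event.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                yield content


# The sample lines from the wire format, with blank separators between events.
sample = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    "",
    "data: [DONE]",
]
text = "".join(iter_sse_content(sample))
print(text)  # Hello
```

In practice, lines would come from a streaming HTTP response body rather than a list. The parser stops at [DONE] instead of waiting for the connection to close, so it does not depend on when the server hangs up.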