Low-latency output
Streaming
Omixa supports Server-Sent Events for chat and compatible reasoning responses while keeping billing and usage capture consistent.
Server-Sent Events
Set `stream` to `true` in the request body, then read each `data:` frame until the `[DONE]` sentinel arrives. The event body stays compatible with OpenAI-style clients where the selected route supports it.
const response = await fetch('/api/v1/chat/completions', {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${process.env.OMIXA_KEY}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'gpt-5',
    stream: true,
    messages: [{ role: 'user', content: 'Stream this.' }]
  })
});
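The request above only opens the stream; the frames still have to be read. The following is a minimal sketch of one way to do that, assuming each event arrives as a single `data:` line (the common OpenAI-style layout) and that the runtime provides `TextDecoder` and a readable `response.body`. The helper names `extractDataFrames` and `readStream` are hypothetical and not part of Omixa.

```javascript
// Hypothetical helper: splits buffered SSE text into complete `data:` payloads
// and returns any trailing partial line so it can be prepended to the next chunk.
function extractDataFrames(buffer) {
  const lines = buffer.split('\n');
  const rest = lines.pop(); // the last element may be an incomplete line
  const payloads = [];
  for (const line of lines) {
    const trimmed = line.trim(); // also drops a trailing '\r'
    if (trimmed.startsWith('data:')) {
      payloads.push(trimmed.slice(5).trim());
    }
  }
  return { payloads, rest };
}

// Hypothetical helper: reads a fetch Response body and yields each parsed
// JSON event, stopping when the `[DONE]` sentinel is seen.
// ASSUMPTION: every payload other than `[DONE]` is valid JSON.
async function* readStream(response) {
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const { payloads, rest } = extractDataFrames(buffer);
    buffer = rest;
    for (const payload of payloads) {
      if (payload === '[DONE]') return;
      yield JSON.parse(payload);
    }
  }
}
```

With the `response` from the request above, the generator could be consumed with `for await (const event of readStream(response)) { ... }`. Buffering the trailing partial line matters because a network chunk can end in the middle of a frame.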
Reasoning streams
Reasoning-capable models may expose reasoning summaries or thinking deltas when the feature is enabled. Treat these fields as input for developer tools and observability; do not base end-user guarantees on them.
- Use `reasoning_effort` only on model families that support it.
- Use `show_thinking` when you need model-native reasoning summaries.
- Always handle normal assistant text and reasoning chunks separately.
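As an illustration of the points above, here is a minimal sketch that enables reasoning output in the request body and keeps reasoning chunks apart from assistant text. Several details are assumptions rather than documented behavior: the value `'low'` for `reasoning_effort`, the boolean form of `show_thinking`, the OpenAI-style `choices[0].delta.content` path, and the field name `delta.reasoning`, which is purely hypothetical. Check the actual event shape for your route before relying on any of them.

```javascript
// ASSUMPTION: 'low' and `true` are illustrative values, not documented ones.
const requestBody = {
  model: 'gpt-5',
  stream: true,
  reasoning_effort: 'low', // only on model families that support it
  show_thinking: true,     // request model-native reasoning summaries
  messages: [{ role: 'user', content: 'Stream this.' }]
};

// Hypothetical helper: routes one parsed stream event into separate buffers.
// ASSUMPTION: `delta.content` holds assistant text (OpenAI-style) and
// `delta.reasoning` holds reasoning deltas; the second name is made up here.
function routeDelta(event, state) {
  const delta = event?.choices?.[0]?.delta ?? {};
  if (typeof delta.content === 'string') state.text += delta.content;
  if (typeof delta.reasoning === 'string') state.reasoning += delta.reasoning;
  return state;
}

// Simulated events standing in for parsed `data:` frames.
const state = { text: '', reasoning: '' };
const events = [
  { choices: [{ delta: { reasoning: 'Considering the request. ' } }] },
  { choices: [{ delta: { content: 'Hello' } }] },
  { choices: [{ delta: { content: ', world.' } }] }
];
for (const event of events) routeDelta(event, state);
```

Keeping two buffers means the reasoning text can go to logs or a debug panel while only `state.text` is shown to the end user.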