Real-time streaming of user speech (STT) and agent speech (TTS) events for an active call via Server-Sent Events.
Events stream directly from the call runtime as they are produced. When the call ends, the server sends an sse_close event and then closes the SSE connection. Only active calls can be subscribed to; requests for completed calls return a 400 error.
Transcript event types:
user_interim_transcription: Partial, in-progress transcription as the user speaks. Use for live preview only; will be superseded by user_transcription.
user_transcription: Final transcription for a completed user speech turn.
tts_completed: Fired when the agent finishes speaking a TTS segment. Includes the spoken text and optionally TTS latency.
Lifecycle events:
sse_init: Sent immediately when the SSE connection is established.
sse_close: Sent when the call ends, right before the server closes the connection.
Other event types (e.g. call_start, call_end, turn_latency, metrics) are also sent on this stream.
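As an illustration of how the transcript events relate to each other, here is a minimal sketch of a consumer that keeps a live preview and replaces it once the final transcription arrives. Only the event_type values come from this page. The TranscriptView class and the "text" key used to read the payload are hypothetical; substitute the actual field names from the response schema.

```python
from dataclasses import dataclass, field


@dataclass
class TranscriptView:
    """Tracks finalized turns plus the current in-progress user preview."""

    turns: list = field(default_factory=list)  # (speaker, text) tuples
    preview: str = ""

    def handle(self, event: dict) -> None:
        kind = event.get("event_type")
        # ASSUMPTION: "text" is a placeholder key, not a documented field name.
        text = event.get("text", "")
        if kind == "user_interim_transcription":
            # Interim text is only a preview; overwrite it on every update.
            self.preview = text
        elif kind == "user_transcription":
            # The final transcription supersedes whatever preview was shown.
            self.turns.append(("user", text))
            self.preview = ""
        elif kind == "tts_completed":
            self.turns.append(("agent", text))
        # Other event types (call_start, metrics, ...) are ignored here.


view = TranscriptView()
for ev in [
    {"event_type": "sse_init"},
    {"event_type": "user_interim_transcription", "text": "hello th"},
    {"event_type": "user_transcription", "text": "hello there"},
    {"event_type": "tts_completed", "text": "Hi!"},
]:
    view.handle(ev)
```

After the loop, view.turns holds the finalized user and agent turns in order, and view.preview is empty because the interim text was replaced by the final transcription.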
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
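A request carrying this header might be built as in the sketch below, using only the Python standard library. The host and path are hypothetical placeholders, since this page does not state the endpoint URL; build_events_request is likewise an illustrative helper, not part of any SDK.

```python
import urllib.request
from urllib.parse import quote

# ASSUMPTION: placeholder host and path; replace with the real endpoint URL.
BASE_URL = "https://api.example.com"
EVENTS_PATH = "/calls/{call_id}/events"


def build_events_request(call_id: str, token: str) -> urllib.request.Request:
    """Build a GET request for the event stream with Bearer authentication."""
    url = BASE_URL + EVENTS_PATH.format(call_id=quote(call_id, safe=""))
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            # Ask for a Server-Sent Events response.
            "Accept": "text/event-stream",
        },
    )


req = build_events_request("CALL-1758124225863-80752e", "YOUR_TOKEN")
```

Passing req to urllib.request.urlopen would open the stream. With urllib, an error status such as the 400 returned for a completed call surfaces as urllib.error.HTTPError.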
The ID of the call whose events you want to subscribe to
"CALL-1758124225863-80752e"
SSE event stream established successfully
Events are sent as data: <JSON>\n\n. Each event has an event_type field.
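The framing described above can be decoded with a small parser. The following is a minimal sketch that assumes the stream is available as an iterable of text lines (for example, decoded lines from an HTTP response); iter_sse_events is a hypothetical helper name. It collects data: lines until the blank line that ends an event, parses the JSON, and stops after sse_close.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON object per `data: <JSON>` event."""
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format allows one optional space after the colon.
            if value.startswith(" "):
                value = value[1:]
            data_parts.append(value)
        elif line == "" and data_parts:
            # A blank line terminates the event.
            event = json.loads("\n".join(data_parts))
            data_parts = []
            yield event
            if event.get("event_type") == "sse_close":
                return  # The server closes the connection after this event.


sample = [
    'data: {"event_type": "sse_init"}\n',
    "\n",
    'data: {"event_type": "user_transcription"}\n',
    "\n",
    'data: {"event_type": "sse_close"}\n',
    "\n",
    'data: {"event_type": "never_reached"}\n',
    "\n",
]
events = list(iter_sse_events(sample))
```

In this sample, iteration ends at sse_close, so the trailing event is never yielded.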
The type of event
sse_init, user_interim_transcription, user_transcription, tts_completed, sse_close
Unique identifier for the event
ISO 8601 timestamp of the event
The call ID this event belongs to
Partial transcription text (only for user_interim_transcription)
Final transcription text (only for user_transcription)
Text spoken by the agent (only for tts_completed)
TTS latency in milliseconds (only for tts_completed)