Skip to main content

API Guides & Tutorials > Voice Agent

VOICE SESSION API

REST API for creating and managing LiveKit-based voice agent sessions.

Overview

The Voice Session API provides a two-step flow for voice agent interactions:

  1. Create a session via REST — returns a LiveKit room URL + JWT
  2. Connect via LiveKit WebRTC — stream audio with the livekit-client SDK; the agent joins automatically and handles STT → LLM → TTS

The voice pipeline runs inside a LiveKit agent process. The backend's role is session metadata, token issuance, usage tracking, and transcript storage. There is no direct WebSocket between the browser and the CallMissed API — audio flows over WebRTC to the LiveKit server.

Authentication: All REST endpoints accept both JWT (Authorization: Bearer <jwt>) and API key (Authorization: Bearer cm_<key>). API keys must have stt, tts, and llm permissions to create a session.

Create Session

bash
curl -X POST https://api.callmissed.com/v1/voice/sessions \
  -H "Authorization: Bearer cm_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "system_prompt": "You are a helpful assistant.",
    "voice": "shubh",
    "language": "en-IN",
    "llm_model": "kimi-k2.5",
    "tts_provider": "sarvam",
    "max_duration_seconds": 300,
    "webhook_url": "https://your-app.com/webhooks/voice"
  }'

Response (201 Created):

json
{
  "id": "7c2b9e30-1d8a-4c5f-9b3d-2f4a6e8b1c2d",
  "tenant_id": "0a1b2c3d-4e5f-6a7b-8c9d-0e1f2a3b4c5d",
  "bot_id": null,
  "status": "created",
  "config": {
    "system_prompt": "You are a helpful assistant.",
    "voice": "shubh",
    "language": "en-IN",
    "llm_model": "kimi-k2.5",
    "tts_provider": "sarvam",
    "max_duration_seconds": 300,
    "livekit_room": "voice-<random>"
  },
  "ws_url": "wss://livekit.callmissed.com",
  "token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
  "started_at": null,
  "ended_at": null,
  "duration_seconds": null,
  "turn_count": 0,
  "total_audio_seconds": 0,
  "end_reason": null,
  "metadata": null,
  "created_at": "2026-04-19T12:00:00Z"
}
  • ws_url is the LiveKit server URL — not the CallMissed API.
  • token is a LiveKit JWT (not an opaque vs_* string). TTL is 1 hour.
  • The token is returned once on creation and is not fetchable again.

Request Body

FieldTypeDefaultNotes
bot_iduuidOptional bot to load prompt/knowledge from
system_promptstring"You are a helpful voice assistant..."Max 4096 chars. Overrides bot's prompt if both set
voicestringshubhSarvam TTS voice ID (39 voices)
languagestringen-INBCP-47 language for STT + TTS
llm_modelstringkimi-k2.5Any catalog model (sarvam-30b, sarvam-105b, openrouter/*, etc.). kimi-k2.5-fast is currently under maintenance.
tts_providerstringsarvamCurrently sarvam only
max_duration_secondsint30030–3600
webhook_urlstringReceives session events (see below)
metadataobjectArbitrary JSON stored with the session

Connect via LiveKit

Use ws_url + token returned by create. Do not try to open a WebSocket to the CallMissed API — use the LiveKit client:

bash
npm install livekit-client
javascript
import { Room, RoomEvent, Track } from "livekit-client";

const session = await fetch("https://api.callmissed.com/v1/voice/sessions", {
  method: "POST",
  headers: {
    "Authorization": "Bearer cm_your_api_key",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    system_prompt: "You are a helpful assistant.",
    voice: "shubh",
    language: "en-IN",
    llm_model: "kimi-k2.5",
  }),
}).then(r => r.json());

const room = new Room();

room.on(RoomEvent.TrackSubscribed, (track) => {
  if (track.kind === Track.Kind.Audio) {
    document.body.appendChild(track.attach());
  }
});

room.on(RoomEvent.TranscriptionReceived, (segments, participant) => {
  for (const seg of segments) {
    if (!seg.final) continue;
    const who = participant?.isLocal ? "You" : "Agent";
    console.log(who + ": " + seg.text);
  }
});

await room.connect(session.ws_url, session.token);
await room.localParticipant.setMicrophoneEnabled(true);

The agent joins the room, greets the user, listens for speech, and responds. Server-side VAD detects speech boundaries; interruptions are handled automatically (speak while the agent is talking and it stops and listens).

List Sessions

bash
curl "https://api.callmissed.com/v1/voice/sessions?status=completed&limit=50&offset=0" \
  -H "Authorization: Bearer cm_your_api_key"

Query parameters:

ParamValuesDefault
statuscreated / active / completed / failed / timeout
limit1–20050
offset≥ 00

Returns an array of VoiceSessionOut objects (same shape as Get Session).

Get Session

bash
curl https://api.callmissed.com/v1/voice/sessions/{id} \
  -H "Authorization: Bearer cm_your_api_key"

Response contains everything from the create response except ws_url and token — those are issued once at creation.

Get Transcript

bash
curl "https://api.callmissed.com/v1/voice/sessions/{id}/transcript?format=json" \
  -H "Authorization: Bearer cm_your_api_key"

`format` query param — defaults to json:

FormatContent-TypeShape
jsonapplication/jsonArray of turns: turn_index, user_transcript, agent_response, interrupted, stt_ms, first_token_ms, first_audio_ms, total_ms, llm_model, created_at
txttext/plainHuman-readable alternating User: / Agent: lines
srtapplication/x-subripSubRip subtitles with timing derived from per-turn durations

Delete Session

bash
curl -X DELETE https://api.callmissed.com/v1/voice/sessions/{id} \
  -H "Authorization: Bearer cm_your_api_key"

Returns 204 No Content. Sessions in created or active state are marked completed with end_reason = "api_delete"; already-finished sessions are left unchanged.

Limits

LimitValueBehavior on exceed
Session create rate10 / minute / tenantHTTP 429
Concurrent active sessions (free)1HTTP 429
Concurrent active sessions (starter)5HTTP 429
Concurrent active sessions (pro)20HTTP 429
Concurrent active sessions (enterprise)unlimited
Minimum credit balance to createserver-configuredHTTP 402
Max session duration3600s (capped by max_duration_seconds)session auto-ends
LiveKit token TTL3600s (1 hour)reconnect requires a new session

Webhook Events

If webhook_url is set on session creation, the following events are delivered as POST requests with JSON body and HMAC-SHA256 signature:

EventWhen
voice_session.startedSession created (token issued)
voice_session.endedSession marked completed (normal finish or DELETE)
voice_session.failedSession entered failed state

Delivery headers:

HeaderValue
X-CallMissed-EventEvent name (e.g. voice_session.started)
X-CallMissed-DeliveryDelivery UUID (unique per attempt batch)
X-CallMissed-Signaturesha256=<hex> HMAC of the raw body using your webhook secret

Verify the signature:

python
import hmac, hashlib

raw_body = await request.body()                            # bytes — do not re-serialize
received = request.headers["X-CallMissed-Signature"]       # e.g. "sha256=abc123..."
expected = "sha256=" + hmac.new(
    webhook_secret.encode(), raw_body, hashlib.sha256
).hexdigest()
assert hmac.compare_digest(expected, received)
javascript
import crypto from "node:crypto";

const received = req.headers["x-callmissed-signature"];   // "sha256=..."
const expected = "sha256=" + crypto
  .createHmac("sha256", webhookSecret)
  .update(rawBody)                                        // raw Buffer / string
  .digest("hex");

if (!crypto.timingSafeEqual(Buffer.from(expected), Buffer.from(received))) {
  return res.status(401).send("invalid signature");
}
POST/v1/voice/sessions

Create a voice session (returns LiveKit token + URL)

GET/v1/voice/sessions

List voice sessions

GET/v1/voice/sessions/:id

Get session details

GET/v1/voice/sessions/:id/transcript

Get session transcript (json | txt | srt)

DELETE/v1/voice/sessions/:id

End a session

Was this page helpful?