TEXT TO SPEECH
Convert text to natural-sounding speech using Sarvam AI.
Overview
The Text to Speech API converts text into audio using Sarvam AI. It targets Indic languages and offers 39 voices across 11 languages.
Endpoint: POST /v1/audio/speech
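For clients that do not use the OpenAI SDK, the endpoint can in principle be called directly over HTTP. The following is a minimal sketch, assuming the API accepts a JSON body and an `Authorization: Bearer` header in the same way as other OpenAI-compatible services; neither detail is stated on this page. `build_speech_request` and `synthesize` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "https://api.callmissed.com/v1"  # base URL used in the SDK example on this page


def build_speech_request(api_key: str, text: str, voice: str = "shubh") -> urllib.request.Request:
    """Build (but do not send) a POST request to /audio/speech."""
    payload = {"model": "bulbul:v3", "voice": voice, "input": text}
    return urllib.request.Request(
        url=f"{BASE_URL}/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: Bearer-token auth, as in OpenAI-compatible APIs.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize(api_key: str, text: str, out_path: str = "speech.mp3") -> None:
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(api_key, text)
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as f:
        f.write(response.read())
```

Keeping request construction separate from sending makes the payload easy to inspect before any network call is made.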
Basic Usage
from openai import OpenAI

client = OpenAI(
    api_key="cm_your_key",
    base_url="https://api.callmissed.com/v1"
)

response = client.audio.speech.create(
    model="bulbul:v3",
    voice="shubh",
    input="Namaste, kaise hain aap?"
)
response.stream_to_file("speech.mp3")

Parameters
| Parameter | Type | Description |
|---|---|---|
| model | string | bulbul:v3 (Sarvam) |
| input | string | Text to synthesize |
| voice | string | Voice ID; defaults to shubh for Sarvam (39 voices available) |
| language | string | Language code (e.g. hi-IN, ta-IN) |
| speed | number | Speech speed, 0.5–2.0 (default 1.0); maps to Sarvam's pace |
| speech_sample_rate | integer | Sample rate in Hz: 8000, 16000, 22050, 24000, or 48000 |
| response_format | string | Output format; see Audio Formats below |
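In the OpenAI Python SDK, `speed` and `response_format` are named arguments of `client.audio.speech.create`, but `language` and `speech_sample_rate` are not. One way to send them is the SDK's `extra_body` argument, which merges additional fields into the JSON request body. The sketch below assumes the API reads these two fields from the top level of the request body, which this page does not confirm; `build_speech_kwargs` is a hypothetical helper that also checks the ranges listed in the table.

```python
ALLOWED_SAMPLE_RATES = {8000, 16000, 22050, 24000, 48000}  # Hz, from the parameter table


def build_speech_kwargs(text, voice="shubh", language=None, speed=1.0, speech_sample_rate=None):
    """Return keyword arguments for client.audio.speech.create()."""
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    if speech_sample_rate is not None and speech_sample_rate not in ALLOWED_SAMPLE_RATES:
        raise ValueError(f"unsupported sample rate: {speech_sample_rate}")

    kwargs = {"model": "bulbul:v3", "voice": voice, "input": text, "speed": speed}

    # Assumption: fields the SDK does not know about are passed via extra_body.
    extra = {}
    if language is not None:
        extra["language"] = language
    if speech_sample_rate is not None:
        extra["speech_sample_rate"] = speech_sample_rate
    if extra:
        kwargs["extra_body"] = extra
    return kwargs
```

With the `client` from Basic Usage, the result could be passed as `client.audio.speech.create(**build_speech_kwargs("Namaste, kaise hain aap?", language="hi-IN", speech_sample_rate=24000))`.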
Audio Formats
Supported values for response_format: mp3, opus, aac, flac, wav, pcm
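Because the saved file should match the requested format, it can help to derive the filename from `response_format`. A minimal sketch, assuming each format value doubles as a reasonable file extension (this page does not specify extensions or container formats); `output_path` is a hypothetical helper.

```python
from pathlib import Path

SUPPORTED_FORMATS = ("mp3", "opus", "aac", "flac", "wav", "pcm")  # values listed under Audio Formats


def output_path(stem: str, response_format: str) -> Path:
    """Return '<stem>.<format>' after checking that the format is supported."""
    fmt = response_format.lower()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    # Assumption: the format name is used directly as the file extension.
    return Path(f"{stem}.{fmt}")
```

The same value would then be passed as `response_format` to `client.audio.speech.create`, so the request and the filename stay in sync.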