

AUDIO TRANSLATION

Translate audio in any supported language to English text. OpenAI-compatible endpoint.

Overview

The Audio Translation API translates speech in any of 24 supported languages to English text. This is the OpenAI-compatible /v1/audio/translations endpoint.

Unlike Speech to Text (which transcribes in the original language), this endpoint always outputs English.

Endpoint: POST /v1/audio/translations

Supported input languages: Hindi, Bengali, Tamil, Telugu, Kannada, Malayalam, Marathi, Gujarati, Punjabi, Odia, Assamese, Urdu, Nepali, Konkani, Kashmiri, Sindhi, Sanskrit, Santali, Manipuri, Brij, Maithili, Dogri, and English, plus automatic language detection.
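
For clients that cannot use the OpenAI SDK, the endpoint can presumably be called as a plain multipart POST, as OpenAI-compatible audio endpoints usually are. The sketch below only builds the request and does not send it. The base URL comes from the Basic Usage example and the `file` and `model` field names from the Parameters table. The `Authorization: Bearer` header and the helper name `build_translation_request` are assumptions, not confirmed by this page.

```python
import mimetypes
import urllib.request
import uuid

BASE_URL = "https://api.callmissed.com/v1"  # same base URL as the SDK example


def build_translation_request(filename, audio_bytes, api_key, model="saaras:v3"):
    """Build (but do not send) a multipart POST to /audio/translations."""
    boundary = uuid.uuid4().hex
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"

    # Plain form field carrying the model ID.
    model_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
    ).encode()

    # File field carrying the raw audio bytes.
    file_part = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode() + audio_bytes + b"\r\n"

    closing = f"--{boundary}--\r\n".encode()

    return urllib.request.Request(
        url=f"{BASE_URL}/audio/translations",
        data=model_part + file_part + closing,
        method="POST",
        headers={
            # Assumption: Bearer-token auth, as in other OpenAI-compatible APIs.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. With the default `json` response format, the body should decode to an object with a `text` field.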

Basic Usage

from openai import OpenAI

client = OpenAI(
    api_key="cm_your_key",
    base_url="https://api.callmissed.com/v1"
)

# Translate Hindi audio to English text
with open("hindi_audio.wav", "rb") as f:
    translation = client.audio.translations.create(
        model="saaras:v3",
        file=f,
    )

print(translation.text)

Response:

json
{"text": "Hello, how are you? I wanted to discuss the project."}

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| file | file | Yes | Audio file (WAV, MP3, AAC, OGG, FLAC, WebM, M4A) |
| model | string | No | Model ID (default: saaras:v3) |
| response_format | string | No | json (default), text, or verbose_json |
| temperature | float | No | Sampling temperature |
| prompt | string | No | Prompt to guide transcription style |
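
The optional parameters can be checked on the client before uploading. This is a minimal sketch, assuming the listed audio formats map to the usual file extensions. That mapping, and the helper names `build_translation_kwargs` and `translate`, are illustrative and not part of the API. The `client` argument is expected to be an `OpenAI` client configured as in Basic Usage.

```python
from pathlib import Path

# Values copied from the Parameters table; the extension mapping is an assumption.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".aac", ".ogg", ".flac", ".webm", ".m4a"}
RESPONSE_FORMATS = {"json", "text", "verbose_json"}


def build_translation_kwargs(path, model="saaras:v3", response_format="json",
                             temperature=None, prompt=None):
    """Validate the options locally and return keyword arguments for the SDK call."""
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported audio extension: {suffix!r}")
    if response_format not in RESPONSE_FORMATS:
        raise ValueError(f"unknown response_format: {response_format!r}")

    kwargs = {"model": model, "response_format": response_format}
    # Only forward optional parameters that were explicitly set.
    if temperature is not None:
        kwargs["temperature"] = temperature
    if prompt is not None:
        kwargs["prompt"] = prompt
    return kwargs


def translate(client, path, **options):
    """Open the audio file and call the translations endpoint through the SDK."""
    kwargs = build_translation_kwargs(path, **options)
    with open(path, "rb") as audio:
        return client.audio.translations.create(file=audio, **kwargs)
```

For example, `translate(client, "hindi_audio.wav", response_format="verbose_json", temperature=0.0)` would request the detailed response with deterministic sampling.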

Response Formats

json (default)

json
{"text": "Hello, how are you?"}

text

Returns plain text with no JSON wrapping.

verbose_json

json
{
  "task": "translate",
  "language": "en",
  "duration": 4.52,
  "text": "Hello, how are you?",
  "segments": [],
  "words": []
}
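
When the raw HTTP response is handled directly, the `verbose_json` body can be loaded into a small typed structure. The sketch below mirrors only the fields shown in the sample above. The class name `VerboseTranslation` is hypothetical, and the contents of `segments` and `words` are left untyped because this page does not document their shape.

```python
import json
from dataclasses import dataclass, field


@dataclass
class VerboseTranslation:
    task: str
    language: str
    duration: float  # presumably seconds; the page does not state the unit
    text: str
    segments: list = field(default_factory=list)
    words: list = field(default_factory=list)


def parse_verbose_translation(payload: str) -> VerboseTranslation:
    """Parse a verbose_json response body into a VerboseTranslation."""
    data = json.loads(payload)
    return VerboseTranslation(
        task=data["task"],
        language=data["language"],
        duration=float(data["duration"]),
        text=data["text"],
        # Treat missing or null lists as empty.
        segments=data.get("segments") or [],
        words=data.get("words") or [],
    )
```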

> Tip: For transcription in the original language (not translated), use Speech to Text instead. For output modes like transliteration or code-mixing, use the mode parameter on the transcription endpoint.
