Chat Completion

Generate text responses using our OpenAI-compatible chat completion API.

Overview

The Chat Completion API generates AI responses given a list of messages. It's fully OpenAI-compatible — use the same SDK and request format.

Endpoint: POST /v1/chat/completions

How a request flows

Every chat completion takes the same path through the platform — your app never talks to the underlying provider directly:

1. Your app: send POST /v1/chat/completions with model + messages.
2. CallMissed gateway: authenticate the cm_ key, check credits, route by model id.
3. Provider: run inference on the best-fit backend, picked from the model id.
4. CallMissed gateway: stream tokens back and deduct credits when the response completes.
5. Your app: receive the completion (all at once, or token-by-token when streaming).

Tip: The model id decides routing automatically. Bare ids (for example sarvam-30b) and slash-prefixed ids (for example openai/gpt-5.4-mini) are each routed to the right backend for you. See How CallMissed Works.
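For readers not using an SDK, the first step of the flow above can be sketched as a plain HTTP request. This is a minimal sketch, not the platform's reference client: it assumes the gateway accepts the cm_ key as an OpenAI-style Authorization: Bearer header (implied by OpenAI compatibility, but not stated on this page), and build_chat_request is a hypothetical helper name.

```python
import json
import urllib.request

BASE_URL = "https://api.callmissed.com/v1"


def build_chat_request(api_key, model, messages):
    """Build (but do not send) a POST /v1/chat/completions request."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=body,
        method="POST",
        headers={
            # Assumption: OpenAI-style bearer authentication with the cm_ key.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


req = build_chat_request(
    "cm_your_key",
    "sarvam-30b",
    [{"role": "user", "content": "What is the capital of India?"}],
)
print(req.get_method(), req.full_url)

# To actually send it (requires a valid key and network access):
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The request is built separately from sending so that the URL, headers, and body can be inspected before any credits are spent.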

Make your first request

1. Get an API key. Create a key in the dashboard (Profile → API Keys). It looks like cm_xxxx… and is shown only once.
2. Point your SDK at CallMissed. Set the base URL to https://api.callmissed.com/v1 and pass your cm_ key. No other change to your OpenAI code is needed.
3. Send messages and read the reply. Call chat.completions.create with a model and a messages array, then read response.choices[0].message.content.

Send a single-turn or multi-turn conversation and receive a complete response. Use any OpenAI SDK — set base_url to https://api.callmissed.com/v1 and api_key to your cm_ key.

Basic Usage

```python
from openai import OpenAI

client = OpenAI(
    api_key="cm_your_key",
    base_url="https://api.callmissed.com/v1"
)

response = client.chat.completions.create(
    model="sarvam-30b",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of India?"}
    ]
)

print(response.choices[0].message.content)
```
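A multi-turn conversation uses the same call with a longer messages array. The sketch below assumes the endpoint is stateless in the way OpenAI's chat completions are, so the caller resends the full history on every request. add_turn is a hypothetical helper, and the assistant text "New Delhi." is an illustrative placeholder rather than real model output.

```python
def add_turn(history, user_text, assistant_text):
    """Return a new history with one user/assistant exchange appended."""
    return history + [
        {"role": "user", "content": user_text},
        {"role": "assistant", "content": assistant_text},
    ]


history = [{"role": "system", "content": "You are a helpful assistant."}]
# Placeholder reply; in practice this is the content of the previous response.
history = add_turn(history, "What is the capital of India?", "New Delhi.")

# The follow-up question is sent together with everything said so far.
messages = history + [{"role": "user", "content": "What is its population?"}]

# With a client configured as in Basic Usage:
# response = client.chat.completions.create(model="sarvam-30b", messages=messages)
# print(response.choices[0].message.content)
```

Returning a new list instead of mutating the old one keeps earlier snapshots of the conversation intact, which helps when retrying a failed request.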

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| model | string | Model ID (e.g. sarvam-30b, openai/gpt-5.4-mini) |
| messages | array | List of {role, content} objects. The system prompt goes here as {"role": "system", "content": "..."} |
| stream | boolean | Enable streaming SSE responses |
| temperature | number | Sampling temperature (0–2) |
| max_tokens | integer | Maximum tokens to generate |
| n | integer | Number of completions to generate (default 1) |
| top_p | float | Nucleus sampling (0–1) |
| top_k | integer | Top-K sampling |
| frequency_penalty | float | Penalize tokens in proportion to how often they have already appeared (−2 to 2) |
| presence_penalty | float | Penalize tokens that have already appeared, which encourages new topics (−2 to 2) |
| repetition_penalty | float | Reduce repetition (0–2) |
| seed | integer | Seed for deterministic sampling |
| stop | array | Stop sequences |
| logit_bias | object | Token probability adjustments |
| logprobs | boolean | Return log probabilities |
| top_logprobs | integer | Top N log probs per token |
| tools | array | Tool/function definitions for function calling |
| parallel_tool_calls | boolean | Allow parallel function calls |
| response_format | object | {"type": "json_object"} or {"type": "json_schema", "json_schema": {...}} |
| structured_outputs | boolean | Enforce strict JSON schema |
| stream_options | object | {"include_usage": true} to get token counts in the stream |
| reasoning_effort | string | "none" / "minimal" / "low" / "medium" / "high" (see the per-model matrix below) |
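The stream and stream_options rows work together, which is easiest to see in code. The following is a sketch under the assumption that streamed chunks follow the OpenAI chunk shape (text in choices[0].delta.content, and a final chunk that carries usage with an empty choices list when include_usage is set). collect_stream and stream_reply are hypothetical helper names.

```python
def collect_stream(chunks):
    """Join streamed text deltas and capture the usage block if one arrives."""
    parts, usage = [], None
    for chunk in chunks:
        # Assumption: with include_usage, the last chunk has usage and no choices.
        if getattr(chunk, "usage", None) is not None:
            usage = chunk.usage
        for choice in chunk.choices:
            text = getattr(choice.delta, "content", None)
            if text:
                parts.append(text)
    return "".join(parts), usage


def stream_reply(client, model, messages):
    """Request a streamed completion and return (text, usage)."""
    stream = client.chat.completions.create(
        model=model,
        messages=messages,
        stream=True,
        stream_options={"include_usage": True},
    )
    return collect_stream(stream)
```

In an interactive app you would print each delta as it arrives instead of joining them at the end; the accumulation here only keeps the example easy to check.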

Frontier Parameters

When using slash-prefixed frontier models, these additional parameters are supported:

| Parameter | Type | Description |
| --- | --- | --- |
| provider | object | Provider routing preferences (sort, order, only, ignore, max_price) |
| models | array | Fallback model list |
| route | string | "fallback" |
| plugins | array | [{"id": "web"}] for web search; other plugin ids: "file-parser", "response-healing", "context-compression" |
| reasoning | object | {"effort": "high", "max_tokens": 5000} |
| transforms | array | ["middle-out"] for context compression |
Note (OpenAI Python SDK): The OpenAI client validates keyword arguments against its known parameters and rejects provider=... (and the other parameters above) with TypeError: Completions.create() got an unexpected keyword argument 'provider'. Pass them via extra_body instead:

```python
client.chat.completions.create(
    model="openai/gpt-5.4",
    messages=[...],
    extra_body={
        "provider": {"sort": "throughput", "order": ["openai"]},
        "models": ["openai/gpt-5.4-mini"],
    },
)
```

Raw HTTP / curl users can keep provider at the top level; only the OpenAI SDK gates kwargs.
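When request parameters come from configuration, it can help to route them into extra_body mechanically. The sketch below uses a hypothetical helper, split_sdk_params. The key set holds the frontier parameters from the table above plus top_k, repetition_penalty, and structured_outputs, which are assumed here not to be named keyword arguments of the OpenAI Python SDK either; verify that against the SDK version you have installed.

```python
# Parameters assumed NOT to be named keyword arguments of the OpenAI Python
# SDK, so they travel in extra_body instead of at the top level.
EXTRA_BODY_KEYS = {
    "provider", "models", "route", "plugins", "reasoning", "transforms",
    "top_k", "repetition_penalty", "structured_outputs",
}


def split_sdk_params(params):
    """Split request params into SDK kwargs plus an extra_body dict."""
    kwargs = {k: v for k, v in params.items() if k not in EXTRA_BODY_KEYS}
    extra = {k: v for k, v in params.items() if k in EXTRA_BODY_KEYS}
    if extra:
        kwargs["extra_body"] = extra
    return kwargs


kwargs = split_sdk_params({
    "model": "openai/gpt-5.4",
    "messages": [{"role": "user", "content": "Hello"}],
    "temperature": 0.2,
    "top_k": 40,
    "provider": {"sort": "throughput", "order": ["openai"]},
})

# With a configured client:
# client.chat.completions.create(**kwargs)
```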

Vision (Image Input)

Multimodal content (text + image parts) is accepted on any model whose supports_vision flag is true in GET /v1/models. Models without vision support reject image content with 400 unsupported_image_input before the upstream call, so you're not charged.

```python
from openai import OpenAI

client = OpenAI(api_key="cm_your_key", base_url="https://api.callmissed.com/v1")

resp = client.chat.completions.create(
    model="anthropic/claude-sonnet-4.6",   # supports_vision: true
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
        ],
    }],
)
```

Current vision-capable models: openai/gpt-5.4-pro, openai/gpt-5.4, openai/gpt-5.4-mini, openai/gpt-5.4-nano, anthropic/claude-opus-4.6, anthropic/claude-sonnet-4.6, anthropic/claude-haiku-4.5, google/gemini-3.1-pro-preview, google/gemini-3-flash-preview, google/gemini-3.5-flash, google/gemini-3.1-flash-lite, x-ai/grok-4.20, qwen/qwen3.5-plus, qwen/qwen3.5-flash, kimi-k2.5, kimi-k2.6, kimi-k2.7-code, gemma-4-26b-a4b-it, mistral-small-3.1, mistralai/mistral-small-2603, auto (free plan), openrouter/auto.

Check the live GET /v1/models response for the authoritative list; it's computed from the same set the runtime guard uses.
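Because the live model list is authoritative, a client can check the flag before attaching an image. This is a sketch, assuming the flag surfaces on each SDK model object through model_extra (the same access pattern as the model-listing example on this page). supports_vision and image_message are hypothetical helper names.

```python
def supports_vision(models, model_id):
    """Return True if model_id is listed with supports_vision set to true."""
    for m in models:
        if m.id == model_id:
            extra = getattr(m, "model_extra", None) or {}
            return bool(extra.get("supports_vision"))
    return False  # unknown model ids are treated as not vision-capable


def image_message(text, image_url):
    """Build a user message with one text part and one image part."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


# With a configured client:
# model = "anthropic/claude-sonnet-4.6"
# if supports_vision(client.models.list(), model):
#     msg = image_message("What is in this image?", "https://example.com/cat.png")
#     client.chat.completions.create(model=model, messages=[msg])
```

Checking first avoids a round trip that would otherwise end in 400 unsupported_image_input.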

Context Window

Every model in the catalog advertises a context_window (the token count for the combined prompt + completion). The GET /v1/models response exposes it under two keys for cross-client compatibility:

  • context_window (OpenAI/CallMissed canonical name)
  • context_length (OpenAI SDK convention; same value)

```python
from openai import OpenAI

client = OpenAI(api_key="cm_your_key", base_url="https://api.callmissed.com/v1")

for m in client.models.list():
    extra = m.model_extra or {}
    print(m.id, extra.get("context_window"), extra.get("supports_vision"))
```

Representative context windows (treat GET /v1/models as the authoritative source; the table below is a snapshot):

| Model | context_window |
| --- | --- |
| openai/gpt-5.4, openai/gpt-5.4-pro, openai/gpt-5.4-mini, openai/gpt-5.4-nano | 1,048,576 |
| anthropic/claude-opus-4.6, anthropic/claude-sonnet-4.6 | 1,048,576 |
| google/gemini-3.1-pro-preview, google/gemini-3-flash-preview, google/gemini-3.5-flash, google/gemini-3.1-flash-lite | 1,048,576 |
| nemotron-3-super | 1,048,576 |
| x-ai/grok-4.20 | 262,144 |
| qwen/qwen3.5-plus, qwen/qwen3.5-flash | 262,144 |
| kimi-k2.5, kimi-k2.5-fast, kimi-k2.6, kimi-k2.7-code, glm-5.2 | 262,144 |
| sarvam-105b, gpt-oss-120b, glm-4.7-flash, gemma-4-26b-a4b-it, mistralai/mistral-small-2603 | 131,072 |
| sarvam-30b | 65,536 |
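Since context_window covers the prompt and the completion together, the room left for max_tokens is a simple subtraction. The sketch below uses a hypothetical helper, completion_budget; the prompt token count has to come from the caller (for example from a previous response's usage block), as this page does not describe a tokenizer endpoint.

```python
def completion_budget(context_window, prompt_tokens, reserve=0):
    """Largest max_tokens that keeps prompt + completion inside the window."""
    budget = context_window - prompt_tokens - reserve
    if budget <= 0:
        raise ValueError("prompt does not fit in the model's context window")
    return budget


# sarvam-30b is listed with a 65,536-token window in the snapshot above.
print(completion_budget(65_536, prompt_tokens=60_000))  # 5536
```

The optional reserve argument leaves headroom for cases where the caller's token estimate is approximate.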

Responses API

For clients built on OpenAI's newer Responses API, CallMissed exposes a compatible POST /v1/responses endpoint. It accepts a Responses-shaped body and translates to the same chat engine under the hood — so you can point an OpenAI Responses client at https://api.callmissed.com/v1 without changes.

Endpoint: POST /v1/responses

```python
from openai import OpenAI

client = OpenAI(api_key="cm_your_key", base_url="https://api.callmissed.com/v1")

resp = client.responses.create(
    model="gpt-4.1",
    input="Write a haiku about databases.",
)
print(resp.output_text)
```
  • input accepts a plain string or the Responses message-array form.
  • Streaming is supported (stream: true) and emits Responses-style SSE events.
  • The same models, pricing, vision, and tool-calling support as /v1/chat/completions apply — this is a request/response-shape adapter, not a different model set.

If you're starting fresh, /v1/chat/completions is the most widely-supported surface; use /v1/responses when porting an existing Responses-API integration.
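Streaming over /v1/responses can be consumed along the following lines. This is a sketch under an assumption the page does not confirm: that the "Responses-style SSE events" use OpenAI's event names, with text arriving in response.output_text.delta events that carry a delta string. collect_output_text and stream_response are hypothetical helper names.

```python
def collect_output_text(events):
    """Concatenate text deltas from a Responses-style event stream."""
    parts = []
    for event in events:
        # Assumption: text arrives in OpenAI-named delta events.
        if getattr(event, "type", None) == "response.output_text.delta":
            parts.append(event.delta)
    return "".join(parts)


def stream_response(client, model, prompt):
    """Request a streamed response and return the assembled text."""
    events = client.responses.create(model=model, input=prompt, stream=True)
    return collect_output_text(events)
```

Events of any other type are skipped, so the collector keeps working if the stream also carries lifecycle events.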

Error Format

All errors are returned in the OpenAI-compatible format:

```json
{
  "error": {
    "message": "Invalid API key",
    "type": "invalid_request_error",
    "code": "invalid_api_key"
  }
}
```
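For raw HTTP clients, the error body above can be unpacked with a few lines of standard-library code. This is a minimal sketch; parse_error is a hypothetical helper, and only the three fields shown in the example are assumed to be present.

```python
import json


def parse_error(body):
    """Extract (type, code, message) from an error response body."""
    err = json.loads(body).get("error", {})
    return err.get("type"), err.get("code"), err.get("message")


body = (
    '{"error": {"message": "Invalid API key", '
    '"type": "invalid_request_error", "code": "invalid_api_key"}}'
)
err_type, code, message = parse_error(body)
print(f"{code}: {message}")
```

Branching on code rather than on message text is usually the safer choice, since message wording is more likely to change.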