CHAT COMPLETION

Generate text responses using our OpenAI-compatible chat completion API.

Overview

The Chat Completion API generates AI responses given a list of messages. It's fully OpenAI-compatible — use the same SDK and request format.

Endpoint: POST /v1/chat/completions
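For readers not using an SDK, the endpoint can also be called over plain HTTP. The following is a minimal sketch that only builds the request object without sending it. The helper name build_chat_request is hypothetical, and the Bearer authorization header is an assumption based on the OpenAI convention rather than something this page states.

```python
import json
import urllib.request

BASE_URL = "https://api.callmissed.com/v1"


def build_chat_request(api_key, model, messages):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            # Assumption: Bearer auth, as in the OpenAI HTTP API.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request(
    "cm_your_key",
    "sarvam-30b",
    [{"role": "user", "content": "What is the capital of India?"}],
)
print(req.get_method(), req.full_url)
```

Passing req to urllib.request.urlopen would perform the actual call; the sketch stops short of that so it can be run without a key.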

Basic Usage

from openai import OpenAI

client = OpenAI(
    api_key="cm_your_key",
    base_url="https://api.callmissed.com/v1"
)

response = client.chat.completions.create(
    model="sarvam-30b",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of India?"}
    ]
)

print(response.choices[0].message.content)

Parameters

| Parameter | Type | Description |
|---|---|---|
| model | string | Model ID (e.g. sarvam-30b, openai/gpt-5.4-mini) |
| messages | array | List of {role, content} objects. System prompt goes here as {"role": "system", "content": "..."} |
| stream | boolean | Enable streaming SSE responses |
| temperature | number | Sampling temperature (0–2) |
| max_tokens | integer | Maximum tokens to generate |
| n | integer | Number of completions to generate (default 1) |
| top_p | float | Nucleus sampling (0–1) |
| top_k | integer | Top-K sampling |
| frequency_penalty | float | Penalize repeated tokens (−2 to 2) |
| presence_penalty | float | Penalize new topics (−2 to 2) |
| repetition_penalty | float | Reduce repetition (0–2) |
| seed | integer | Deterministic sampling |
| stop | array | Stop sequences |
| logit_bias | object | Token probability adjustments |
| logprobs | boolean | Return log probabilities |
| top_logprobs | integer | Top N log probs per token |
| tools | array | Tool/function definitions for function calling |
| parallel_tool_calls | boolean | Allow parallel function calls |
| response_format | object | {"type": "json_object"} or {"type": "json_schema", "json_schema": {...}} |
| structured_outputs | boolean | Enforce strict JSON schema |
| stream_options | object | {"include_usage": true} to get token counts in stream |
| reasoning_effort | string | "none" / "minimal" / "low" / "medium" / "high" — see the per-model matrix below |
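The stream and stream_options parameters are easiest to understand together. The sketch below shows one way to accumulate streamed text and pick up the usage object. It assumes chunks follow the OpenAI SDK shape (choices[0].delta.content, plus a final chunk with empty choices and a usage field when include_usage is set). The helper names collect_stream and _fake_chunk are hypothetical, and the fake chunks exist only so the example runs offline.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Accumulate streamed text and return it with the usage object, if any."""
    parts = []
    usage = None
    for chunk in chunks:
        # With include_usage, the usage-bearing chunk usually has no choices.
        if getattr(chunk, "usage", None) is not None:
            usage = chunk.usage
        for choice in chunk.choices:
            text = choice.delta.content
            if text:
                parts.append(text)
    return "".join(parts), usage


def _fake_chunk(text=None, usage=None):
    """Hypothetical stand-in for an SDK chunk (assumed shape, not real data)."""
    choices = []
    if text is not None:
        choices = [SimpleNamespace(delta=SimpleNamespace(content=text))]
    return SimpleNamespace(choices=choices, usage=usage)


demo_chunks = [
    _fake_chunk("New "),
    _fake_chunk("Delhi"),
    _fake_chunk(usage=SimpleNamespace(prompt_tokens=12, completion_tokens=2)),
]
text, usage = collect_stream(demo_chunks)
print(text)
```

With a real client, the iterable would come from client.chat.completions.create(..., stream=True, stream_options={"include_usage": True}) and could be passed to collect_stream unchanged.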

Frontier Parameters

When using slash-prefixed frontier models, these additional parameters are supported:

| Parameter | Type | Description |
|---|---|---|
| provider | object | Provider routing preferences (sort, order, only, ignore, max_price) |
| models | array | Fallback model list |
| route | string | "fallback" |
| plugins | array | [{"id": "web"}] for web search, "file-parser", "response-healing", "context-compression" |
| reasoning | object | {"effort": "high", "max_tokens": 5000} |
| transforms | array | ["middle-out"] for context compression |

> OpenAI Python SDK note: The OpenAI client validates kwargs against its known parameters and rejects provider=... (and the others above) with TypeError: Completions.create() got an unexpected keyword argument 'provider'. Pass them via extra_body instead:
>
> ```python
> client.chat.completions.create(
>     model="openai/gpt-5.4",
>     messages=[...],
>     extra_body={
>         "provider": {"sort": "throughput", "order": ["OpenAI", "Azure"]},
>         "models": ["openai/gpt-5.4-mini"],
>     },
> )
> ```
>
> Raw HTTP / curl users can keep provider at the top level; only the OpenAI SDK gates kwargs.
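
If request parameters are assembled in one flat dictionary, the frontier keys can be moved into extra_body mechanically before calling the SDK. This is a small sketch of that idea; split_frontier_params and FRONTIER_KEYS are hypothetical names, and the key set simply mirrors the Frontier Parameters table.

```python
# Keys taken from the Frontier Parameters table; the OpenAI SDK
# does not accept them as direct keyword arguments.
FRONTIER_KEYS = {"provider", "models", "route", "plugins", "reasoning", "transforms"}


def split_frontier_params(params):
    """Return SDK-ready kwargs with frontier keys nested under extra_body."""
    sdk_kwargs = {k: v for k, v in params.items() if k not in FRONTIER_KEYS}
    frontier = {k: v for k, v in params.items() if k in FRONTIER_KEYS}
    if frontier:
        # Preserve any extra_body the caller already supplied.
        merged = dict(sdk_kwargs.get("extra_body") or {})
        merged.update(frontier)
        sdk_kwargs["extra_body"] = merged
    return sdk_kwargs


flat = {
    "model": "openai/gpt-5.4",
    "messages": [{"role": "user", "content": "Hello"}],
    "provider": {"sort": "throughput", "order": ["OpenAI", "Azure"]},
    "models": ["openai/gpt-5.4-mini"],
}
kwargs = split_frontier_params(flat)
print(sorted(kwargs))
```

The result can then be unpacked as client.chat.completions.create(**kwargs).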

Vision (Image Input)

Multimodal content (text + image parts) is accepted on any model whose supports_vision flag is true in GET /v1/models. Models without vision support reject image content with 400 unsupported_image_input before the upstream call, so you're not charged.

from openai import OpenAI

client = OpenAI(api_key="cm_your_key", base_url="https://api.callmissed.com/v1")

resp = client.chat.completions.create(
    model="anthropic/claude-sonnet-4.6",   # supports_vision: true
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
        ],
    }],
)

Current vision-capable models: openai/gpt-5.4-pro, openai/gpt-5.4, openai/gpt-5.4-mini, openai/gpt-5.4-nano, anthropic/claude-opus-4.6, anthropic/claude-sonnet-4.6, anthropic/claude-haiku-4.5, google/gemini-3.1-pro-preview, google/gemini-3-flash-preview, google/gemini-3.1-flash-lite, x-ai/grok-4.20, qwen/qwen3.5-plus, qwen/qwen3.5-flash, kimi-k2.5, kimi-k2.6, gemma-4-26b-a4b-it, mistral-small-3.1, mistralai/mistral-small-2603, auto (free plan), openrouter/auto.

Check the live GET /v1/models response for the authoritative list. It is computed from the same set the runtime guard uses.
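
Because the supports_vision flag is exposed per model, a client can check it before attaching an image rather than waiting for the 400 response. The sketch below assumes model objects expose extra fields through model_extra, as the OpenAI Python SDK does. The helper names and the example/... model IDs are hypothetical and stand in for a live client.models.list() result.

```python
from types import SimpleNamespace


def vision_model_ids(models):
    """Return the IDs of models whose supports_vision flag is true."""
    ids = []
    for m in models:
        extra = getattr(m, "model_extra", None) or {}
        if extra.get("supports_vision") is True:
            ids.append(m.id)
    return ids


def image_message(prompt, image_url):
    """Build a user message with one text part and one image part."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


# Hypothetical catalog entries used only to exercise the helpers offline.
fake_catalog = [
    SimpleNamespace(id="example/vision-model", model_extra={"supports_vision": True}),
    SimpleNamespace(id="example/text-model", model_extra={"supports_vision": False}),
    SimpleNamespace(id="example/no-extras", model_extra=None),
]
print(vision_model_ids(fake_catalog))
```

In practice the list passed to vision_model_ids would be client.models.list(), and image_message would be used only when the chosen model ID appears in the result.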

Context Window

Every model in the catalog advertises a context_window (token count for the combined prompt + completion). The GET /v1/models response exposes it under two keys for cross-client compatibility:

  • context_window (OpenAI/CallMissed canonical name)
  • context_length (OpenAI SDK convention — same value)
from openai import OpenAI

client = OpenAI(api_key="cm_your_key", base_url="https://api.callmissed.com/v1")

for m in client.models.list():
    extra = m.model_extra or {}
    print(m.id, extra.get("context_window"), extra.get("supports_vision"))

Representative context windows (treat GET /v1/models as the authoritative source; the table below is a snapshot):

| Model | context_window |
|---|---|
| openai/gpt-5.4, openai/gpt-5.4-pro, openai/gpt-5.4-mini, openai/gpt-5.4-nano | 1,048,576 |
| anthropic/claude-opus-4.6, anthropic/claude-sonnet-4.6 | 1,048,576 |
| google/gemini-3.1-pro-preview, google/gemini-3-flash-preview, google/gemini-3.1-flash-lite | 1,048,576 |
| nemotron-3-super | 1,048,576 |
| x-ai/grok-4.20 | 262,144 |
| qwen/qwen3.5-plus, qwen/qwen3.5-flash | 262,144 |
| kimi-k2.5, kimi-k2.5-fast, kimi-k2.6 | 262,144 |
| sarvam-105b, gpt-oss-120b, glm-4.7-flash, gemma-4-26b-a4b-it, mistralai/mistral-small-2603 | 131,072 |
| sarvam-30b | 65,536 |
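
Since context_window covers the prompt and the completion together, a request fits only when the prompt token count plus max_tokens stays within it. The following sketch illustrates that arithmetic. The helper names are hypothetical, and counting prompt tokens is left to the caller because this page does not specify a tokenizer.

```python
def context_window_of(extra):
    """Read the window from either key exposed by GET /v1/models."""
    return extra.get("context_window") or extra.get("context_length")


def fits_context(window, prompt_tokens, max_tokens):
    """True when prompt plus requested completion fit in the window."""
    return prompt_tokens + max_tokens <= window


def completion_budget(window, prompt_tokens):
    """Largest max_tokens value that still fits, never negative."""
    return max(window - prompt_tokens, 0)


# 65,536 is the sarvam-30b value from the snapshot table above.
window = context_window_of({"context_window": 65536})
print(fits_context(window, prompt_tokens=60000, max_tokens=4096))
print(completion_budget(window, prompt_tokens=60000))
```

Here 60,000 + 4,096 = 64,096 tokens, which is within 65,536, and the remaining budget for the completion is 5,536 tokens.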

Error Format

All errors return OpenAI-compatible format:

{
  "error": {
    "message": "Invalid API key",
    "type": "invalid_request_error",
    "code": "invalid_api_key"
  }
}
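
Given this envelope, client code can extract the three fields defensively. The sketch below parses a raw response body and returns None when the body does not match the documented shape. The function name parse_error is hypothetical, and how the raw body is obtained (SDK exception or HTTP response) is left open.

```python
import json


def parse_error(body_text):
    """Extract message, type, and code from an error body, or return None."""
    try:
        payload = json.loads(body_text)
    except json.JSONDecodeError:
        return None
    err = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(err, dict):
        return None
    return {
        "message": err.get("message"),
        "type": err.get("type"),
        "code": err.get("code"),
    }


sample = (
    '{"error": {"message": "Invalid API key", '
    '"type": "invalid_request_error", "code": "invalid_api_key"}}'
)
info = parse_error(sample)
print(info["code"])
```

Branching on code rather than on message text is likely the more stable choice, since message strings are meant for humans.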