Rate Limits & Quotas — CallMissed Docs

Limit Layers

Requests pass through several limits, in order:

Layer	Limit	Scope
Global middleware	200 requests / minute	per IP
Auth endpoints	5–10 requests / minute	per IP (login, register, refresh, OTP)
Per-key RPM	plan defaults: Free 60 · Starter 500 · Pro 3,000 · Enterprise 10,000 (override per key)	per API key
Monthly budget	configurable credit cap	per tenant / per key
Plan limits	tier-based caps on LLM/STT/TTS calls, conversations, storage, team size	per tenant
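
To stay under a per-key RPM ceiling, a client can pace its own requests instead of waiting to be rejected. The following is a minimal sketch, not part of any CallMissed SDK: the `Throttle` class and the `PLAN_RPM` mapping are hypothetical names, and the values are the plan defaults from the table above. It spaces requests evenly, which is more conservative than the server needs to be, since this page does not say how the server-side window is counted.

```python
import time

# Plan default per-key RPM values, copied from the table above.
PLAN_RPM = {"free": 60, "starter": 500, "pro": 3000, "enterprise": 10000}


class Throttle:
    """Spaces calls evenly so a client stays at or below a per-minute ceiling.

    Hypothetical helper for illustration; clock and sleep are injectable
    so the behaviour can be tested without real waiting.
    """

    def __init__(self, rpm, clock=time.monotonic, sleep=time.sleep):
        if rpm <= 0:
            raise ValueError("rpm must be positive")
        self.interval = 60.0 / rpm  # minimum seconds between requests
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = None  # earliest time the next call may proceed

    def wait(self):
        """Block until the next request may be sent; return seconds slept."""
        now = self.clock()
        delay = 0.0
        if self.next_allowed is not None and now < self.next_allowed:
            delay = self.next_allowed - now
            self.sleep(delay)
            now = self.next_allowed
        self.next_allowed = now + self.interval
        return delay
```

Call `throttle.wait()` immediately before each request. If a key has an overridden RPM, pass that value instead of the plan default.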

Set a per-key RPM and a budget cap when issuing keys, and check live consumption with `GET /api/v1/keys/:id/rate-state`.
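
As a rough illustration, the rate-state request could be built as follows. Only the path `/api/v1/keys/:id/rate-state` comes from this page. The host in `BASE_URL`, the Bearer-token `Authorization` header, and the function name `build_rate_state_request` are assumptions made for the sketch, so substitute whatever your account actually uses.

```python
import urllib.parse
import urllib.request

# Assumption: placeholder host; replace with your real API base URL.
BASE_URL = "https://api.example.com"


def build_rate_state_request(key_id, token):
    """Build (but do not send) a GET request for a key's live rate state."""
    # Percent-encode the key ID so it is safe to place in the URL path.
    safe_id = urllib.parse.quote(key_id, safe="")
    url = f"{BASE_URL}/api/v1/keys/{safe_id}/rate-state"
    # Assumption: Bearer-token auth; this page does not specify the scheme.
    headers = {
        "Authorization": f"Bearer {token}",
        "Accept": "application/json",
    }
    return urllib.request.Request(url, headers=headers, method="GET")
```

Send the request with `urllib.request.urlopen(req)` and decode the body as JSON. The response fields are not documented on this page, so inspect the payload before relying on any particular field name.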

Rate-limited responses include standard headers so you can pace requests:

Header	Meaning
`Retry-After`	Seconds to wait before retrying (on 429)
`X-RateLimit-Limit`	The ceiling for the current window
`X-RateLimit-Remaining`	Requests left in the window
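
The headers in this table can be read into a small structure before deciding how to pace. This is a sketch under assumptions: `RateInfo` and `parse_rate_headers` are hypothetical names, and the values are assumed to be plain numbers, as the table describes. A missing or non-numeric value (for example a `Retry-After` sent as an HTTP date) is returned as `None` instead of raising.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateInfo:
    limit: Optional[int]          # X-RateLimit-Limit
    remaining: Optional[int]      # X-RateLimit-Remaining
    retry_after: Optional[float]  # Retry-After, in seconds


def _to_number(value, cast):
    """Convert a header value with cast, or return None if absent or invalid."""
    try:
        return cast(value) if value is not None else None
    except (TypeError, ValueError):
        return None


def parse_rate_headers(headers: Mapping[str, str]) -> RateInfo:
    """Extract the rate-limit headers from a response header mapping."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lower = {k.lower(): v for k, v in headers.items()}
    return RateInfo(
        limit=_to_number(lower.get("x-ratelimit-limit"), int),
        remaining=_to_number(lower.get("x-ratelimit-remaining"), int),
        retry_after=_to_number(lower.get("retry-after"), float),
    )
```

A client might slow down when `remaining` gets close to zero, and treat `None` as "no information" instead of "no limit".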

When you receive 429 Too Many Requests, wait the number of seconds given in `Retry-After` before retrying.
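
One common way to handle a 429 is a bounded retry loop that honors `Retry-After`. The sketch below is illustrative only: `call_with_retry` and the `send` callable (which is expected to return a `(status, headers, body)` tuple) are hypothetical, and the exponential-backoff fallback for a missing header is an assumption, not documented CallMissed behavior.

```python
import time


def call_with_retry(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    send is a hypothetical callable returning (status, headers, body).
    The last response is returned even if it is still a 429.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts - 1:
            return status, headers, body
        # Header names are case-insensitive, so normalize before lookup.
        lower = {k.lower(): v for k, v in headers.items()}
        try:
            delay = float(lower.get("retry-after"))
        except (TypeError, ValueError):
            # Assumption: exponential backoff when the header is absent
            # or not a plain number of seconds.
            delay = base_delay * (2 ** attempt)
        sleep(max(delay, 0.0))
```

Capping the number of attempts keeps a persistently throttled client from retrying forever. A caller that still gets 429 after the last attempt should surface the error instead of looping.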

See Error Codes for the full status/code reference.
