You send traffic toDocumentation Index
Fetch the complete documentation index at: https://docs.routing.run/llms.txt
Use this file to discover all available pages before exploring further.
https://api.routing.run and authenticate with one rk_ API key for every model your plan tier allows.
If https://api.routing.run is slow or returning errors, use https://ai.routing.sh as the secondary endpoint hosted by the routing.run team. The same paths and API keys work on both hosts.
routing.run gives you one endpoint, one route/... model namespace, and automatic failover behind each model ID.
Inference matches APIs you already use:
- POST
/v1/chat/completions— recommended default for apps and coding agents - POST
/v1/messages— compatibility path for Anthropic-style requests - POST
/v1/embeddings— OpenAI-compatible embeddings for semantic search and RAG - POST
/v1/rerank— reranking for search results and RAG contexts - GET
/v1/status— public health check (does not require authentication)
route/ prefix. Each model includes failover and a circuit breaker so your integration stays simple.
Why teams use routing.run
- Keep one API key and one base URL across your integration.
- Use stable
route/...model IDs in apps, SDKs, and coding agents. - Move coding agents and apps onto the same default endpoint:
/v1/chat/completions. - Create embeddings at
/v1/embeddingswith Cohere English, Cohere multilingual, and Qwen3 models. - Rerank search results and RAG contexts at
/v1/rerankwith Cohere Rerank v4 Pro. - Use Mistral Large 3, Medium 2505, and Small 2503 for chat and summarization.
- Check public health with
/v1/statusbefore debugging request issues.
Get started
Quickstart
Get your API key and make your first request in under 2 minutes.
API overview
See the integration model, retries, circuit breakers, and the public endpoint surface.
API compatibility
OpenAI compatible
Use this as the default integration path. It is the most reliable endpoint for apps and coding agents.
Anthropic compatible
Use this only when you specifically need Anthropic-style requests. Response compatibility is currently partial.
Embeddings and rerank
Use Cohere and Qwen embedding models plus Cohere Rerank v4 Pro through routing.run.