Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.routing.run/llms.txt

Use this file to discover all available pages before exploring further.

You send traffic to https://api.routing.run and authenticate with one rk_ API key for every model your plan tier allows. If https://api.routing.run is slow or returning errors, use https://ai.routing.sh as the secondary endpoint hosted by the routing.run team. The same paths and API keys work on both hosts. routing.run gives you one endpoint, one route/... model namespace, and automatic failover behind each model ID. Inference matches APIs you already use:
  • POST /v1/chat/completions — recommended default for apps and coding agents
  • POST /v1/messages — compatibility path for Anthropic-style requests
  • POST /v1/embeddings — OpenAI-compatible embeddings for semantic search and RAG
  • POST /v1/rerank — reranking for search results and RAG contexts
  • GET /v1/status — public health check (does not require authentication)
Model IDs always use the route/ prefix. Each model includes failover and a circuit breaker so your integration stays simple.

Why teams use routing.run

  • Keep one API key and one base URL across your integration.
  • Use stable route/... model IDs in apps, SDKs, and coding agents.
  • Move coding agents and apps onto the same default endpoint: /v1/chat/completions.
  • Create embeddings at /v1/embeddings with Cohere English, Cohere multilingual, and Qwen3 models.
  • Rerank search results and RAG contexts at /v1/rerank with Cohere Rerank v4 Pro.
  • Use Mistral Large 3, Medium 2505, and Small 2503 for chat and summarization.
  • Check public health with /v1/status before debugging request issues.

Get started

Quickstart

Get your API key and make your first request in under 2 minutes.

API overview

See the integration model, retries, circuit breakers, and the public endpoint surface.

API compatibility

https://mintcdn.com/routing/Bdepg-ZiTHSkFbP-/images/ai-tools/openai.svg?fit=max&auto=format&n=Bdepg-ZiTHSkFbP-&q=85&s=20abb0f26a0ce48b6bff9705347b8d49

OpenAI compatible

Use this as the default integration path. It is the most reliable endpoint for apps and coding agents.
https://mintcdn.com/routing/X1FNfaLkHxe6r1oe/images/ai-tools/claude-code.svg?fit=max&auto=format&n=X1FNfaLkHxe6r1oe&q=85&s=7b981276e38639a34ef8dee48c587259

Anthropic compatible

Use this only when you specifically need Anthropic-style requests. Response compatibility is currently partial.

Embeddings and rerank

Use Cohere and Qwen embedding models plus Cohere Rerank v4 Pro through routing.run.

Coding agents

Copyable setup prompts: OpenClaw, OpenCode, pi.dev, Kilo Code CLI, Claude Code, Codex CLI. IDE and VS Code integrations show manual custom-provider settings on each page.