Use the same https://api.routing.run base URL and the same authentication headers you use for chat completions.
Authorization: Bearer <access-token>
or:
X-API-Key: <routing-api-key>
Do not call upstream providers directly from frontend code. Send only routing.run route/... model IDs. routing.run handles provider selection, secrets, auth, plan access, and usage tracking.
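
The two authentication schemes above can be wrapped in one small helper so that calling code does not repeat header names. This is a minimal sketch: the `RoutingCredentials` type and `buildAuthHeaders` function are hypothetical names introduced for illustration and are not part of the routing.run API.

```typescript
// Hypothetical credential shape: exactly one of the two schemes described above.
type RoutingCredentials =
  | { kind: 'bearer'; accessToken: string }
  | { kind: 'apiKey'; apiKey: string }

// Build JSON request headers for whichever scheme the caller has.
function buildAuthHeaders(credentials: RoutingCredentials): Record<string, string> {
  const headers: Record<string, string> = { 'Content-Type': 'application/json' }
  if (credentials.kind === 'bearer') {
    // Authorization: Bearer <access-token>
    headers['Authorization'] = `Bearer ${credentials.accessToken}`
  } else {
    // X-API-Key: <routing-api-key>
    headers['X-API-Key'] = credentials.apiKey
  }
  return headers
}
```

The returned object can be passed directly as the `headers` option of `fetch`.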

Embeddings

POST /v1/embeddings is OpenAI-compatible.
curl -sS -X POST https://api.routing.run/v1/embeddings \
  -H "X-API-Key: ${ROUTING_RUN_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "route/cohere-embed-v3-english-3",
    "input": "The quick brown fox jumps over the lazy dog"
  }'

Batch request

{
  "model": "route/cohere-embed-v3-multilingual-3",
  "input": [
    "Hello world",
    "Bonjour le monde",
    "Hola mundo"
  ]
}

Request body

The examples above send two fields: model, a routing.run route/... model ID, and input, either a single string or an array of strings for a batch request.

Response

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [0.0123, -0.0456],
      "index": 0
    }
  ],
  "model": "route/cohere-embed-v3-english-3",
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  },
  "provider": "routing-inference"
}
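
For batch requests it helps to match each returned vector with the input that produced it. The sketch below assumes, as in the OpenAI embeddings format this endpoint is compatible with, that `index` is the position of the input in the request's `input` array. The interface names and `pairInputsWithEmbeddings` are hypothetical, and the types cover only the fields shown in the example response.

```typescript
// Types inferred from the example response above; other fields may exist.
interface EmbeddingItem {
  object: string
  embedding: number[]
  index: number
}

interface EmbeddingResponse {
  object: string
  data: EmbeddingItem[]
  model: string
  usage: { prompt_tokens: number; total_tokens: number }
  provider?: string
}

// Pair each input with its vector using `index`, not the order of `data`.
function pairInputsWithEmbeddings(
  inputs: string[],
  response: EmbeddingResponse,
): { text: string; embedding: number[] }[] {
  return [...response.data]
    .sort((a, b) => a.index - b.index)
    .map((item) => {
      const text = inputs[item.index]
      if (text === undefined) {
        throw new Error(`Response index ${item.index} has no matching input`)
      }
      return { text, embedding: item.embedding }
    })
}
```

Sorting by `index` keeps the result in input order even if the items in `data` arrive in a different order.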

Available embedding models

| Model | Use case |
| --- | --- |
| route/cohere-embed-v3-english-3 | English semantic search and RAG |
| route/cohere-embed-v3-multilingual-3 | Multilingual semantic search and RAG |
| route/qwen3-embedding-8b | General-purpose embeddings |

Embedding models are available on Premium, Max, and Ultra plan tiers. Max and Ultra have allowed_models: all, so they inherit these automatically.

TypeScript example

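// Base URL and key placeholder, as described in the authentication notes above.
const API_BASE_URL = 'https://api.routing.run'
const ROUTING_API_KEY = '<routing-api-key>'

// Request an embedding for a single input string.
async function createEmbedding(input: string) {
  const response = await fetch(`${API_BASE_URL}/v1/embeddings`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'X-API-Key': ROUTING_API_KEY,
    },
    body: JSON.stringify({
      model: 'route/cohere-embed-v3-english-3',
      input,
    }),
  })

  if (!response.ok) {
    // Include the HTTP status so failures are easier to diagnose.
    throw new Error(`Embedding request failed with status ${response.status}`)
  }

  // Resolves to the parsed JSON response body.
  return response.json()
}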
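
Once vectors are available, semantic search usually comes down to comparing a query vector against stored document vectors. The following is a rough sketch of that comparison using cosine similarity. It is a general technique, not something routing.run prescribes, and `cosineSimilarity`, `rankBySimilarity`, and `EmbeddedText` are hypothetical names.

```typescript
// A text paired with its embedding vector (hypothetical shape for this sketch).
interface EmbeddedText {
  text: string
  embedding: number[]
}

// Cosine similarity: dot product divided by the product of the vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length')
  }
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  // A zero vector has no direction, so treat its similarity as 0.
  if (normA === 0 || normB === 0) return 0
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Score every candidate against the query and sort from most to least similar.
function rankBySimilarity(
  query: number[],
  candidates: EmbeddedText[],
): { text: string; score: number }[] {
  return candidates
    .map((c) => ({ text: c.text, score: cosineSimilarity(query, c.embedding) }))
    .sort((a, b) => b.score - a.score)
}
```

The query and the candidates should be embedded with the same model, since vectors from different models are not comparable.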

Rerank

POST /v1/rerank ranks a list of documents against a query.
curl -sS -X POST https://api.routing.run/v1/rerank \
  -H "X-API-Key: ${ROUTING_RUN_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "route/cohere-rerank-v4.0-pro",
    "query": "best database for vector search",
    "documents": [
      "PostgreSQL supports pgvector for vector search.",
      "Redis is often used as an in-memory cache.",
      "SQLite is a small embedded relational database."
    ],
    "top_n": 2
  }'

Request body

The example above sends four fields: model, a routing.run route/... model ID; query, the search text; documents, an array of strings to rank; and top_n, the number of results to return.

Response

The response shape is provider-compatible and includes ranked results.
{
  "results": [
    {
      "index": 0,
      "relevance_score": 0.98
    },
    {
      "index": 1,
      "relevance_score": 0.42
    }
  ],
  "model": "route/cohere-rerank-v4.0-pro",
  "provider": "routing-inference"
}
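
The results carry only an `index` and a score, so the caller has to look up the document text itself. The sketch below assumes that `index` is the position of the document in the request's `documents` array, which matches the example (the pgvector sentence at index 0 scores highest). The interface names and `attachDocuments` are hypothetical, and the types cover only the fields shown above.

```typescript
// Types inferred from the example response above; other fields may exist.
interface RerankResult {
  index: number
  relevance_score: number
}

interface RerankResponse {
  results: RerankResult[]
  model: string
  provider?: string
}

// Resolve each result's `index` back to the document text that was sent.
function attachDocuments(
  documents: string[],
  response: RerankResponse,
): { document: string; index: number; score: number }[] {
  return response.results.map((result) => {
    const document = documents[result.index]
    if (document === undefined) {
      throw new Error(`Result index ${result.index} has no matching document`)
    }
    return { document, index: result.index, score: result.relevance_score }
  })
}
```

The function preserves the order of `results`, so the first element is whichever document the response listed first.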

Available rerank models

| Model | Use case |
| --- | --- |
| route/cohere-rerank-v4.0-pro | Reranking search results and RAG contexts |

Rerank is available on Premium, Max, and Ultra plan tiers.

TypeScript example

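// Base URL and key placeholder, as described in the authentication notes above.
const API_BASE_URL = 'https://api.routing.run'
const ROUTING_API_KEY = '<routing-api-key>'

// Rank documents against a query and return the top five results.
async function rerank(query: string, documents: string[]) {
  const response = await fetch(`${API_BASE_URL}/v1/rerank`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'X-API-Key': ROUTING_API_KEY,
    },
    body: JSON.stringify({
      model: 'route/cohere-rerank-v4.0-pro',
      query,
      documents,
      top_n: 5,
    }),
  })

  if (!response.ok) {
    // Include the HTTP status so failures are easier to diagnose.
    throw new Error(`Rerank request failed with status ${response.status}`)
  }

  // Resolves to the parsed JSON response body.
  return response.json()
}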