Embeddings and rerank - routing.run docs

Use the same https://api.routing.run base URL and the same authentication headers you use for chat completions.

Authorization: Bearer <access-token>

or:

X-API-Key: <routing-api-key>

Do not call upstream providers directly from frontend code. Send only routing.run route/... model IDs. routing.run handles provider selection, secrets, auth, plan access, and usage tracking.

Embeddings

POST /v1/embeddings is OpenAI-compatible.

curl -sS -X POST https://api.routing.run/v1/embeddings \
  -H "X-API-Key: ${ROUTING_RUN_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "route/cohere-embed-v3-english-3",
    "input": "The quick brown fox jumps over the lazy dog"
  }'

Batch request

{
  "model": "route/cohere-embed-v3-multilingual-3",
  "input": [
    "Hello world",
    "Bonjour le monde",
    "Hola mundo"
  ]
}

Request body

Response

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [0.0123, -0.0456],
      "index": 0
    }
  ],
  "model": "route/cohere-embed-v3-english-3",
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  },
  "provider": "routing-inference"
}

Available embedding models

Model	Use case
`route/cohere-embed-v3-english-3`	English semantic search and RAG
`route/cohere-embed-v3-multilingual-3`	Multilingual semantic search and RAG
`route/qwen3-embedding-8b`	General-purpose embeddings

Embedding models are available on Premium, Max, and Ultra plan tiers. Max and Ultra have allowed_models: all, so they inherit these automatically.

TypeScript example

async function createEmbedding(input: string) {
  const response = await fetch(`${API_BASE_URL}/v1/embeddings`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'X-API-Key': ROUTING_API_KEY,
    },
    body: JSON.stringify({
      model: 'route/cohere-embed-v3-english-3',
      input,
    }),
  })

  if (!response.ok) {
    throw new Error('Embedding request failed')
  }

  return response.json()
}

Rerank

POST /v1/rerank ranks a list of documents against a query.

curl -sS -X POST https://api.routing.run/v1/rerank \
  -H "X-API-Key: ${ROUTING_RUN_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "route/cohere-rerank-v4.0-pro",
    "query": "best database for vector search",
    "documents": [
      "PostgreSQL supports pgvector for vector search.",
      "Redis is often used as an in-memory cache.",
      "SQLite is a small embedded relational database."
    ],
    "top_n": 2
  }'

Request body

Response

The response shape is provider-compatible and includes ranked results.

{
  "results": [
    {
      "index": 0,
      "relevance_score": 0.98
    },
    {
      "index": 1,
      "relevance_score": 0.42
    }
  ],
  "model": "route/cohere-rerank-v4.0-pro",
  "provider": "routing-inference"
}

Available rerank models

Model	Use case
`route/cohere-rerank-v4.0-pro`	Reranking search results and RAG contexts

Rerank is available on Premium, Max, and Ultra plan tiers.

TypeScript example

async function rerank(query: string, documents: string[]) {
  const response = await fetch(`${API_BASE_URL}/v1/rerank`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'X-API-Key': ROUTING_API_KEY,
    },
    body: JSON.stringify({
      model: 'route/cohere-rerank-v4.0-pro',
      query,
      documents,
      top_n: 5,
    }),
  })

  if (!response.ok) {
    throw new Error('Rerank request failed')
  }

  return response.json()
}

Documentation Index

​Embeddings

​Batch request

​Request body

​Response

​Available embedding models

​TypeScript example

​Rerank

​Request body

​Response

​Available rerank models

​TypeScript example

Embeddings

Batch request

Request body

Response

Available embedding models

TypeScript example

Rerank

Request body

Response

Available rerank models

TypeScript example