Use the same https://api.routing.run base URL and the same authentication headers you use for chat completions.
Authorization: Bearer <access-token>
or:
X-API-Key: <routing-api-key>
Do not call upstream providers directly from frontend code. Send only routing.run route/... model IDs. routing.run handles provider selection, secrets, auth, plan access, and usage tracking.
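As a minimal sketch of the two authentication options above, the helper below builds the request headers for either one. The names `AuthConfig` and `buildAuthHeaders` are hypothetical and are not part of the routing.run API; only the header names and the `Bearer` prefix come from this page.

```typescript
// Hypothetical helper (not part of the routing.run API): picks one of the
// two documented authentication headers.
type AuthConfig =
  | { kind: 'bearer'; accessToken: string }
  | { kind: 'apiKey'; apiKey: string }

function buildAuthHeaders(auth: AuthConfig): Record<string, string> {
  // Every request on this page sends a JSON body.
  const headers: Record<string, string> = { 'Content-Type': 'application/json' }
  if (auth.kind === 'bearer') {
    headers['Authorization'] = `Bearer ${auth.accessToken}`
  } else {
    headers['X-API-Key'] = auth.apiKey
  }
  return headers
}
```

Keeping the choice in one place means calling code never sets both headers at once, and the credential stays on the server side.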
Embeddings
POST /v1/embeddings is OpenAI-compatible.
curl -sS -X POST https://api.routing.run/v1/embeddings \
  -H "X-API-Key: ${ROUTING_RUN_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "route/cohere-embed-v3-english-3",
    "input": "The quick brown fox jumps over the lazy dog"
  }'
Batch request
{
  "model": "route/cohere-embed-v3-multilingual-3",
  "input": [
    "Hello world",
    "Bonjour le monde",
    "Hola mundo"
  ]
}
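For larger inputs, the batch form above can be produced programmatically. The sketch below splits a list of strings into several batch request bodies. This page does not state a per-request input limit, so `batchSize` is an assumed, caller-chosen parameter, and `EmbeddingRequest` and `buildBatchRequests` are hypothetical names.

```typescript
// Request body shape taken from the examples on this page.
interface EmbeddingRequest {
  model: string
  input: string | string[]
}

// Hypothetical helper: splits inputs into consecutive batches of at most
// batchSize items. batchSize is an assumption, not a documented limit.
function buildBatchRequests(
  model: string,
  inputs: string[],
  batchSize: number,
): EmbeddingRequest[] {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new Error('batchSize must be a positive integer')
  }
  const requests: EmbeddingRequest[] = []
  for (let start = 0; start < inputs.length; start += batchSize) {
    requests.push({ model, input: inputs.slice(start, start + batchSize) })
  }
  return requests
}
```

Each returned object can be serialized with `JSON.stringify` and sent as the body of a separate `POST /v1/embeddings` call.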
Request body
Response
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [0.0123, -0.0456],
      "index": 0
    }
  ],
  "model": "route/cohere-embed-v3-english-3",
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  },
  "provider": "routing-inference"
}
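The response above can be typed and consumed roughly as follows. This is a sketch under two assumptions: the interfaces are inferred only from the example response (real responses may carry more fields), and `index` is taken to be the position of the corresponding input, as in the OpenAI embeddings format. The names `vectorsInInputOrder` and `cosineSimilarity` are illustrative, not part of the API.

```typescript
// Types inferred from the example response on this page (an assumption).
interface EmbeddingItem {
  object: string
  embedding: number[]
  index: number
}

interface EmbeddingResponse {
  object: string
  data: EmbeddingItem[]
  model: string
  usage: { prompt_tokens: number; total_tokens: number }
  provider?: string
}

// Returns the vectors sorted by `index`, so that position i matches input i.
function vectorsInInputOrder(response: EmbeddingResponse): number[][] {
  return [...response.data]
    .sort((a, b) => a.index - b.index)
    .map((item) => item.embedding)
}

// Cosine similarity between two vectors of equal length.
// Returns 0 when either vector has zero magnitude.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length')
  }
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  if (normA === 0 || normB === 0) return 0
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}
```

For semantic search, a common pattern is to embed the query and the candidate texts, then rank candidates by `cosineSimilarity` against the query vector.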
Available embedding models
| Model | Use case |
|---|---|
| route/cohere-embed-v3-english-3 | English semantic search and RAG |
| route/cohere-embed-v3-multilingual-3 | Multilingual semantic search and RAG |
| route/qwen3-embedding-8b | General-purpose embeddings |
Embedding models are available on Premium, Max, and Ultra plan tiers. Max and Ultra have allowed_models: all, so they inherit these automatically.
TypeScript example
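// Base URL from this page. The key is read from the same environment variable
// as the curl examples, which assumes a server-side Node.js runtime.
const API_BASE_URL = 'https://api.routing.run'
const ROUTING_API_KEY = process.env.ROUTING_RUN_API_KEY ?? ''

// Accepts a single string or an array of strings (batch request).
async function createEmbedding(input: string | string[]) {
  const response = await fetch(`${API_BASE_URL}/v1/embeddings`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'X-API-Key': ROUTING_API_KEY,
    },
    body: JSON.stringify({
      model: 'route/cohere-embed-v3-english-3',
      input,
    }),
  })
  if (!response.ok) {
    // Include the HTTP status so failures are easier to diagnose.
    throw new Error(`Embedding request failed with status ${response.status}`)
  }
  return response.json()
}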
Rerank
POST /v1/rerank ranks a list of documents against a query.
curl -sS -X POST https://api.routing.run/v1/rerank \
  -H "X-API-Key: ${ROUTING_RUN_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "route/cohere-rerank-v4.0-pro",
    "query": "best database for vector search",
    "documents": [
      "PostgreSQL supports pgvector for vector search.",
      "Redis is often used as an in-memory cache.",
      "SQLite is a small embedded relational database."
    ],
    "top_n": 2
  }'
Request body
Response
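The response shape is provider-compatible. Each entry in results carries an index and a relevance_score; in the example below, index refers to a position in the request's documents array, and entries are listed from most to least relevant.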
{
  "results": [
    {
      "index": 0,
      "relevance_score": 0.98
    },
    {
      "index": 1,
      "relevance_score": 0.42
    }
  ],
  "model": "route/cohere-rerank-v4.0-pro",
  "provider": "routing-inference"
}
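Because the example response returns positions rather than document text, callers usually need to join the results back to their own documents. The sketch below does that, assuming `index` is a zero-based position in the request's `documents` array, which matches the example but is not spelled out on this page. The types are inferred from the example response, and `attachDocuments` is a hypothetical name.

```typescript
// Types inferred from the example response on this page (an assumption).
interface RerankResult {
  index: number
  relevance_score: number
}

interface RerankResponse {
  results: RerankResult[]
  model: string
  provider?: string
}

interface RankedDocument {
  document: string
  score: number
}

// Hypothetical helper: pairs each result with its source document and
// returns them from highest to lowest score. Results whose index falls
// outside the documents array are skipped rather than returned as undefined.
function attachDocuments(
  documents: string[],
  response: RerankResponse,
): RankedDocument[] {
  return response.results
    .filter((r) => Number.isInteger(r.index) && r.index >= 0 && r.index < documents.length)
    .map((r) => ({ document: documents[r.index], score: r.relevance_score }))
    .sort((a, b) => b.score - a.score)
}
```

In a RAG pipeline, the returned list can be passed to the prompt builder in order, so the most relevant context appears first.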
Available rerank models
| Model | Use case |
|---|---|
| route/cohere-rerank-v4.0-pro | Reranking search results and RAG contexts |
Rerank is available on Premium, Max, and Ultra plan tiers.
TypeScript example
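// Base URL from this page. The key is read from the same environment variable
// as the curl examples, which assumes a server-side Node.js runtime.
const API_BASE_URL = 'https://api.routing.run'
const ROUTING_API_KEY = process.env.ROUTING_RUN_API_KEY ?? ''

async function rerank(query: string, documents: string[]) {
  const response = await fetch(`${API_BASE_URL}/v1/rerank`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'X-API-Key': ROUTING_API_KEY,
    },
    body: JSON.stringify({
      model: 'route/cohere-rerank-v4.0-pro',
      query,
      documents,
      // Return at most the five highest-ranked documents.
      top_n: 5,
    }),
  })
  if (!response.ok) {
    // Include the HTTP status so failures are easier to diagnose.
    throw new Error(`Rerank request failed with status ${response.status}`)
  }
  return response.json()
}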