POST /v1/messages follows the shape of the Anthropic Messages API. To use it, point the Anthropic SDK's base_url at routing.run.
Base URL
https://api.routing.run/v1
Quick setup
import os

import anthropic

# Point the Anthropic SDK at routing.run instead of the default Anthropic host.
client = anthropic.Anthropic(
    api_key=os.environ["ROUTING_RUN_API_KEY"],
    base_url="https://api.routing.run/v1",
)

message = client.messages.create(
    model="route/deepseek-v3.2",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Summarize risks in this dependency diff: …"}
    ],
)
print(message.content[0].text)
How it works
Anthropic-shaped requests are converted internally, run through the same routing chain as chat completions, then converted back to Anthropic-shaped responses.
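The round trip described above can be sketched roughly as follows. This is an illustration only, not routing.run's actual implementation: the function names `anthropic_to_chat` and `chat_to_anthropic`, the chat-completions field names, and the `FINISH_TO_STOP` mapping are all assumptions made for the example.

```python
# Hypothetical mapping from chat-completions finish reasons to Anthropic stop reasons.
FINISH_TO_STOP = {"stop": "end_turn", "length": "max_tokens"}


def anthropic_to_chat(request: dict) -> dict:
    """Convert an Anthropic-shaped request into a chat-completions-shaped one (assumed shape)."""
    messages = list(request["messages"])  # copy so the caller's list is not mutated
    # Anthropic carries the system prompt as a top-level field; chat
    # completions expects it as the first message.
    if request.get("system"):
        messages.insert(0, {"role": "system", "content": request["system"]})
    return {
        "model": request["model"],
        "max_tokens": request["max_tokens"],
        "messages": messages,
    }


def chat_to_anthropic(response: dict) -> dict:
    """Convert a chat-completions-shaped response back into an Anthropic-shaped one."""
    choice = response["choices"][0]
    usage = response.get("usage", {})
    return {
        "type": "message",
        "role": "assistant",
        "content": choice["message"]["content"],
        "model": response["model"],
        # Unknown finish reasons fall back to end_turn in this sketch.
        "stop_reason": FINISH_TO_STOP.get(choice.get("finish_reason"), "end_turn"),
        "stop_sequence": None,
        "usage": {
            "input_tokens": usage.get("prompt_tokens", 0),
            "output_tokens": usage.get("completion_tokens", 0),
        },
    }
```

Keeping both conversions as pure functions over dictionaries means the routing chain in the middle never needs to know which API shape the caller used.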
Request
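For readers not using the SDK, the request from Quick setup might look like this on the wire. This is a sketch under assumptions: the helper name `build_messages_request` is hypothetical, and the `x-api-key` and `anthropic-version` headers are what the Anthropic SDK sends by default, not something this page confirms routing.run requires. Check Authentication for the accepted header.

```python
import json
import urllib.request


def build_messages_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/messages request."""
    body = {
        "model": "route/deepseek-v3.2",
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        "https://api.routing.run/v1/messages",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "content-type": "application/json",
            # Assumption: same auth and version headers the Anthropic SDK sends.
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be inspected without network access.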
Response
{
  "id": "msg_1744701234567",
  "type": "message",
  "role": "assistant",
  "content": "- Supply chain: pinned lodash has a known advisory …",
  "model": "route/deepseek-v3.2",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 220,
    "output_tokens": 180
  },
  "credits_charged": 0.003
}
In the wire JSON, content is a string containing the assistant text. credits_charged is a routing.run extension: a float giving the credits deducted for that call. The upstream internal id is not included in this JSON response; it is logged server-side.
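Because content arrives as a plain string here, while the Anthropic Messages API normally returns a list of content blocks, a client reading the raw JSON may want to accept both. The helpers below are a minimal sketch; `extract_text` and `credits_charged` are hypothetical names, and treating a missing credits_charged as 0.0 is an assumption.

```python
def extract_text(message: dict) -> str:
    """Return the assistant text whether content is a string or a list of blocks."""
    content = message["content"]
    if isinstance(content, str):
        return content
    # Anthropic-style content blocks: concatenate the text blocks in order.
    return "".join(
        block.get("text", "") for block in content if block.get("type") == "text"
    )


def credits_charged(message: dict) -> float:
    """Read the routing.run credits_charged extension, defaulting to 0.0 if absent."""
    return float(message.get("credits_charged", 0.0))
```

Applied to the example response above, `extract_text` returns the content string unchanged and `credits_charged` returns 0.003.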
Stop reasons
| Value | Meaning |
|---|---|
| end_turn | The model finished generating naturally |
| max_tokens | The response hit the max_tokens limit |
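One way a client might act on these two values is to flag truncated output so the caller can retry with a larger limit. This is a sketch; `was_truncated` and `next_max_tokens` are hypothetical helpers, and the doubling policy and 8192 ceiling are arbitrary choices for illustration, not routing.run limits.

```python
def was_truncated(message: dict) -> bool:
    """True when generation stopped because it hit the max_tokens limit."""
    return message.get("stop_reason") == "max_tokens"


def next_max_tokens(message: dict, current: int, ceiling: int = 8192) -> int:
    """Suggest a max_tokens value for a retry: double on truncation, capped at ceiling."""
    if was_truncated(message):
        return min(current * 2, ceiling)
    return current
```

A response with stop_reason end_turn leaves the limit unchanged, so the helper is safe to call on every response.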
Error handling
Errors are returned as plain text with an X-Error-Code header; see Authentication for how to handle them. This endpoint uses the same middleware stack as chat completions.
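Assuming an error arrives as a plain-text body accompanied by an X-Error-Code header, a client working with raw HTTP responses could surface both in one exception. This is a sketch only: `RoutingRunError` and `raise_for_error` are hypothetical names, and treating every status of 400 or above as an error is an assumption.

```python
class RoutingRunError(Exception):
    """Carries the HTTP status, the X-Error-Code value, and the plain-text body."""

    def __init__(self, status: int, code, detail: str):
        super().__init__(f"HTTP {status} [{code}]: {detail}")
        self.status = status
        self.code = code
        self.detail = detail


def raise_for_error(status: int, headers: dict, body: bytes) -> None:
    """Raise RoutingRunError for error statuses; return None otherwise."""
    if status < 400:
        return
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    code = lowered.get("x-error-code")  # None if the header is missing
    detail = body.decode("utf-8", errors="replace").strip()
    raise RoutingRunError(status, code, detail)
```

Because the body is not JSON, the sketch decodes it as text instead of parsing it; callers can branch on `err.code` without string-matching the message.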