API Usage
List available models
Get a list of all configured targets in the OpenAI models format:
curl http://localhost:3000/v1/models
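A minimal sketch of reading that list from Python, assuming the response follows the usual OpenAI list shape (an object with a data array whose entries carry an id). The helper names are illustrative, not part of the gateway, and the sample payload is hand-written rather than captured from a real run.

```python
import json
import urllib.request

GATEWAY_URL = "http://localhost:3000"  # address used in the curl examples


def extract_model_ids(payload: dict) -> list[str]:
    """Return the id of every entry in an OpenAI-style model list."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_models(base_url: str = GATEWAY_URL) -> list[str]:
    """Fetch /v1/models from a running gateway and return the target names."""
    with urllib.request.urlopen(f"{base_url}/v1/models") as response:
        return extract_model_ids(json.load(response))


# Offline illustration with a hand-written payload (no gateway needed).
sample = {
    "object": "list",
    "data": [
        {"id": "gpt-4", "object": "model"},
        {"id": "claude-3", "object": "model"},
    ],
}
print(extract_model_ids(sample))
```

Calling list_models() requires the gateway to be running; extract_model_ids works on any already-decoded response.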
Sending requests
Send requests to the gateway using the standard OpenAI API format:
curl -X POST http://localhost:3000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4",
"messages": [{"role": "user", "content": "Hello!"}]
}'
The model field determines which target receives the request.
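The same request can be built from Python with only the standard library. This is a sketch under the assumptions of the curl example (gateway on localhost:3000, a target named gpt-4); build_chat_request is a hypothetical helper name, not a gateway API.

```python
import json
import urllib.request

GATEWAY_URL = "http://localhost:3000"  # address used in the curl examples


def build_chat_request(
    model: str, content: str, base_url: str = GATEWAY_URL
) -> urllib.request.Request:
    """Build a chat completion request; the model field selects the target."""
    body = {"model": model, "messages": [{"role": "user", "content": content}]}
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request("gpt-4", "Hello!")

# Sending requires a running gateway:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it makes it easy to inspect the URL, headers, and body before anything goes over the wire.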
Model override header
Override the target using the model-override header. This routes the request to a different target regardless of the model field in the body:
curl -X POST http://localhost:3000/v1/chat/completions \
-H "model-override: claude-3" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4",
"messages": [{"role": "user", "content": "Hello!"}]
}'
The same header routes requests that have no body and therefore no model field. For example, to get the embeddings usage for your organization:
curl -X GET http://localhost:3000/v1/organization/usage/embeddings \
-H "model-override: claude-3"
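The routing rule described in this section can be summarized as: the model-override header wins when present, otherwise the model field in the body is used. The function below is only an illustrative model of that documented behavior, not the gateway's actual implementation; resolve_target is a hypothetical name, and how the gateway treats a request with neither value is not covered here.

```python
from typing import Mapping, Optional

OVERRIDE_HEADER = "model-override"


def resolve_target(headers: Mapping[str, str], body: Optional[dict]) -> Optional[str]:
    """Pick a target name: the override header first, then the body's model field."""
    # HTTP header names are case-insensitive, so normalise before looking up.
    normalised = {name.lower(): value for name, value in headers.items()}
    override = normalised.get(OVERRIDE_HEADER)
    if override:
        return override
    if body is not None:
        return body.get("model")
    return None  # neither a header nor a body model was supplied


# Header present: the body's model field is ignored.
print(resolve_target({"model-override": "claude-3"}, {"model": "gpt-4"}))
# No header: the body's model field decides.
print(resolve_target({}, {"model": "gpt-4"}))
# No body (for example a GET request): only the header can name the target.
print(resolve_target({"model-override": "claude-3"}, None))
```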
Metrics
When the --metrics flag is enabled (the default), Prometheus metrics are exposed on a separate port:
curl http://localhost:9090/metrics
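Prometheus endpoints normally serve a plain-text format with one sample per line, so the output can be inspected without a Prometheus server. The parser below is a deliberately small sketch: it assumes samples carry no trailing timestamp, and the metric name in the sample text is made up for illustration, not taken from the gateway.

```python
import urllib.request

METRICS_URL = "http://localhost:9090/metrics"  # address used in the curl example


def parse_metrics(text: str) -> dict[str, float]:
    """Map each sample (metric name plus labels) to its value."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and HELP/TYPE comments
            continue
        # Assumption: the value is the last space-separated field (no timestamp).
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


def fetch_metrics(url: str = METRICS_URL) -> dict[str, float]:
    """Scrape a running gateway's metrics endpoint and parse the result."""
    with urllib.request.urlopen(url) as response:
        return parse_metrics(response.read().decode("utf-8"))


# Offline illustration; "example_requests_total" is a hypothetical metric name.
sample_text = """\
# HELP example_requests_total Hypothetical request counter.
# TYPE example_requests_total counter
example_requests_total{target="gpt-4"} 12
example_requests_total{target="claude-3"} 3
"""
print(parse_metrics(sample_text))
```

For anything beyond a quick check, a dedicated Prometheus client library is the safer choice, since it handles timestamps and escaping that this sketch ignores.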
See Command Line Options for metrics configuration flags.