Quickstart
This guide walks through a complete batch workflow: prepare a file, submit it, and get results.
1. Log In
dw login
2. See Available Models
dw models list
3. Create a Batch File
A batch file is a JSONL file in which each line is one JSON-encoded API request:
{"custom_id": "q1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "Qwen/Qwen3-VL-30B-A3B-Instruct-FP8", "messages": [{"role": "user", "content": "What is batch inference?"}], "max_tokens": 256}}
{"custom_id": "q2", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "Qwen/Qwen3-VL-30B-A3B-Instruct-FP8", "messages": [{"role": "user", "content": "Explain transformers in 2 sentences."}], "max_tokens": 256}}
Save this as batch.jsonl.
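For more than a handful of requests, writing the file by hand becomes error-prone. The sketch below generates the same two requests from a list of prompts. The helper name build_request and the q1, q2 ID scheme are illustrative choices, not part of dw.

```python
import json

MODEL = "Qwen/Qwen3-VL-30B-A3B-Instruct-FP8"


def build_request(custom_id, prompt, max_tokens=256):
    """Return one batch request in the shape shown above (hypothetical helper)."""
    return {
        "custom_id": custom_id,
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": MODEL,
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": max_tokens,
        },
    }


prompts = [
    "What is batch inference?",
    "Explain transformers in 2 sentences.",
]

# One JSON object per line; json.dumps takes care of quoting and escaping.
with open("batch.jsonl", "w", encoding="utf-8") as f:
    for i, prompt in enumerate(prompts, start=1):
        f.write(json.dumps(build_request(f"q{i}", prompt)) + "\n")
```

Serializing with json.dumps rather than string formatting keeps prompts that contain quotes or newlines on a single valid line.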
4. Validate and Inspect
dw files validate batch.jsonl
dw files stats batch.jsonl
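This guide does not list the checks that dw files validate performs. As a rough illustration of what a JSONL sanity check can look like, here is an independent Python sketch. It assumes that the four top-level keys from the example requests are required and that custom_id values should be unique; both are assumptions, and check_batch_lines is a hypothetical helper, not a dw function.

```python
import json

# Assumption: the top-level keys used in the example requests are all required.
REQUIRED_KEYS = {"custom_id", "method", "url", "body"}


def check_batch_lines(lines):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    seen_ids = set()
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            problems.append(f"line {number}: blank line")
            continue
        try:
            request = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"line {number}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(request, dict):
            problems.append(f"line {number}: not a JSON object")
            continue
        missing = REQUIRED_KEYS - request.keys()
        if missing:
            problems.append(f"line {number}: missing {sorted(missing)}")
        custom_id = request.get("custom_id")
        if custom_id is not None:
            # Assumption: IDs must be unique so results can be matched to requests.
            if custom_id in seen_ids:
                problems.append(f"line {number}: duplicate custom_id {custom_id!r}")
            seen_ids.add(custom_id)
    return problems


# Deliberately broken sample lines to exercise each check.
sample = [
    '{"custom_id": "q1", "method": "POST", "url": "/v1/chat/completions", "body": {}}',
    '{"custom_id": "q1", "method": "POST", "url": "/v1/chat/completions"}',
    "not json",
]
for problem in check_batch_lines(sample):
    print(problem)
```

To check a real file, pass an open file object instead of the sample list, for example check_batch_lines(open("batch.jsonl", encoding="utf-8")).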
5. Submit and Stream Results
The fastest path from file to results:
dw stream batch.jsonl > results.jsonl
This uploads the file, creates a batch, watches progress, and writes results to stdout as they complete. The redirect in the command above saves them to results.jsonl.
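Because results are written as they complete, their order may differ from the order of requests in the input file. The Python sketch below matches results back to requests by ID. It assumes each output line is a JSON object that carries the custom_id of its request. This guide does not specify the output schema, so treat that as an assumption and check a real results.jsonl first. The helper name index_by_custom_id is hypothetical.

```python
import json


def index_by_custom_id(lines):
    """Map custom_id -> parsed result record, skipping blank lines.

    Assumes every non-blank line is a JSON object with a "custom_id" key.
    """
    results = {}
    for line in lines:
        if line.strip():
            record = json.loads(line)
            results[record["custom_id"]] = record
    return results


# Placeholder lines standing in for results.jsonl; the "payload" key is made up
# for illustration and is not the real output schema.
sample = [
    '{"custom_id": "q2", "payload": "..."}',
    '{"custom_id": "q1", "payload": "..."}',
]
by_id = index_by_custom_id(sample)
print(sorted(by_id))
```

With a real file, pass the open file object, for example index_by_custom_id(open("results.jsonl", encoding="utf-8")).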
6. Check Cost
For a single input file, dw stream prints Batch: <id> to stderr when the batch is created. When streaming a directory of multiple JSONL files, batch IDs are printed after streaming completes. Use a printed batch ID to see the cost breakdown:
dw batches analytics <batch-id>
Alternative: Step-by-Step
If you prefer manual control over each step:
# Upload
dw files upload batch.jsonl
# Create batch from the uploaded file
dw batches create --file <file-id>
# Watch progress
dw batches watch <batch-id>
# Download results
dw batches results <batch-id> -o results.jsonl
Real-Time Inference
For one-off requests without a batch file:
dw realtime Qwen/Qwen3-VL-30B-A3B-Instruct-FP8 "What is batch inference?"
Next Steps
- Batch Processing — full batch workflow details
- Local File Operations — validate, prepare, merge, split JSONL
- Project System — scaffold and run multi-step projects
- Examples — real-world use-case examples