| Model | TTFT p50 | TTFT p95 | Latency | Tok/s | Errors | N |
|---|---|---|---|---|---|---|
| anthropic/claude-opus-4.6 | 1574ms | 2055ms | 1895ms | 70.4 | 5 | 60 |
| anthropic/claude-opus-4.7 | 1576ms | 11789ms | 1634ms | 113.0 | 5 | 60 |
| anthropic/claude-sonnet-4.6 | 1822ms | 4493ms | 2290ms | 45.2 | 5 | 60 |
| deepseek/deepseek-v3.2 | 1737ms | 4685ms | 2411ms | 29.7 | 0 | 65 |
| deepseek/deepseek-v4-flash | 1246ms | 4778ms | 2376ms | 73.9 | 5 | 60 |
| deepseek/deepseek-v4-pro | 0ms | 4171ms | 2331ms | 10.3 | 0 | 65 |
| google/gemini-2.0-flash-001 | 428ms | 855ms | 480ms | 217.0 | 8 | 57 |
| google/gemini-2.5-flash | 428ms | 875ms | 472ms | 224.9 | 5 | 60 |
| google/gemini-2.5-flash-lite | 314ms | 490ms | 372ms | 195.3 | 5 | 60 |
| openai/gpt-4o-mini | 1127ms | 1190ms | 1225ms | 102.5 | 0 | 3 |
| openai/gpt-5.4 | 897ms | 1375ms | 1140ms | 48.0 | 5 | 57 |
| openai/gpt-5.5 | 1215ms | 3565ms | 1568ms | 44.3 | 5 | 60 |
| openai/gpt-oss-120b | 0ms | 1564ms | 1080ms | 20.9 | 5 | 60 |
| x-ai/grok-4-fast | 2668ms | 10318ms | 2790ms | 1547.1 | 0 | 65 |
| x-ai/grok-4.1-fast | 2481ms | 4200ms | 2570ms | 2705.8 | 0 | 65 |
| x-ai/grok-4.3 | 3404ms | 6081ms | 3610ms | 1182.5 | 0 | 65 |