llm-bench (live)

Updated: 2026-06-15T01:00:06Z
Models: 22
Total Probes: 13267
Healthy: 13267
Errors: 998

[Charts: Time to First Token; Total Latency; Throughput (tok/s)]

Statistics

Model                         TTFT p50  TTFT p95  Latency   Tok/s  Errors     N
anthropic/claude-opus-4.6       1647ms    3798ms   2026ms    55.9      60   626
anthropic/claude-opus-4.7       2036ms    5345ms   2254ms    86.8     106   845
anthropic/claude-opus-4.8       1245ms    3310ms   1401ms    97.0      22   243
anthropic/claude-sonnet-4.6     2017ms    4433ms   2485ms    44.6     106   845
deepseek/deepseek-v3.2          1423ms    6230ms   2128ms    32.0       6   945
deepseek/deepseek-v4-flash       848ms    5079ms   2000ms    63.7      84   867
deepseek/deepseek-v4-pro           0ms    4409ms   2353ms    13.3       4   947
google/gemini-2.0-flash-001      414ms    1148ms    466ms   215.6      36   146
google/gemini-2.5-flash          416ms    5610ms    485ms   192.9      82   869
google/gemini-2.5-flash-lite     329ms    1402ms    392ms   191.9      82   869
google/gemini-3.1-flash-lite     597ms    2770ms    621ms   534.6      27   189
google/gemini-3.5-flash            0ms       0ms   1378ms    12.3      31   522
openai/gpt-4o-mini              1358ms    3118ms   1482ms    93.4      62   587
openai/gpt-5.4                  1072ms    2173ms   1318ms    51.1      44   258
openai/gpt-5.5                  1543ms    3148ms   1875ms    46.3     106   845
openai/gpt-oss-120b                0ms    3398ms   1550ms    14.9     107   844
x-ai/grok-4-fast                2182ms    5544ms   2276ms  2034.7       7   223
x-ai/grok-4.1-fast              2582ms    4196ms   2656ms  2659.9       6   224
x-ai/grok-4.20                   566ms    2117ms    655ms   140.8      11   710
x-ai/grok-4.20-multi-agent      4024ms   13939ms   4384ms  5289.2       1   143
x-ai/grok-4.3                   1834ms    5073ms   1948ms  1339.6       7   944
x-ai/grok-build-0.1             1988ms    3646ms   2136ms  1710.9       1   576