llm-bench live

updated 2026-06-21T18:00:06Z
Models: 22
Total Probes: 15521
Healthy: 15521
Errors: 1159

[Charts: Time to First Token, Total Latency, Throughput (tok/s)]

Statistics

Model                         TTFT p50  TTFT p95  Latency   Tok/s  Errors     N
anthropic/claude-opus-4.6       1647ms    3798ms   2026ms    55.9      60   626
anthropic/claude-opus-4.7       1976ms    5282ms   2192ms    85.9     125   987
anthropic/claude-opus-4.8       1272ms    3310ms   1450ms    90.9      35   391
anthropic/claude-sonnet-4.6     1942ms    4512ms   2419ms    44.5     125   987
deepseek/deepseek-v3.2          1376ms    5438ms   2085ms    34.3       7  1105
deepseek/deepseek-v4-flash       856ms    5607ms   1954ms    63.7      97  1015
deepseek/deepseek-v4-pro           0ms    4409ms   2283ms    13.3       4  1108
google/gemini-2.0-flash-001      414ms    1148ms    466ms   215.6      36   146
google/gemini-2.5-flash          413ms    5792ms    482ms   196.2      95  1017
google/gemini-2.5-flash-lite     329ms    1211ms    391ms   192.5      95  1017
google/gemini-3.1-flash-lite     597ms    2770ms    621ms   534.6      27   189
google/gemini-3.5-flash            0ms       0ms   1356ms    12.5      44   670
openai/gpt-4o-mini              1251ms    2886ms   1379ms    93.4      81   729
openai/gpt-5.4                  1072ms    2173ms   1318ms    51.1      44   258
openai/gpt-5.5                  1455ms    3148ms   1783ms    47.5     125   987
openai/gpt-oss-120b                0ms    3278ms   1520ms    15.5     126   986
x-ai/grok-4-fast                2182ms    5544ms   2276ms  2034.7       7   223
x-ai/grok-4.1-fast              2582ms    4196ms   2656ms  2659.9       6   224
x-ai/grok-4.20                   553ms    2040ms    631ms   148.4      11   871
x-ai/grok-4.20-multi-agent      4024ms   13939ms   4384ms  5289.2       1   143
x-ai/grok-4.3                   1787ms    4826ms   1889ms  1337.1       7  1105
x-ai/grok-build-0.1             1982ms    3505ms   2117ms  1662.1       1   737
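
The summary counters can be cross-checked against the Statistics table. The sketch below sums the Errors and N columns and counts the rows. The names `ROWS` and `totals` are hypothetical, chosen for illustration. The dashboard does not state how N, Healthy, and Errors relate to one another, so this only checks column sums and assumes nothing about whether N includes failed probes.

```python
# Minimal sketch: cross-check the dashboard's summary counters against
# the Statistics table. ROWS and totals() are illustrative names, not
# part of llm-bench itself.

# (model, errors, n) copied from the Errors and N columns of the table.
ROWS = [
    ("anthropic/claude-opus-4.6", 60, 626),
    ("anthropic/claude-opus-4.7", 125, 987),
    ("anthropic/claude-opus-4.8", 35, 391),
    ("anthropic/claude-sonnet-4.6", 125, 987),
    ("deepseek/deepseek-v3.2", 7, 1105),
    ("deepseek/deepseek-v4-flash", 97, 1015),
    ("deepseek/deepseek-v4-pro", 4, 1108),
    ("google/gemini-2.0-flash-001", 36, 146),
    ("google/gemini-2.5-flash", 95, 1017),
    ("google/gemini-2.5-flash-lite", 95, 1017),
    ("google/gemini-3.1-flash-lite", 27, 189),
    ("google/gemini-3.5-flash", 44, 670),
    ("openai/gpt-4o-mini", 81, 729),
    ("openai/gpt-5.4", 44, 258),
    ("openai/gpt-5.5", 125, 987),
    ("openai/gpt-oss-120b", 126, 986),
    ("x-ai/grok-4-fast", 7, 223),
    ("x-ai/grok-4.1-fast", 6, 224),
    ("x-ai/grok-4.20", 11, 871),
    ("x-ai/grok-4.20-multi-agent", 1, 143),
    ("x-ai/grok-4.3", 7, 1105),
    ("x-ai/grok-build-0.1", 1, 737),
]


def totals(rows):
    """Return (model_count, total_errors, total_n) for the given rows."""
    model_count = len(rows)
    total_errors = sum(errors for _, errors, _ in rows)
    total_n = sum(n for _, _, n in rows)
    return model_count, total_errors, total_n


if __name__ == "__main__":
    models, errors, probes = totals(ROWS)
    print(f"models={models} errors={errors} n={probes}")
    # prints: models=22 errors=1159 n=15521
```

The three totals match the Models, Errors, and Total Probes counters at the top of the page. The sum of the N column also equals the Healthy counter.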