pavilion-weeyuga-v3 — qwen2.5/qwen2.5-coder/qwen3/qwen3.5 on pavilion
96 calls across 16 cells; 12 errors
Archive run
This run is published for transparency. The site grade is archive-only: the run was meta-only, didn’t complete cleanly, lacked documented methodology, or had a methodology issue that a newer run supersedes. Headline numbers in this row should not be cited as current findings without reading any caveats below and the underlying run.md in the public archive.
Methodology
See SITE_DATA_AUDIT_AND_MIGRATION_PLAN_2026-05-06 for the full procedure.
Reproducible at git SHA 371ce70c.
Results
| Cell | tok/s mean | tok/s p50 | tok/s p95 | duration p50 | calls |
|---|---|---|---|---|---|
| qwen3.5:4b | — | — | — | — | 6 |
| qwen3.5:35b-a3b-uncensored-iq1m | — | — | — | — | 6 |
| qwen3.5:35b-a3b-iq2s | — | — | — | — | 6 |
| qwen3.5:9b-q6k | — | — | — | — | 6 |
| qwen3.5:9b-q4km | — | — | — | — | 6 |
| qwen3.5:2b | — | — | — | — | 6 |
| qwen3.5:0.8b | — | — | — | — | 6 |
| qwen3.5:9b | — | — | — | — | 6 |
| qwen2.5-coder:14b | — | — | — | — | 6 |
| qwen2.5-coder:3b | — | — | — | — | 6 |
| qwen3:14b | — | — | — | — | 6 |
| qwen3:8b | — | — | — | — | 6 |
| qwen3:4b | — | — | — | — | 6 |
| qwen2.5:3b | — | — | — | — | 6 |
| qwen2.5-coder:1.5b | — | — | — | — | 6 |
| qwen2.5-coder:0.5b | — | — | — | — | 6 |
Tokens per second (mean, p50, p95): no data captured.
Cold start vs warm
Cold-start measurements are the first call into a model after it loads from disk; warm calls are everything after. The ratio shows how much of the deployment’s wall-time cost is one-time vs steady-state.
| Cell | cold n | cold tok/s | cold p50 | warm n | warm tok/s | warm p50 | warm/cold |
|---|---|---|---|---|---|---|---|
| pavilion:weeyuga:qwen3.5:4b | 2 | — | — | 4 | — | — | — |
| pavilion:weeyuga:qwen3.5:35b-a3b-… | 2 | — | — | 4 | — | — | — |
| pavilion:weeyuga:qwen3.5:35b-a3b-… | 2 | — | — | 4 | — | — | — |
| pavilion:weeyuga:qwen3.5:9b-q6k | 2 | — | — | 4 | — | — | — |
| pavilion:weeyuga:qwen3.5:9b-q4km | 2 | — | — | 4 | — | — | — |
| pavilion:weeyuga:qwen3.5:2b | 2 | — | — | 4 | — | — | — |
| pavilion:weeyuga:qwen3.5:0.8b | 2 | — | — | 4 | — | — | — |
| pavilion:weeyuga:qwen3.5:9b | 2 | — | — | 4 | — | — | — |
| pavilion:weeyuga:qwen2.5-coder:14b | 2 | — | — | 4 | — | — | — |
| pavilion:weeyuga:qwen2.5-coder:3b | 2 | — | — | 4 | — | — | — |
| pavilion:weeyuga:qwen3:14b | 2 | — | — | 4 | — | — | — |
| pavilion:weeyuga:qwen3:8b | 2 | — | — | 4 | — | — | — |
| pavilion:weeyuga:qwen3:4b | 2 | — | — | 4 | — | — | — |
| pavilion:weeyuga:qwen2.5:3b | 2 | — | — | 4 | — | — | — |
| pavilion:weeyuga:qwen2.5-coder:1.5b | 2 | — | — | 4 | — | — | — |
| pavilion:weeyuga:qwen2.5-coder:0.5b | 2 | — | — | 4 | — | — | — |
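The warm/cold column can be read as a ratio of medians. The sketch below is one way to compute it, not the site's documented method: the function name `warm_cold_ratio` and the `(is_cold, tok_per_s)` record shape are assumptions made for illustration. It returns `None` when either side has no measurement, which corresponds to the dashes in this run's table.

```python
from statistics import median
from typing import Iterable, Optional, Tuple


def warm_cold_ratio(calls: Iterable[Tuple[bool, Optional[float]]]) -> Optional[float]:
    """Return warm p50 tok/s divided by cold p50 tok/s, or None if undefined.

    Each record is (is_cold, tok_per_s). This shape is hypothetical; the
    published JSONL may use different fields. A tok_per_s of None marks a
    call for which no throughput was captured.
    """
    records = list(calls)  # materialize so the input can be scanned twice
    cold = [t for is_cold, t in records if is_cold and t is not None]
    warm = [t for is_cold, t in records if not is_cold and t is not None]
    if not cold or not warm:
        return None  # nothing to compare, as in a meta-only run
    cold_p50 = median(cold)
    if cold_p50 == 0:
        return None  # avoid dividing by zero throughput
    return median(warm) / cold_p50
```

A ratio above 1 would mean warm calls run faster than cold ones; a cell with two cold and four warm calls but no captured throughput yields `None`.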
Raw data
Every run gets its JSONL, log, summary, and metadata published. Clone the archive; re-run it; tell us where we got it wrong.
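For anyone re-checking the results table from the published JSONL, here is a minimal sketch of a per-cell summary. The field names `cell` and `tok_per_s` are assumptions, not the archive's confirmed schema, and the nearest-rank percentile is one common choice; the run page does not state which percentile method the site uses.

```python
import json
import math
from collections import defaultdict
from statistics import mean, median
from typing import Dict, Iterable, List, Optional


def nearest_rank(values: List[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list (assumed method)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(lines: Iterable[str]) -> Dict[str, Dict[str, Optional[float]]]:
    """Group JSONL records by cell and compute calls, mean, p50 and p95.

    Expects one JSON object per line with hypothetical keys "cell" and
    "tok_per_s". Records without a throughput value still count as calls.
    """
    values: Dict[str, List[float]] = defaultdict(list)
    calls: Dict[str, int] = defaultdict(int)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        cell = record["cell"]
        calls[cell] += 1
        if record.get("tok_per_s") is not None:
            values[cell].append(record["tok_per_s"])
    summary = {}
    for cell, n in calls.items():
        vals = values.get(cell, [])
        summary[cell] = {
            "calls": n,
            "mean": mean(vals) if vals else None,
            "p50": median(vals) if vals else None,
            "p95": nearest_rank(vals, 95) if vals else None,
        }
    return summary
```

Passing an open file handle works because a file iterates line by line. A cell whose records carry no throughput comes back with a call count and `None` statistics, matching a row of dashes.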
Cite
Margetic, S. et al. (2026). Public benchmarks of the Weeyuga cluster. benchmarks.weeyuga.com/benchmarks/ad057f5b.html. Run id: ad057f5b-ed3f-4a95-a38e-361be310ffd6. SHA 371ce70c.