Archive · Mac (slobodans-macbook-air) · 2026-04-16-qwen3_5_4b_mlx-vs-standard-4b.html. Originally rendered 2026-04-16. Re-hosted from MyServers on 2026-05-06. Methodology and harness conventions may differ from what we use today; see /methodology.html for current standards.
Runtime Comparison

Qwen3.5 4B MLX vs Qwen3.5 4B Ollama

Both runs were recorded on the same 8 GB M1 MacBook Air, so this page compares the Apple-native MLX lane against the regular Ollama lane on identical hardware and with the same benchmark questions.

MLX Questions: 35
Ollama Questions: 35
MLX Primary Avg: 19.75s
Ollama Primary Avg: 63.67s
MLX Tok/s: 16.14
Ollama Tok/s: 5.24
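
To put the headline figures in relative terms, the short sketch below divides one lane by the other. The variable names are illustrative and not part of the benchmark harness; the four numbers are copied from the summary above.

```python
# Headline figures copied from the summary above.
mlx_avg_s, ollama_avg_s = 19.75, 63.67   # "Primary Avg" in seconds
mlx_tps, ollama_tps = 16.14, 5.24        # tokens per second

# How many times longer Ollama takes per question than MLX.
latency_ratio = ollama_avg_s / mlx_avg_s

# How many times more tokens per second MLX produces than Ollama.
throughput_ratio = mlx_tps / ollama_tps

print(f"Ollama / MLX primary avg: {latency_ratio:.2f}x")
print(f"MLX / Ollama tok/s:       {throughput_ratio:.2f}x")
```

On these numbers both ratios land a little above 3x in favour of MLX (about 3.22x on primary average time and 3.08x on tokens per second).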

Per-Suite Comparison

Suite | Questions | MLX Primary Avg | Ollama Primary Avg | MLX Tok/s | Ollama Tok/s | MLX Format | Ollama Format | MLX Markers | Ollama Markers
small-model-coding-eval-v1-qwen3_5_4b_mlx | 5 | 9.56s | 30.71s | 14.62 | 6.57 | 2/5 | 4/5 | 20/23 | 20/23
overnight-python-telemetry-v1-qwen3_5_4b_mlx | 20 | 12.78s | 35.32s | 15.72 | 5.14 | 0/20 | 0/20 | 70/80 | 68/80
overnight-python-telemetry-v2-qwen3_5_4b_mlx | 10 | 38.78s | 136.85s | 17.74 | 4.79 | 3/10 | 8/10 | 56/66 | 55/66
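
The page does not say how the headline averages are derived from the per-suite rows. One reading that is consistent with the numbers is a mean weighted by question count, and the sketch below checks that assumption. The `suites` mapping and the `weighted_mean` helper are illustrative names, not part of the harness; the values are copied from the table.

```python
# Per-suite rows copied from the table:
# (questions, MLX avg s, Ollama avg s, MLX tok/s, Ollama tok/s)
suites = {
    "small-model-coding-eval-v1-qwen3_5_4b_mlx": (5, 9.56, 30.71, 14.62, 6.57),
    "overnight-python-telemetry-v1-qwen3_5_4b_mlx": (20, 12.78, 35.32, 15.72, 5.14),
    "overnight-python-telemetry-v2-qwen3_5_4b_mlx": (10, 38.78, 136.85, 17.74, 4.79),
}


def weighted_mean(column: int) -> float:
    """Mean of one column across suites, weighted by question count (assumed method)."""
    total_questions = sum(row[0] for row in suites.values())
    weighted_sum = sum(row[0] * row[column] for row in suites.values())
    return weighted_sum / total_questions


headline = {
    "MLX Primary Avg (s)": weighted_mean(1),
    "Ollama Primary Avg (s)": weighted_mean(2),
    "MLX Tok/s": weighted_mean(3),
    "Ollama Tok/s": weighted_mean(4),
}

for label, value in headline.items():
    print(f"{label}: {value:.2f}")
```

Rounded to two decimals, this gives 19.75, 63.67, 16.14 and 5.24, matching the summary figures. That agreement suggests, but does not prove, that the harness averages per question across all 35 questions.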

Artifact Paths