Mac Qwen Coder Ladder

This page compares the local Apple-silicon Mac runs for qwen2.5-coder:0.5b, qwen2.5-coder:1.5b, and qwen2.5-coder:3b under the shared benchmark stack.

Qwen 0.5B 314.5s

Combined local-Mac wall time across the 5Q, 20Q, and 10Q packets.

Hello: 1.8s · 20Q avg: 4.4s · 10Q avg: 8.4s

Qwen 1.5B 769.4s

Combined local-Mac wall time across the 5Q, 20Q, and 10Q packets.

Hello: 4.8s · 20Q avg: 11.4s · 10Q avg: 24.5s

Qwen 3B 914.5s

Combined local-Mac wall time across the 5Q, 20Q, and 10Q packets.

Hello: 5.8s · 20Q avg: 13.3s · 10Q avg: 31.1s
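The headline totals can be cross-checked against the per-packet wall times reported in the tables on this page. The sketch below sums the 5Q, 20Q, and 10Q wall times for each model and expresses each total relative to the 0.5B run. The dictionary layout and the function names are illustrative, not part of the benchmark stack.

```python
# Per-packet total wall times in seconds, copied from the tables on this page.
WALL_TIMES = {
    "Qwen 0.5B": {"5Q": 88.0, "20Q": 122.2, "10Q": 104.3},
    "Qwen 1.5B": {"5Q": 68.6, "20Q": 378.6, "10Q": 322.2},
    "Qwen 3B": {"5Q": 110.1, "20Q": 397.1, "10Q": 407.3},
}

# Headline totals shown in the summary cards.
HEADLINE_TOTALS = {"Qwen 0.5B": 314.5, "Qwen 1.5B": 769.4, "Qwen 3B": 914.5}


def total_wall_time(model: str) -> float:
    """Sum the three packet wall times, rounded to the page's one decimal."""
    return round(sum(WALL_TIMES[model].values()), 1)


def slowdown_vs(model: str, baseline: str = "Qwen 0.5B") -> float:
    """Ratio of a model's combined wall time to the baseline model's."""
    return total_wall_time(model) / total_wall_time(baseline)


if __name__ == "__main__":
    for name in WALL_TIMES:
        total = total_wall_time(name)
        matches = total == HEADLINE_TOTALS[name]
        print(f"{name}: {total}s (matches card: {matches}, "
              f"{slowdown_vs(name):.2f}x the 0.5B total)")
```

The ratio uses combined wall time only, so it says nothing about answer quality.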

How To Read This

This page compares only the local Apple-silicon Mac runs, all under the same single-model Ollama configuration: one loaded model, one parallel slot, a 4096-token context, and the same benchmark packets. With the VPS out of the loop, it is the cleanest view of how model size changes behavior on this Mac.
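As a rough illustration of that configuration, the sketch below expresses it as Ollama server environment variables plus a request body for the /api/generate endpoint. The page does not say how the runs actually applied these settings, so this mapping is an assumption. OLLAMA_MAX_LOADED_MODELS, OLLAMA_NUM_PARALLEL, and the num_ctx option are existing Ollama settings. The build_request helper and the "Hello" prompt are hypothetical.

```python
import json

# Assumed mapping of "one loaded model, one parallel slot" onto Ollama's
# server environment variables; the page does not show the real setup.
SERVER_ENV = {
    "OLLAMA_MAX_LOADED_MODELS": "1",
    "OLLAMA_NUM_PARALLEL": "1",
}

# The three model tags compared on this page.
MODELS = ["qwen2.5-coder:0.5b", "qwen2.5-coder:1.5b", "qwen2.5-coder:3b"]


def build_request(model: str, prompt: str, num_ctx: int = 4096) -> dict:
    """Build a JSON body for POST /api/generate with a fixed context size.

    Hypothetical helper: it only constructs the payload and sends nothing.
    """
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_ctx": num_ctx},
    }


if __name__ == "__main__":
    for key, value in SERVER_ENV.items():
        print(f"export {key}={value}")
    for model in MODELS:
        print(json.dumps(build_request(model, "Hello")))
```

Keeping the context size in the request body rather than in a per-model file means the same value applies to all three models.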

Five-Question Packet

Metric	Qwen 0.5B	Qwen 1.5B	Qwen 3B
Total wall time	88.0s	68.6s	110.1s
Average question time	17.6s	13.7s	22.0s
Average throughput	45.69 tok/s	21.17 tok/s	12.86 tok/s
Average marker hit	80%	83%	90%
Format passes	3/5	2/5	3/5
Strict passes	2/5	2/5	3/5
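The average question time in this table is consistent with dividing each total wall time by the five questions in the packet. The sketch below reproduces that arithmetic. The variable and function names are illustrative only.

```python
# Five-question packet: total wall time in seconds, copied from the table above.
FIVE_Q_TOTAL_S = {"Qwen 0.5B": 88.0, "Qwen 1.5B": 68.6, "Qwen 3B": 110.1}

QUESTION_COUNT = 5


def average_question_time(model: str) -> float:
    """Total wall time divided by the question count, to one decimal place."""
    return round(FIVE_Q_TOTAL_S[model] / QUESTION_COUNT, 1)


if __name__ == "__main__":
    for name in FIVE_Q_TOTAL_S:
        print(f"{name}: {average_question_time(name)}s per question")
```

Unlike the other two packets, the 1.5B run posts the shortest wall time here, even though its throughput is less than half that of the 0.5B run.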

Python 20Q

Metric	Qwen 0.5B	Qwen 1.5B	Qwen 3B
Total wall time	122.2s	378.6s	397.1s
Primary avg duration	4.4s	11.4s	13.3s
Follow-up avg duration	1.7s	7.5s	6.5s
Primary avg throughput	42.18 tok/s	16.89 tok/s	14.20 tok/s
Primary avg marker hit	78%	88%	85%
Usable primary answers	20/20	20/20	20/20

Real-Context 10Q

Metric	Qwen 0.5B	Qwen 1.5B	Qwen 3B
Total wall time	104.3s	322.2s	407.3s
Primary avg duration	8.4s	24.5s	31.1s
Follow-up avg duration	2.0s	7.7s	9.6s
Primary avg throughput	38.14 tok/s	17.82 tok/s	14.01 tok/s
Primary avg marker hit	55%	69%	78%
Usable primary answers	10/10	10/10	10/10
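One way to read the 20Q and 10Q tables together is to put each model's primary throughput and marker hit next to the 0.5B baseline. The sketch below does that with the figures from the two tables. The choice of baseline and the compare helper are assumptions made for illustration, not part of the original reports.

```python
# Primary average throughput (tok/s), copied from the 20Q and 10Q tables.
PRIMARY_TOK_PER_S = {
    "Python 20Q": {"Qwen 0.5B": 42.18, "Qwen 1.5B": 16.89, "Qwen 3B": 14.20},
    "Real-Context 10Q": {"Qwen 0.5B": 38.14, "Qwen 1.5B": 17.82, "Qwen 3B": 14.01},
}

# Primary average marker hit (percent), copied from the same tables.
PRIMARY_MARKER_HIT = {
    "Python 20Q": {"Qwen 0.5B": 78, "Qwen 1.5B": 88, "Qwen 3B": 85},
    "Real-Context 10Q": {"Qwen 0.5B": 55, "Qwen 1.5B": 69, "Qwen 3B": 78},
}


def compare(packet: str, baseline: str = "Qwen 0.5B") -> dict:
    """Return {model: (throughput ratio, marker-hit delta in points)} vs baseline."""
    base_tps = PRIMARY_TOK_PER_S[packet][baseline]
    base_hit = PRIMARY_MARKER_HIT[packet][baseline]
    return {
        model: (tps / base_tps, PRIMARY_MARKER_HIT[packet][model] - base_hit)
        for model, tps in PRIMARY_TOK_PER_S[packet].items()
    }


if __name__ == "__main__":
    for packet in PRIMARY_TOK_PER_S:
        print(packet)
        for model, (ratio, delta) in compare(packet).items():
            print(f"  {model}: {ratio:.2f}x throughput, {delta:+d} marker-hit points")
```

On these figures the larger models give up well over half of the 0.5B throughput in both packets, while the marker-hit gain is larger on Real-Context 10Q than on Python 20Q.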

Drill-Down Reports

Every link below goes to a standalone archived report page.

Qwen 0.5B 5Q report
Qwen 0.5B 20Q report
Qwen 0.5B 10Q report
Qwen 1.5B 5Q report
Qwen 1.5B 20Q report
Qwen 1.5B 10Q report
Qwen 3B 5Q report
Qwen 3B 20Q report
Qwen 3B 10Q report