Qwen2.5-Coder 0.5B: Mac vs VPS50

This page compares the same qwen2.5-coder:0.5b benchmark stack on Slobodan's Apple M1 Mac and on vps50. It uses only the suites completed on both machines: hello-check, the five-question small eval, the twenty-question Python suite, and the ten-question real-context suite.

Shared Benchmark Total 316.3s vs 627.7s

Summed wall time on the Mac M1 against VPS50 for hello + 5Q + 20Q + 10Q.

Wall-Time Advantage 1.98x

VPS50 total wall time divided by the Mac total: the full shared benchmark ran about twice as fast on the Mac.

5Q Throughput Advantage 2.21x

Ratio of average tokens per second on the five-question small eval (45.69 vs 20.67 tok/s).

Quality Snapshot 78% vs 81%

Primary marker-hit average on the 20-question Python suite, Mac M1 vs VPS50.
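
The headline figures above can be re-derived from the per-suite rows further down the page. The following is a minimal sketch, assuming the shared total is a plain sum of the four wall times and that each advantage is a simple ratio (VPS50 over Mac for wall time, Mac over VPS50 for throughput). The variable names are illustrative and not part of the benchmark tooling.

```python
# Per-suite wall times in seconds, copied from the tables on this page.
# Keys: hello-check, 5Q small eval, 20Q Python suite, 10Q real-context suite.
MAC_WALL = {"hello": 1.8, "5q": 88.0, "20q": 122.2, "10q": 104.3}
VPS_WALL = {"hello": 4.0, "5q": 116.6, "20q": 260.1, "10q": 247.0}

# Average 5Q throughput in tokens per second.
MAC_5Q_TPS = 45.69
VPS_5Q_TPS = 20.67

# Round to one decimal to match the precision of the source tables.
mac_total = round(sum(MAC_WALL.values()), 1)
vps_total = round(sum(VPS_WALL.values()), 1)

# Assumption: "advantage" is slower / faster for wall time
# and faster / slower for throughput.
wall_advantage = round(vps_total / mac_total, 2)
tps_advantage = round(MAC_5Q_TPS / VPS_5Q_TPS, 2)

print(f"Shared total: {mac_total}s vs {vps_total}s")
print(f"Wall-time advantage: {wall_advantage}x")
print(f"5Q throughput advantage: {tps_advantage}x")
```

Under these assumptions the script reproduces the 316.3s vs 627.7s totals and the 1.98x and 2.21x ratios shown in the cards.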

Hello Check

Metric	Mac M1	VPS50
Latency	1.8s	4.0s
Reply length	38	38

Five-Question Small Eval

Metric	Mac M1	VPS50
Total wall time	88.0s	116.6s
Average question time	17.6s	23.3s
Average throughput	45.69 tok/s	20.67 tok/s
Average marker hit	80%	80%
Format passes	3/5	3/5
Strict passes	2/5	2/5

Python 20Q

Metric	Mac M1	VPS50
Total wall time	122.2s	260.1s
Primary avg duration	4.4s	9.4s
Follow-up avg duration	1.7s	3.6s
Primary avg throughput	42.18 tok/s	20.88 tok/s
Follow-up avg throughput	41.80 tok/s	22.08 tok/s
Primary avg marker hit	78%	81%
Follow-up avg marker hit	60%	60%
Usable primary answers	20/20	20/20
Usable follow-up answers	20/20	20/20

Python 10Q

Metric	Mac M1	VPS50
Total wall time	104.3s	247.0s
Primary avg duration	8.4s	20.4s
Follow-up avg duration	2.0s	4.2s
Primary avg throughput	38.14 tok/s	17.21 tok/s
Follow-up avg throughput	38.91 tok/s	18.07 tok/s
Primary avg marker hit	55%	63%
Follow-up avg marker hit	80%	70%
Usable primary answers	10/10	10/10
Usable follow-up answers	10/10	10/10
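
To compare the two columns row by row, the tab-separated tables on this page can be parsed directly. The sketch below is one way to do that, assuming each data cell starts with a number followed by an optional unit; `leading_number` and `parse_table` are hypothetical helpers written for this example, and only three rows of the 10Q table are included.

```python
import re

# A few rows copied from the 10Q table above, tab-separated as on the page.
TABLE = """Metric\tMac M1\tVPS50
Total wall time\t104.3s\t247.0s
Primary avg duration\t8.4s\t20.4s
Primary avg throughput\t38.14 tok/s\t17.21 tok/s
"""


def leading_number(cell: str) -> float:
    """Return the leading numeric value of a cell such as '38.14 tok/s'."""
    match = re.match(r"\s*(\d+(?:\.\d+)?)", cell)
    if match is None:
        raise ValueError(f"no number in cell: {cell!r}")
    return float(match.group(1))


def parse_table(text: str) -> dict[str, tuple[float, float]]:
    """Map each metric name to its (Mac M1, VPS50) values."""
    rows = {}
    for line in text.strip().splitlines()[1:]:  # skip the header row
        metric, mac, vps = line.split("\t")
        rows[metric] = (leading_number(mac), leading_number(vps))
    return rows


rows = parse_table(TABLE)
for metric, (mac, vps) in rows.items():
    # A ratio above 1 means the VPS50 value is larger than the Mac value.
    print(f"{metric}: VPS50 / Mac = {vps / mac:.2f}")
```

For duration rows a ratio above 1 means the VPS was slower; for throughput rows a ratio below 1 means the same thing, so the direction has to be read per metric.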

20Q Extremes

Fastest and slowest primary tasks on each machine, so the tail latency shape is easy to compare.

Mac M1

Fastest: Pydantic Model in 1.9s
Slowest: Pytest Fixture in 8.5s

VPS50

Fastest: Typed Dataclass in 3.2s
Slowest: Refactor Split in 31.3s

10Q Extremes

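The real-context packet shows how sharply first-order repository prompts differ between the Mac and the VPS: the fastest tasks are close (4.9s vs 4.4s), but the slowest takes 15.0s on the Mac and 33.8s on the VPS.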

Mac M1

Fastest: Orchestration Timeline Forensics in 4.9s
Slowest: API Token Audit Regression Test in 15.0s

VPS50

Fastest: Ingest Log Triage in 4.4s
Slowest: Task Bulk Job Debug Packet in 33.8s
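
One rough way to put a number on the latency spread in the 20Q and 10Q extremes is the ratio of the slowest to the fastest primary task. This is only a sketch: the ratio is a crude proxy chosen for illustration, not a metric the benchmark reports, and `spread` is a hypothetical helper.

```python
# (fastest, slowest) primary-task durations in seconds, from the lists above.
EXTREMES = {
    ("20Q", "Mac M1"): (1.9, 8.5),
    ("20Q", "VPS50"): (3.2, 31.3),
    ("10Q", "Mac M1"): (4.9, 15.0),
    ("10Q", "VPS50"): (4.4, 33.8),
}


def spread(fastest: float, slowest: float) -> float:
    """Return how many times longer the slowest task ran than the fastest."""
    return slowest / fastest


for (suite, machine), (fast, slow) in EXTREMES.items():
    print(f"{suite} {machine}: slowest is {spread(fast, slow):.1f}x the fastest")
```

With only two points per machine this says nothing about the distribution in between; the per-task JSON listed under Artifacts would be needed for percentiles.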

Artifacts

Direct links to the underlying JSON and standalone report pages.

Mac

Hello report
5Q report
20Q report
10Q report

VPS50

Hello report
5Q report
20Q report
10Q report