Yeti Claw

Mission Control | capacity dispatch

Mission Control Dispatch | Published May 7, 2026

Mac Mini vs DGX Spark: text concurrency capacity review

We ran a controlled simultaneous-conversation benchmark across the Mac mini text stack and the DGX Spark text stack to find where each unit stays interactive, where it starts queueing, and where the operator should stop pretending throughput is still improving.

Mac mini validated band: 4+

Four simultaneous mixed-model conversations validated with no thermal warnings.

Spark premium band: 2

Two simultaneous conversations stay in the premium responsiveness zone.

Spark stable ceiling tested: 8

Eight mixed-model conversations completed without request failures.

Peak Spark GPU temp: 67C

Thermals stayed controlled even when the box was visibly queueing.

Executive read

What we learned

  • The Mac mini passed every controlled step from 1 to 4 concurrent mixed-model conversations with zero request failures and no thermal or performance warnings.
  • The DGX Spark passed every controlled step from 1 to 8 concurrent mixed-model conversations with zero request failures, a 67C peak GPU temperature, and no transport loss.
  • DGX Spark did not fail under load, but it stopped getting meaningfully faster after the 2-conversation band. After that point, extra concurrency mostly turned into waiting time.
  • The Mac mini result is conservative because Low Power Mode was enabled during the run. Its actual ceiling is likely higher than the validated band published here.

Systems under test

Benchmark envelope and guardrails

Unit | Hardware and mode | Models under test | Safety control
Mac mini | Apple M4 Pro, 12 CPU cores, 48 GB RAM, Low Power Mode enabled | qwen2.5:7b, qwen3:30b, nemotron-3-nano:30b | Controlled cap at 4 concurrent conversations to preserve access during the run
DGX Spark | NVIDIA DGX Spark, ARM 20-core CPU, about 122 GiB system memory | llava:latest, nemotron-mini:latest, nemotron:latest, qwen3:8b | Automatic stop if any request failed or GPU temperature reached 72C
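
The Spark safety control in the table can be sketched as a simple stop rule. This is a minimal illustration, not the harness used for the run: the names StepSample, should_stop, and run_ladder are hypothetical, and the only values taken from this dispatch are the 72C limit and the "any request failed" condition.

```python
from dataclasses import dataclass

# Stop threshold stated in the safety-control column above.
GPU_TEMP_LIMIT_C = 72.0


@dataclass
class StepSample:
    """One concurrency step as a hypothetical harness might record it."""
    concurrency: int
    failed_requests: int
    peak_gpu_temp_c: float


def should_stop(sample: StepSample, temp_limit_c: float = GPU_TEMP_LIMIT_C) -> bool:
    """True if any request failed or the GPU reached the temperature limit."""
    return sample.failed_requests > 0 or sample.peak_gpu_temp_c >= temp_limit_c


def run_ladder(samples):
    """Walk the steps in order and return those that passed before any stop."""
    passed = []
    for sample in samples:
        if should_stop(sample):
            break
        passed.append(sample.concurrency)
    return passed


# Peak GPU temperatures from the Spark step table; zero failures at every step.
spark_steps = [
    StepSample(1, 0, 63.0),
    StepSample(2, 0, 66.0),
    StepSample(4, 0, 65.0),
    StepSample(6, 0, 66.0),
    StepSample(8, 0, 67.0),
]
print(run_ladder(spark_steps))
```

With the published peaks, no step trips the rule, which matches the report that the Spark completed all five steps.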

Curve read

Throughput, latency, and thermals

Spark’s signature was queueing saturation rather than instability. Throughput stayed almost flat, from 0.137 requests per second at one concurrent conversation to 0.143 at eight, while P95 latency rose roughly sixfold, from 21.07 seconds to 127.35 seconds. The Mac mini, by contrast, kept improving throughput throughout the tested band, from 0.103 requests per second at one conversation to 0.265 at four, and never triggered a thermal warning.
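
The scaling described above can be checked with simple end-to-start ratios. The sketch below uses only figures published in this dispatch (the Mac mini values come from its step table); the helper name ratio is an illustrative choice.

```python
def ratio(end: float, start: float) -> float:
    """How many times larger the last tested step is than the first."""
    return end / start


# DGX Spark, 1 -> 8 concurrent conversations.
spark_throughput_gain = ratio(0.143, 0.137)
spark_p95_growth = ratio(127.35, 21.07)

# Mac mini, 1 -> 4 concurrent conversations.
mac_throughput_gain = ratio(0.265, 0.103)
mac_p95_growth = ratio(26.08, 13.57)

print(f"Spark throughput x{spark_throughput_gain:.2f}, P95 x{spark_p95_growth:.2f}")
print(f"Mac mini throughput x{mac_throughput_gain:.2f}, P95 x{mac_p95_growth:.2f}")
```

On these numbers the Spark gains about 4% throughput while P95 grows about sixfold, whereas the Mac mini gains about 2.6x throughput for roughly a 1.9x rise in P95. The two ranges differ (8 steps versus 4), so the ratios describe each curve's shape and are not a like-for-like comparison.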

[Figure: Throughput curve comparing Mac mini and DGX Spark.] Throughput stays flat on Spark after the two-conversation band.
[Figure: P95 latency curve comparing Mac mini and DGX Spark.] P95 latency shows where concurrency stops feeling interactive and starts queueing.
[Figure: Environmental telemetry curves for Mac mini CPU and DGX Spark GPU temperature.] Thermals stayed controlled on both systems inside the tested envelope.

Comparative table

Headline operating bands

Unit | Tested range | Hard failures | Recommended operating band | Peak environment
Mac mini | 1-4 concurrent conversations | 0 | 4 live mixed-model conversations validated today | No thermal warning; 67.18% peak CPU busy
DGX Spark | 1-8 concurrent conversations | 0 | 2 premium / 4 acceptable / 6-8 queued | 67C peak GPU temp; 96% peak GPU utilization
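
An operator could encode the Spark bands from the table as a small lookup. This is a sketch under stated assumptions: the function name spark_band is hypothetical, and the run did not test 3, 5, or 7 conversations, so placing those values in the next band up is an assumption rather than a measured result.

```python
def spark_band(concurrency: int) -> str:
    """Map a concurrent-conversation count to the published Spark band.

    Published anchors: 2 premium, 4 acceptable, 6-8 queued.
    Assumption: untested in-between counts fall into the next band up.
    """
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    if concurrency <= 2:
        return "premium"
    if concurrency <= 4:
        return "acceptable"
    if concurrency <= 8:
        return "queued"
    # Nothing above 8 was exercised in this benchmark.
    return "untested"


for n in (1, 2, 4, 6, 8, 9):
    print(n, spark_band(n))
```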

Mac mini step table

Validated Mac mini results

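Concurrency | Success | Throughput (rps) | Avg latency (s) | P50 (s) | P95 (s) | Max (s) | Peak CPU (%) | Load1 | Thermal warn
1           | 9/9     | 0.103            | 9.72            | 10.07   | 13.57   | 14.10   | 67.18        | 7.07  | No
2           | 9/9     | 0.123            | 14.90           | 13.75   | 21.62   | 21.79   | 64.60        | 6.68  | No
3           | 12/12   | 0.224            | 11.22           | 10.99   | 23.54   | 25.67   | 55.35        | 7.28  | No
4           | 16/16   | 0.265            | 13.39           | 14.30   | 26.08   | 28.60   | 44.67        | 5.91  | No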

DGX Spark step table

Validated Spark results

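Concurrency | Success | Throughput (rps) | Avg latency (s) | P50 (s) | P95 (s) | Max (s) | Peak GPU temp (C) | Peak GPU util (%) | Peak GPU power (W)
1           | 12/12   | 0.137            | 7.28            | 2.79    | 21.07   | 21.08   | 63.0              | 96.0              | 49.61
2           | 12/12   | 0.135            | 13.33           | 9.19    | 30.28   | 31.57   | 66.0              | 96.0              | 46.61
4           | 16/16   | 0.138            | 22.31           | 11.79   | 59.27   | 62.25   | 65.0              | 96.0              | 48.01
6           | 24/24   | 0.143            | 29.83           | 15.22   | 95.35   | 100.92  | 66.0              | 96.0              | 47.52
8           | 32/32   | 0.143            | 38.15           | 17.78   | 127.35  | 134.58  | 67.0              | 96.0              | 47.16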
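
One way to cross-check the Spark rows is Little's law, which relates the average number of requests in flight to throughput multiplied by average latency. The sketch below applies it to the published figures. The dispatch does not say how throughput was timed, so any gap between the estimate and the nominal concurrency is a prompt to inspect the harness and not a finding in itself.

```python
# (concurrency, throughput in rps, average latency in s) from the Spark step table.
spark_rows = [
    (1, 0.137, 7.28),
    (2, 0.135, 13.33),
    (4, 0.138, 22.31),
    (6, 0.143, 29.83),
    (8, 0.143, 38.15),
]


def estimated_in_flight(throughput_rps: float, avg_latency_s: float) -> float:
    """Little's law: average requests in flight = arrival rate x time in system."""
    return throughput_rps * avg_latency_s


estimates = {
    concurrency: estimated_in_flight(rps, latency)
    for concurrency, rps, latency in spark_rows
}

for concurrency, in_flight in estimates.items():
    print(f"nominal {concurrency}: estimated in flight {in_flight:.2f}")
```

At one conversation the estimate is almost exactly 1. At higher steps it falls below the nominal count (about 5.5 at eight), which would be consistent with, for example, ramp-up or drain time being included in the measured window, though the source data cannot confirm the cause.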