Yeti Claw



Mission Control Dispatch | Published May 7, 2026

Yeti Claw fleet map: where each inference lane fits

This is the operator view of the fleet as it exists today. We took the economics dispatch, the Mac mini and Spark text-capacity study, the BeastMode lane benchmark, and the Spark image sweep, then collapsed them into one practical answer: which box should take which work, what each lane costs, and where queueing should start instead of pretending everything is a real-time surface.

  • Cheapest local lane: Mac mini. $0.110/h 24x7 with the lowest measured text-token cost in the fleet.
  • Only proven image lane: Spark. 8 of 10 open-source image models completed in the concurrent burst study.
  • Fastest BeastMode lane: Chewbacuh. 10.37s baseline latency with a flat throughput plateau around 0.098 rps.
  • Spark premium text band: 2. Two simultaneous conversations stay in the premium responsiveness zone before queueing dominates.

Executive read

What the fleet map says in plain language

  • Use the Mac mini as the default public text lane when you want the best cost-to-latency ratio on a small box.
  • Use DGX Spark whenever the job is image-first, multimodal, or premium enough to justify the more expensive but more capable box.
  • Use Chewbacuh as the first BeastMode overflow lane when the Mac mini is saturated or when you want dedicated public-route capacity on ESXi.
  • Use LiL-Beastly when the larger 14B lane is worth the extra wait time. It is an availability lane, not the cheap real-time lane.

Lane map

The fleet at a glance

Mac mini

Default

The strongest small-surface economics play in the fleet and the least dramatic operator lane for public text.

24x7 cost: $0.110/h
Validated band: 4+
Representative tok/s: 47.77
$ / 1M tok: $0.64

Use it for: everyday chat, premium low-latency text, and the cheapest local inference in the current fleet.
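The $/1M-token figures in these cards follow directly from hourly cost and sustained throughput. A minimal sketch of that arithmetic, using the Mac mini lane's numbers from above (the function name is illustrative, not part of any fleet tooling):

```python
# Token-normalized cost: ($/h) divided by tokens generated per hour,
# scaled to 1M tokens. Assumes full sustained utilization.

def cost_per_million_tokens(hourly_cost_usd: float, tokens_per_second: float) -> float:
    """Dollars per 1M generated tokens at sustained throughput."""
    tokens_per_hour = tokens_per_second * 3600
    return hourly_cost_usd / tokens_per_hour * 1_000_000

# Mac mini lane: $0.110/h at 47.77 tok/s
print(round(cost_per_million_tokens(0.110, 47.77), 2))  # → 0.64
```

The same formula reproduces the Spark card's $1.68/1M at $0.254/h, which implies roughly 42 tok/s of sustained text throughput on that lane.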

DGX Spark

Multimodal

The only lane with benchmark-proven image-generation coverage and still a strong text lane when the workload stays busy.

24x7 cost: $0.254/h
Text band: 2 / 4 / 8
Image sweep: 8 / 10
$ / 1M tok: $1.68

Use it for: image generation, multimodal experiments, and premium shared text where queueing is acceptable past two live sessions.

Chewbacuh

Overflow

The faster of the two ESXi text workers, useful for public-route overflow and queue-friendly text traffic.

VM shape: 8 / 48
Baseline latency: 10.37s
Throughput plateau: 0.098 rps
$ / 1M tok*: $1.04

Use it for: supplemental public text capacity once the primary small-box lane is saturated.

LiL-Beastly

14B Lane

The heavier ESXi worker. Stable, but deliberately slower because the larger model is the point.

VM shape: 12 / 96
Baseline latency: 21.20s
Throughput plateau: 0.051 rps
$ / 1M tok*: $3.98

Use it for: larger-model availability where users will tolerate queue-heavy behavior in exchange for model quality.

* BeastMode token economics are lower-bound capital-only estimates because this management path does not expose clean per-host watt telemetry.

Routing matrix

What should take which job

Scenario | Primary lane | Fallback lane | Why
Fast public chat | Mac mini | Chewbacuh | Best cost-to-latency ratio with the least operational drama.
Shared premium text | DGX Spark | Mac mini | Spark remains strong for text as long as the live band stays small.
Larger-model public text | LiL-Beastly | Chewbacuh | The 14B lane exists for model availability, not for cheapest real-time output.
Image generation | DGX Spark | Queue on Spark | No other current public lane has benchmark-proven image-generation coverage.
Burst traffic / overflow | Chewbacuh | LiL-Beastly | BeastMode turns spare ESXi capacity into a shared text safety valve.
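The matrix above can be read as a simple primary-then-fallback routing function. A sketch with the scenarios and lanes taken from the table (the dict shape, scenario keys, and function name are illustrative, not actual router code):

```python
# Illustrative routing table mirroring the matrix above.
# Keys are hypothetical scenario tags; values are (primary, fallback) lanes.
ROUTES = {
    "fast_public_chat":    ("Mac mini",    "Chewbacuh"),
    "shared_premium_text": ("DGX Spark",   "Mac mini"),
    "larger_model_text":   ("LiL-Beastly", "Chewbacuh"),
    "image_generation":    ("DGX Spark",   "DGX Spark"),  # fallback = queue on Spark
    "burst_overflow":      ("Chewbacuh",   "LiL-Beastly"),
}

def pick_lane(scenario: str, primary_saturated: bool = False) -> str:
    """Return the fallback lane only when the primary is saturated."""
    primary, fallback = ROUTES[scenario]
    return fallback if primary_saturated else primary

print(pick_lane("fast_public_chat"))                          # → Mac mini
print(pick_lane("fast_public_chat", primary_saturated=True))  # → Chewbacuh
```

Note that image generation has no cross-lane fallback: when Spark is busy, the correct behavior is to queue on Spark, not to shed the job to a text-only box.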

Series bridge

The four reports that shape this map

Cloud vs. local inference economics

The buy-vs-rent model: local hourly cost, breakeven windows, and token-normalized economics.


Mac mini vs DGX Spark text capacity

The concurrency study that defines the comfort band for the two primary text boxes.


Chewbacuh vs LiL-Beastly

The ESXi lane benchmark that tells us when BeastMode is useful and when it is just queueing.


DGX Spark image model sweep

The evidence that Spark is more than a text box and where its queue-design pressure begins.


Operator playbook

How this should change product behavior

  • Default the main public text experience to the Mac mini unless a route explicitly needs a BeastMode or Spark lane.
  • Queue image jobs on Spark instead of treating them like synchronous chat replies, because the image sweep showed that per-model generation times vary too widely for a single real-time budget.
  • Expose BeastMode as deliberate routing, not hidden magic. Users should understand when they are choosing the faster 8B lane versus the slower 14B lane.
  • Use Spark for premium multimodal and image-heavy flows, then use Mission Control data to justify when the queue should widen or when new hardware is warranted.
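The queue-instead-of-synchronous rule for image jobs can be sketched with a plain work queue: callers get a ticket back immediately and a single worker drains jobs in order. This is a hypothetical illustration of the playbook's intent, not the fleet's actual dispatcher; all names here are invented.

```python
import queue

# Hypothetical sketch: image jobs are enqueued and acknowledged at once
# rather than blocking the caller, and drained serially so image work
# never crowds out Spark's small premium text band.
image_jobs: queue.Queue = queue.Queue()
_ticket_counter = 0

def submit_image_job(prompt: str) -> str:
    """Enqueue a job and return a ticket the caller can poll later."""
    global _ticket_counter
    _ticket_counter += 1
    ticket = f"job-{_ticket_counter}"
    image_jobs.put((ticket, prompt))
    return ticket

def spark_worker(handled: list) -> None:
    """Drain queued jobs one at a time (stand-in for actual generation)."""
    while not image_jobs.empty():
        ticket, prompt = image_jobs.get()
        handled.append(ticket)
        image_jobs.task_done()

submit_image_job("a yeti on a ridgeline")
submit_image_job("fleet map poster")
done: list = []
spark_worker(done)
print(done)  # → ['job-1', 'job-2']
```

The design choice matches the routing matrix: when Spark is busy, jobs wait in this queue rather than falling back to a lane that has no proven image coverage.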