Mission Control Dispatch | Published May 7, 2026
Yeti Claw fleet map: where each inference lane fits
This is the operator view of the fleet as it exists today. We took the economics dispatch, the Mac mini and Spark text-capacity study, the BeastMode lane benchmark, and the Spark image sweep, then collapsed them into one practical answer: which box should take which work, what each lane costs, and where queueing should start instead of pretending everything is a real-time surface.
- $0.110/h at 24x7 with the lowest measured text-token cost in the fleet.
- 8 of 10 open-source image models completed in the concurrent burst study.
- 10.37s baseline latency with a flat throughput plateau around 0.098 rps.
- Two simultaneous conversations stay in the premium responsiveness zone before queueing dominates.
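For scale, here is how an hourly rate converts into token-normalized cost. This is a minimal sketch: the $0.110/h rate is the measured figure above, but the sustained throughput is a hypothetical placeholder, not a number from the capacity study.

```python
# Token-normalized cost sketch. The hourly rate is the measured figure
# above; the throughput is a HYPOTHETICAL placeholder -- substitute the
# measured decode rate from the capacity study.
HOURLY_COST_USD = 0.110        # measured 24x7 hourly cost
ASSUMED_TOKENS_PER_SEC = 25.0  # hypothetical sustained throughput

tokens_per_hour = ASSUMED_TOKENS_PER_SEC * 3600
usd_per_million_tokens = HOURLY_COST_USD / tokens_per_hour * 1_000_000

print(f"${usd_per_million_tokens:.2f} per 1M tokens")  # ~$1.22 at 25 tok/s
```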
Executive read
What the fleet map says in plain language
- Use the Mac mini as the default public text lane when you want the best cost-to-latency ratio on a small box.
- Use DGX Spark whenever the job is image-first, multimodal, or premium enough to justify the more expensive but more capable box.
- Use Chewbacuh as the first BeastMode overflow lane when the Mac mini is saturated or when you want dedicated public-route capacity on ESXi.
- Use LiL-Beastly when the larger 14B lane is worth the extra wait time. It is an availability lane, not the cheap real-time lane.
Lane map
The fleet at a glance
Mac mini
The strongest small-surface economics play in the fleet and the least dramatic operator lane for public text.
Use it for: everyday chat, premium low-latency text, and the cheapest local inference in the current fleet.
DGX Spark
The only lane with benchmark-proven image-generation coverage, and still a strong text lane when the box is kept busy.
Use it for: image generation, multimodal experiments, and premium shared text where queueing is acceptable past two live sessions.
Chewbacuh
The faster of the two ESXi text workers, useful for public-route overflow and queue-friendly text traffic.
Use it for: supplemental public text capacity once the primary small-box lane is saturated.
LiL-Beastly
The heavier ESXi worker. Stable, but deliberately slower because the larger model is the point.
Use it for: larger-model availability where users will tolerate queue-heavy behavior in exchange for model quality.
* BeastMode token economics are lower-bound capital-only estimates because this management path does not expose clean per-host watt telemetry.
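For reference, the lower-bound arithmetic is plain capital amortization with power excluded. A minimal sketch, with both inputs as hypothetical placeholders:

```python
# Capital-only hourly cost: purchase price amortized over a service
# window, with power excluded because per-host watt telemetry is not
# available on this management path. Both inputs are HYPOTHETICAL.
CAPEX_USD = 2000.0   # hypothetical host purchase price
SERVICE_YEARS = 3    # hypothetical amortization window

service_hours = SERVICE_YEARS * 365 * 24
lower_bound_usd_per_hour = CAPEX_USD / service_hours

print(f"${lower_bound_usd_per_hour:.3f}/h capital-only lower bound")
```

Any measured power draw would be added on top, which is why the published BeastMode numbers can only move up.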
Routing matrix
What should take which job
| Scenario | Primary lane | Fallback lane | Why |
|---|---|---|---|
| Fast public chat | Mac mini | Chewbacuh | Best cost-to-latency ratio with the least operational drama. |
| Shared premium text | DGX Spark | Mac mini | Spark remains strong for text as long as the live band stays small. |
| Larger-model public text | LiL-Beastly | Chewbacuh | The 14B lane exists for model availability, not for cheapest real-time output. |
| Image generation | DGX Spark | Queue on Spark | No other current public lane has benchmark-proven image-generation coverage. |
| Burst traffic / overflow | Chewbacuh | LiL-Beastly | BeastMode turns spare ESXi capacity into a shared text safety valve. |
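A dispatcher can encode this matrix directly and also enforce the two-live-session Spark comfort band from the capacity study. The sketch below is illustrative only: the lane names and the threshold come from this dispatch, while the route-table shape and the live-session counter are hypothetical.

```python
# Minimal routing sketch for the matrix above. Lane names and the
# two-live-session Spark comfort band come from this dispatch; the
# table shape and the live-session counter are hypothetical.
ROUTES = {
    "fast_public_chat":    ("mac-mini",    "chewbacuh"),
    "shared_premium_text": ("dgx-spark",   "mac-mini"),
    "larger_model_text":   ("lil-beastly", "chewbacuh"),
    "image_generation":    ("dgx-spark",   "dgx-spark-queue"),
    "burst_overflow":      ("chewbacuh",   "lil-beastly"),
}

SPARK_COMFORT_BAND = 2  # live sessions before queueing dominates

def pick_lane(scenario: str, live_sessions: dict) -> str:
    primary, fallback = ROUTES[scenario]
    # Text traffic spills off Spark past the comfort band; image jobs
    # stay on Spark and queue there instead.
    if primary == "dgx-spark" and scenario != "image_generation":
        if live_sessions.get(primary, 0) >= SPARK_COMFORT_BAND:
            return fallback
    return primary

print(pick_lane("shared_premium_text", {"dgx-spark": 3}))  # -> mac-mini
```

Keeping the matrix as data rather than branching logic means routing changes ship as a config update, not a code change.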
Series bridge
The four reports that shape this map
Cloud vs. local inference economics
The buy-vs-rent model: local hourly cost, breakeven windows, and token-normalized economics. A sketch of the breakeven arithmetic follows these cards.
Mac mini vs DGX Spark text capacity
The concurrency study that defines the comfort band for the two primary text boxes.
Chewbacuh vs LiL-Beastly
The ESXi lane benchmark that tells us when BeastMode is useful and when it is just queueing.
DGX Spark image model sweep
The evidence that Spark is more than a text box and where its queue-design pressure begins.
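As a companion to the first card, here is a minimal sketch of the breakeven-window arithmetic. All three inputs are hypothetical placeholders; substitute the figures from the economics report.

```python
# Buy-vs-rent breakeven sketch: utilized hours after which owned hardware
# undercuts cloud rental. All three inputs are HYPOTHETICAL placeholders;
# substitute the figures from the economics report.
CAPEX_USD = 2000.0          # hypothetical purchase price
LOCAL_USD_PER_HOUR = 0.05   # hypothetical marginal (power) cost per hour
CLOUD_USD_PER_HOUR = 0.80   # hypothetical comparable cloud rate

# Each utilized hour saves (cloud - local); breakeven when savings = capex.
breakeven_hours = CAPEX_USD / (CLOUD_USD_PER_HOUR - LOCAL_USD_PER_HOUR)
print(f"{breakeven_hours:.0f} h, ~{breakeven_hours / (24 * 30):.1f} months at 24x7")
```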
Operator playbook
How this should change product behavior
- Default the main public text experience to the Mac mini unless a route explicitly needs a BeastMode or Spark lane.
- Queue image jobs on Spark instead of treating them like synchronous chat replies, because the image sweep showed that generation times vary wildly across models (see the queue sketch after this list).
- Expose BeastMode as deliberate routing, not hidden magic. Users should understand when they are choosing the faster 8B lane versus the slower 14B lane.
- Use Spark for premium multimodal and image-heavy flows, then use Mission Control data to justify when the queue should widen or when new hardware is warranted.
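To make the image-queue point concrete, here is a minimal sketch of handing back a ticket instead of a synchronous reply. Everything here is hypothetical: generate_image() stands in for the real Spark backend call, and the single worker models one GPU lane.

```python
import asyncio

# Minimal sketch of queueing image jobs instead of answering them like
# synchronous chat replies. generate_image() is a HYPOTHETICAL stand-in
# for the real Spark backend; the single worker models one GPU lane.

async def generate_image(prompt: str) -> str:
    await asyncio.sleep(1.0)  # stand-in for widely varying model runtimes
    return f"image for {prompt!r}"

async def spark_image_worker(queue: asyncio.Queue) -> None:
    while True:
        prompt, ticket = await queue.get()
        ticket.set_result(await generate_image(prompt))
        queue.task_done()

async def main() -> None:
    queue: asyncio.Queue = asyncio.Queue()
    worker = asyncio.create_task(spark_image_worker(queue))
    # The caller gets a ticket (future) back immediately and awaits it
    # later, so a slow model stalls the queue, not the chat surface.
    ticket = asyncio.get_running_loop().create_future()
    await queue.put(("yeti on a glacier", ticket))
    print(await ticket)
    worker.cancel()

asyncio.run(main())
```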