Yeti Claw



Mission Control Dispatch | Published May 7, 2026

BeastMode inference lanes: Chewbacuh vs LiL-Beastly

We moved the two BeastMode ESXi inference guests onto static 192.168.12.x service addresses, published them to the public Yeti Claw surface, and then ran the first controlled capacity benchmark through the live BeastMode route to see where each lane stays interactive and where it simply starts queueing.
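The stepped protocol behind these numbers can be sketched as a small closed-loop harness: a fixed pool of workers (1, then 2, then 4) each drives full prompt round trips against a lane's static service address. The sketch below is an illustrative reconstruction, not the actual Mission Control tooling; `request_fn` stands in for one complete inference call.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean

def run_step(request_fn, concurrency, total_requests):
    """Run one benchmark step: total_requests calls through a pool of
    `concurrency` workers, returning step-table style metrics."""
    latencies, failures = [], 0

    def timed(_):
        start = time.perf_counter()
        try:
            request_fn()  # one full prompt/response round trip (stubbed here)
            return time.perf_counter() - start, True
        except Exception:
            return time.perf_counter() - start, False

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        for elapsed, ok in pool.map(timed, range(total_requests)):
            if ok:
                latencies.append(elapsed)
            else:
                failures += 1
    wall = time.perf_counter() - wall_start
    return {
        "success": f"{len(latencies)}/{total_requests}",
        "failures": failures,
        "throughput_rps": total_requests / wall,
        "avg_latency_s": mean(latencies) if latencies else float("nan"),
    }

# Stubbed demo: swap the lambda for a real call to a lane's service IP.
for n, total in [(1, 6), (2, 6), (4, 12)]:
    print(n, run_step(lambda: time.sleep(0.05), n, total))
```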

  • Chewbacuh baseline: 10.37 s. Average public-path latency at concurrency 1 on qwen3:8b.
  • LiL-Beastly baseline: 21.20 s. Average public-path latency at concurrency 1 on qwen3:14b.
  • Highest tested band: 4. Both lanes completed the 4-way step with zero request failures.
  • Peak CPU busy: 100%. Both VMs saturated CPU on every benchmark step.

Executive read

What we learned

  • Chewbacuh is the fast lane. It averaged 10.37 seconds at one live conversation and held throughput around 0.098 requests per second across the entire tested band.
  • LiL-Beastly is the heavier lane. It averaged 21.20 seconds at one live conversation and held throughput near 0.05 requests per second across the tested band.
  • Neither lane failed under load. Both completed every request through concurrency 4 with zero HTTP or inference failures.
  • Both lanes became queue-bound quickly. Additional concurrency mostly increased wait time instead of increasing useful throughput.
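The queue-bound read above can be checked with Little's law: in a closed-loop test, in-flight requests ≈ throughput × latency. Multiplying each step's measured throughput by its average latency recovers roughly the configured concurrency, which means the extra sessions were waiting, not running in parallel. A minimal check using the Chewbacuh step numbers:

```python
# Little's law: in-flight requests N ≈ throughput (rps) × latency (s).
# Flat throughput plus rising latency means added sessions just queue.

def inferred_concurrency(throughput_rps, avg_latency_s):
    """Concurrency implied by measured throughput and average latency."""
    return throughput_rps * avg_latency_s

# (configured concurrency, throughput rps, avg latency s) from the Chewbacuh steps.
chewbacuh_steps = [(1, 0.096, 10.37), (2, 0.099, 18.56), (4, 0.098, 35.88)]
for n, rps, lat in chewbacuh_steps:
    # Implied in-flight tracks the configured step while throughput barely moves.
    print(f"step {n}: implied in-flight = {inferred_concurrency(rps, lat):.2f}")
```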

Systems under test

Benchmark envelope and operator guardrails

Lane | Guest profile | Model | Static service IP | Guardrail
Chewbacuh | 8 vCPU, 48 GiB RAM, ESXi guest | qwen3:8b | 192.168.12.173 | Controlled cap at concurrency 4 to preserve interactive access
LiL-Beastly | 12 vCPU, 96 GiB RAM, ESXi guest | qwen3:14b | 192.168.12.174 | Controlled cap at concurrency 4 because the 14B lane is explicitly queue-heavy

Curve read

Throughput, latency, and guest pressure

Both BeastMode lanes saturate CPU early. The important difference is starting latency and model weight: Chewbacuh stays noticeably faster, while LiL-Beastly trades speed for the larger 14B model. Mission Control captured guest CPU, load average, and memory use. Physical temperature telemetry is not exposed reliably inside these VMs, so it is intentionally excluded from the committee findings.

Figure: throughput curve comparing Chewbacuh and LiL-Beastly. Throughput stays almost flat, which is the signature of queueing rather than parallel speed-up.

Figure: latency curve comparing Chewbacuh and LiL-Beastly. Latency rises sharply as sessions stack up on CPU-saturated lanes.

Figure: guest CPU and memory curves for Chewbacuh and LiL-Beastly. Both guests hit 100% CPU busy; LiL-Beastly also carries the larger memory footprint.

Comparative table

Headline operating bands

Lane | Tested range | Hard failures | Recommended operating band | Peak environment
Chewbacuh | 1-4 concurrent conversations | 0 | 1 premium / 2 acceptable / 4 queued | 100% peak CPU busy, 6.80 GiB peak guest memory used
LiL-Beastly | 1-4 concurrent conversations | 0 | 1 premium / 2 acceptable / 4 queued-heavy | 100% peak CPU busy, 11.30 GiB peak guest memory used

Chewbacuh step table

Validated qwen3:8b results

Concurrency | Success | Throughput (rps) | Avg latency (s) | P95 (s) | Peak CPU (%) | Peak load1 | Peak mem (MB)
1 | 6/6 | 0.096 | 10.37 | 10.71 | 100.0 | 6.01 | 6749.2
2 | 6/6 | 0.099 | 18.56 | 20.41 | 100.0 | 7.42 | 6751.3
4 | 12/12 | 0.098 | 35.88 | 41.24 | 100.0 | 8.09 | 6802.1

LiL-Beastly step table

Validated qwen3:14b results

Concurrency | Success | Throughput (rps) | Avg latency (s) | P95 (s) | Peak CPU (%) | Peak load1 | Peak mem (MB)
1 | 6/6 | 0.047 | 21.20 | 24.20 | 100.0 | 11.28 | 11264.2
2 | 6/6 | 0.050 | 36.48 | 40.06 | 100.0 | 12.16 | 11245.5
4 | 12/12 | 0.051 | 68.47 | 78.62 | 100.0 | 12.28 | 11299.0
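One note on the P95 column: with only 6 or 12 requests per step, the 95th percentile is effectively the slowest request. Under the common nearest-rank definition (an assumption; the dispatch does not state which percentile rule the harness used), that is exact:

```python
import math

def p95(latencies_s):
    """95th-percentile latency by the nearest-rank method (assumed rule)."""
    ordered = sorted(latencies_s)
    rank = math.ceil(0.95 * len(ordered))  # 1-based nearest rank
    return ordered[rank - 1]

# With 6 samples, ceil(0.95 * 6) = 6; with 12, ceil(0.95 * 12) = 12.
# At these sample counts, nearest-rank P95 is simply the maximum.
```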

Operator opinion

How to use the lanes

  • Route speed-sensitive public chat to Chewbacuh first.
  • Reserve LiL-Beastly for prompts where the larger model is worth the higher wait time.
  • Keep clear working indicators in the UI because both lanes become queue-bound before they become unstable.
  • Do not market concurrency 4 as “real-time” on either lane. It is stable, but it is not snappy.
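The routing guidance above can be expressed as a small dispatcher policy. This is an illustrative sketch, not the Mission Control router: the function name, the in-flight counters, and the idea of spilling over between lanes are assumptions layered on the benchmark's concurrency-4 guardrail.

```python
CONCURRENCY_CAP = 4  # operator guardrail from the benchmark envelope

def pick_lane(wants_larger_model, chewbacuh_inflight, lil_beastly_inflight):
    """Illustrative lane selection; returns None when both lanes are at cap."""
    if wants_larger_model and lil_beastly_inflight < CONCURRENCY_CAP:
        return "LiL-Beastly"   # 14B model worth the higher wait time
    if chewbacuh_inflight < CONCURRENCY_CAP:
        return "Chewbacuh"     # fast lane first for everything else
    if lil_beastly_inflight < CONCURRENCY_CAP:
        return "LiL-Beastly"   # spill over rather than refuse the request
    return None                # both lanes queued to the guardrail; show a wait state
```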