Is the DGX Spark Worth $4,700?

NVIDIA DGX Spark on desk

After running benchmarks on the DGX Spark across several models in Part 1, the obvious next question is: how does it actually compare to the alternatives, and is it worth the price tag?

Here's the lineup I put together for this comparison.


1. The Contenders

| Machine | Price | VRAM / RAM | Memory BW |
|---|---|---|---|
| DGX Spark GB10 | ~$4,700 | 128GB unified | 273 GB/s |
| RTX 5090 build | ~$3,800 | 32GB GDDR7 | 1,792 GB/s |
| RTX 4090 build | ~$2,800 | 24GB GDDR6X | 1,008 GB/s |
| Mac Studio M4 Max | $3,699 | 128GB unified | 546 GB/s |

The DGX is the most expensive machine on this list. The RTX 5090 has more than double the bandwidth for $900 less, which might make it the preferred choice for many hobbyists and AI practitioners. Even the Mac Studio, which a lot of people online treat as the obvious choice for running local models, has the same RAM and double the bandwidth for $1,000 less. Apple hardware being a value play is a relatively new phenomenon, but it really is competitive for local models right now.

Where the DGX Spark obviously beats everyone is form factor. It's small enough to fit in a bag, though most people probably don't need their AI lab to be portable.


2. Speed Comparison

| Model | Spark GB10 | RTX 5090 ✅ | RTX 4090 ⚠️ | Mac Studio ⚠️ |
|---|---|---|---|---|
| qwq:32b | 8.5 tok/s | 64.8 tok/s | ~40 tok/s | ~28 tok/s |
| qwen2.5-coder:32b | 7.3 tok/s | 62.2 tok/s | ~38 tok/s | ~25 tok/s |
| deepseek-r1:70b | 4.1 tok/s | ❌ OOM | ❌ OOM | ~10 tok/s |
| qwen3.5:122b | 15.1 tok/s | ❌ OOM | ❌ OOM | ❌ OOM |

✅ First-party data via RunPod. ⚠️ RTX 4090 and Mac Studio figures sourced from community benchmarks (r/LocalLLaMA) — RTX 4090 allocation was unavailable on RunPod during testing.

Let's get the embarrassing stat out of the way first: speed. On a 32B model, the RTX 5090 is 7.6× faster than the DGX Spark (we rented an RTX 5090 on RunPod for a few dollars to get first-party numbers). This comes down to the bandwidth gap. When a model fits entirely in VRAM, decode throughput is essentially bound by memory bandwidth: every generated token has to stream the full set of active weights through memory once. GDDR7 at 1,792 GB/s simply dominates LPDDR5x at 273 GB/s. The RTX 4090 is roughly 5× faster and the Mac Studio about 3× faster on 32B models.
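That bandwidth-bound relationship is easy to sanity-check with a roofline estimate. The ~18 GB figure below is a rough size for a 32B model at 4-bit quantization, an assumption rather than a measured value:

```python
def tok_per_s_ceiling(bandwidth_gb_s: float, model_gb: float) -> float:
    """Upper bound on decode speed when the model fits in VRAM:
    each generated token streams all weights through memory once,
    so tok/s cannot exceed bandwidth / model size."""
    return bandwidth_gb_s / model_gb

# ~18 GB assumed for a 32B model quantized to 4 bits.
print(f"Spark ceiling:    {tok_per_s_ceiling(273, 18):.0f} tok/s (measured: 8.5)")
print(f"RTX 5090 ceiling: {tok_per_s_ceiling(1792, 18):.0f} tok/s (measured: 64.8)")
```

Both machines land at roughly 55–65% of their theoretical ceiling, which is plausible once kernel overhead and the KV cache are factored in, and the ratio between them tracks the bandwidth ratio almost exactly.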

The Spark is not winning on speed. There's no way to spin that differently.


3. The 128GB Advantage

What the table also shows is that neither GPU build can load the 70B model at all, and nothing else on this list can load the 122B model. That's where the Spark's real advantage comes through.

The RTX 5090 smokes the Spark on 32B models. But once the model doesn't fit in VRAM anymore, the story changes completely. With partial CPU offload, models run 10–20× slower than full GPU processing. The VRAM cliff is a real problem.
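The size of that cliff follows from simple arithmetic. This is a toy estimate, not a benchmark: it assumes a ~40 GB 70B quant and an effective ~32 GB/s for the spilled portion over the PCIe/CPU path, both round-number guesses:

```python
def offload_slowdown(model_gb: float, vram_gb: float,
                     gpu_bw: float, spill_bw: float = 32.0) -> float:
    """Toy model of partial CPU offload: weights resident in VRAM
    stream at GPU bandwidth every token; the overflow streams at
    spill_bw (PCIe/CPU path). Returns slowdown vs. fully-in-VRAM."""
    in_vram = min(model_gb, vram_gb)
    spilled = max(0.0, model_gb - vram_gb)
    t_full = model_gb / gpu_bw
    t_split = in_vram / gpu_bw + spilled / spill_bw
    return t_split / t_full

# A ~40 GB 70B quant on a 32 GB RTX 5090: only 8 GB spills, yet...
print(f"{offload_slowdown(40, 32, 1792):.0f}x slower")
```

Spilling just 20% of the weights costs an order of magnitude, because the slow tier dominates the per-token time. That is the cliff in one line of arithmetic.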

The Spark handles large models without breaking a sweat. I loaded claude-distilled (16GB) and QwQ-32B (18GB) simultaneously and it handled both flawlessly. Try doing that on a 32GB GPU. R1-70B runs at 4.1 tok/s, which is slow but usable for async tasks, research, or overnight batch jobs. The MoE model qwen3.5:122b runs at 15.1 tok/s with ~10B active parameters per token, which is why it's faster than the dense 70B despite being larger. You simply can't load it on anything else on this list.
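The multi-model claim is just capacity arithmetic. A minimal check, where the headroom figure is my guess for KV cache, activations, and the OS, not a measured number:

```python
def fits_together(model_sizes_gb, capacity_gb, headroom_gb=8.0):
    """Can these models be resident simultaneously? Reserves headroom
    for KV cache, activations, and the OS (8 GB is a guess)."""
    return sum(model_sizes_gb) + headroom_gb <= capacity_gb

models = [16.0, 18.0]  # the two models above: 16 GB + 18 GB
print(fits_together(models, 128))  # Spark's 128 GB: True
print(fits_together(models, 32))   # a 32 GB GPU: False
```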

So yes, the 128GB of unified memory is the main selling point. Most other aspects are tradeoffs I'm willing to make.


4. The Real Competitor: Mac Studio M4 Max

In my view, the true head-to-head is the Mac Studio M4 Max. If your primary use case is inference and you're already in the Mac ecosystem, it might just be the machine to get. Same RAM, 2× the bandwidth, ~$1,000 cheaper.

What you get with the Spark instead is CUDA. New models, inference engines, and research repos typically run on NVIDIA before anything else. Something that appealed to me specifically was training and fine-tuning my own models. Metal is improving, but it's not on par with CUDA yet. Most cutting-edge research code runs on Linux/CUDA and requires porting to run on Metal. CUDA just gives you less friction.

Bottom line: The Mac Studio is the better inference appliance. The Spark is the better developer workstation.


5. Total Cost of Ownership

| Option | Upfront | Power/yr | 3-Year Total |
|---|---|---|---|
| DGX Spark | $4,700 | ~$80 | ~$4,940 |
| RTX 5090 build | $3,800 | ~$200 | ~$4,400 |
| RTX 4090 build | $2,800 | ~$180 | ~$3,340 |
| Mac Studio 128GB | $3,699 | ~$60 | ~$3,879 |
| Cloud (RunPod, 8h/day) | $0 | — | ~$15,768 |

The Spark only draws about 22–45W at idle. That's generally less than a full high-end GPU desktop. Cloud at 8 hours a day breaks even with the Spark in under a year. If you're a heavy daily user, the hardware pays back fast. The question is how many hours a day you actually run inference.
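The table reduces to a small calculation. The cloud rate below (~$1.80/hr) is inferred from the table's $15,768 figure, not a quoted price:

```python
def three_year_total(upfront: float, power_per_year: float) -> float:
    """Upfront cost plus three years of electricity."""
    return upfront + 3 * power_per_year

def breakeven_hours_per_day(upfront: float, power_per_year: float,
                            cloud_rate: float = 1.80) -> float:
    """Daily usage at which 3 years of cloud rental costs the same
    as owning the machine. cloud_rate is an assumed $/hr."""
    return three_year_total(upfront, power_per_year) / (cloud_rate * 365 * 3)

print(f"Spark 3-year total: ${three_year_total(4700, 80):,.0f}")
print(f"Cloud break-even:   {breakeven_hours_per_day(4700, 80):.1f} h/day")
```

At the table's 8 h/day assumption you're well past break-even; at roughly 2.5 h/day the two options cost the same over three years.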

On resale value, my guess is the Spark holds its value better than a GPU, but we'll see how that plays out.


6. Why I Bought It

I wanted to run models that are too large to fit on anything else. I'll admit the speed difference on 32B models was a bit of a surprise. 8.5 tok/s vs 64.8 tok/s on the RTX 5090 is a massive gap and hard to ignore. But when I load R1-70B or run two models simultaneously, I don't regret the decision. That matters more to me than raw speed.

If you need to run large models locally and CUDA matters, the Spark is the only consumer option that gets you there. If you don't, there are cheaper ways.


7. Buy or Don't Buy?

✅ Buy it if:

  • You want to run 70B+ models locally without VRAM limits
  • You need CUDA for training, fine-tuning, or custom inference pipelines
  • You want a quiet, compact Linux appliance with zero driver headaches
  • You're a developer who needs large models without cloud latency or cost
  • Privacy matters to you. Nothing ever leaves your machine

❌ Don't buy it if:

  • You mostly run 7B–32B models. The RTX 4090/5090 is 5–8× faster for less money
  • You're already in the Mac ecosystem. The Mac Studio M4 Max is better value for pure inference
  • Speed is your priority. 4.1 tok/s on R1-70B is real friction for chat use
  • You're cost-sensitive. The premium is hard to justify unless you actually max out the 128GB

Where to buy

DGX Spark GB10

Available directly from NVIDIA or on Amazon.

→ Buy from NVIDIA directly

DGX Spark on Amazon →
Affiliate link · No extra cost to you

RTX 5090

GPU-only card — you'll need a build to go with it. Benchmark data via RunPod.

RTX 5090 on Amazon →
Affiliate link · No extra cost to you

RTX 4090

RTX 4090 on Amazon →
Affiliate link · No extra cost to you

Mac Studio M4 Max

Not available on Amazon. Buy direct from Apple.

→ Buy Mac Studio from Apple


Up next, Part 3: I ran two instances of the same AI model against each other in a strategy game. Same weights, same architecture. They played completely differently. What that tells us about how temperature affects decision-making and what it means for training your own models on a Spark. Coming soon.