Open-Source Models We Advise On

Three models ranked on arena.ai's agent leaderboard, filtered to fully open licenses (MIT / Apache 2.0), all 100B+ total parameters.

The memory math: minimum GPU memory (GB) = total parameters (in billions) × 1 GB, plus 20% working room for context, attention cache, and overhead. The 1 GB per billion figure corresponds to one byte per parameter, i.e. 8-bit weights. A mixture-of-experts model computes with only a fraction of its parameters per token, but every parameter still has to sit in memory, so the total parameter count, not the "active" count, determines the minimum hardware.
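As a rough sketch, the rule above can be written as a small Python helper. The function name `min_memory_gb` and the `overhead` parameter are illustrative choices, not part of the page; the parameter counts are the ones listed for the three models on this page.

```python
def min_memory_gb(total_params_billion: float, overhead: float = 0.20) -> float:
    """Minimum GPU memory in GB under the page's rule of thumb.

    Assumes 1 GB per billion parameters, plus a fractional overhead
    (20% by default) for context, attention cache, and other working room.
    """
    weights_gb = total_params_billion * 1.0  # 1 GB per billion parameters
    return weights_gb * (1.0 + overhead)     # add the working room


# Total (not active) parameter counts, in billions, as listed on this page.
models = [("Mixtral 8x22B", 141), ("Qwen3-235B-A22B", 235), ("DeepSeek-V3", 671)]

for name, total_b in models:
    print(f"{name}: {min_memory_gb(total_b):.1f} GB")
```

Note that only the total parameter count enters the calculation; the active count per token plays no role in the memory estimate.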
Apache 2.0

Mixtral 8x22B — Mistral AI

Mixture-of-experts: only 39B of the 141B total parameters activate per token, but all 141B must still be resident in memory.

Total parameters: 141B
Every parameter is a number the model stores and must load into GPU memory. More parameters generally mean a more capable model, but also more hardware to run it.
Active parameters / token: 39B
How many parameters actually do math for each token generated. This is lower than the total because the model is a "mixture of experts", but it does NOT reduce the memory needed.
License: Apache 2.0
Permissive: commercial use, modification, and redistribution are allowed, subject to the license's attribution and notice conditions.
Minimum memory needed: 169.2 GB
Math shown: 141B × 1 GB = 141 GB, + 20% working room = 169.2 GB.

2x RTX PRO 6000 Blackwell (cheapest option that clears the bar)

GPU / system: NVIDIA RTX PRO 6000 Blackwell (Workstation Edition) × 2
Combined memory: 192 GB
GPU cost: $17,130
Host system (est.): $4,000
Total price: $21,130
Sustained draw: 1,450 W
🏠 = 1.21x an average home (~1,200W continuous)
🔋 = drains a 90 kWh EV battery in 62.07 hrs
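
The two comparisons under each power figure are simple ratios. The sketch below reproduces them in Python, assuming the page's own reference values (a 1,200 W average home and a 90 kWh EV battery); the function and constant names are illustrative.

```python
AVERAGE_HOME_W = 1_200  # continuous draw the page uses for "an average home"
EV_BATTERY_KWH = 90     # battery capacity the page uses for the EV comparison


def home_multiple(draw_w: float) -> float:
    """How many average homes the sustained draw is equivalent to."""
    return draw_w / AVERAGE_HOME_W


def ev_drain_hours(draw_w: float) -> float:
    """Hours needed to consume one full EV battery at this sustained draw."""
    return EV_BATTERY_KWH / (draw_w / 1000)  # convert W to kW first


# Sustained draws listed for the three recommended systems on this page.
for draw_w in (1_450, 2_050, 17_000):
    print(f"{draw_w} W: {home_multiple(draw_w):.2f}x a home, "
          f"{ev_drain_hours(draw_w):.2f} h per EV battery")
```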

Datacenter-grade alternative: 3x H100 SXM, 240 GB, $94,000. More expensive and still needs a server chassis with NVLink support; shown for comparison only.

Apache 2.0

Qwen3-235B-A22B — Alibaba

Ranked on arena.ai's open-weight leaderboard; mixture-of-experts with 22B active parameters per token.

Total parameters: 235B
Every parameter is a number the model stores and must load into GPU memory. More parameters generally mean a more capable model, but also more hardware to run it.
Active parameters / token: 22B
How many parameters actually do math for each token generated. This is lower than the total because the model is a "mixture of experts", but it does NOT reduce the memory needed.
License: Apache 2.0
Permissive: commercial use, modification, and redistribution are allowed, subject to the license's attribution and notice conditions.
Minimum memory needed: 282.0 GB
Math shown: 235B × 1 GB = 235 GB, + 20% working room = 282.0 GB.

3x RTX PRO 6000 Blackwell (cheapest option that clears the bar)

GPU / system: NVIDIA RTX PRO 6000 Blackwell (Workstation Edition) × 3
Combined memory: 288 GB
GPU cost: $25,695
Host system (est.): $4,000
Total price: $29,695
Sustained draw: 2,050 W
🏠 = 1.71x an average home (~1,200W continuous)
🔋 = drains a 90 kWh EV battery in 43.9 hrs

Datacenter-grade alternative: 1x DGX H100 node, 640 GB, $350,000. Far more expensive, but adds substantial concurrency headroom for serving many users at once.
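
The GPU counts and workstation prices on this page follow from dividing the minimum memory by the memory per GPU and rounding up. The Python sketch below illustrates that sizing step. The per-unit figures are derived from the page's own totals (96 GB and $8,565 per RTX PRO 6000, 80 GB per H100) and are assumptions in that sense; the function names are illustrative.

```python
import math

RTX_PRO_6000_GB = 96      # derived: 192 GB combined / 2 cards
RTX_PRO_6000_USD = 8_565  # derived: $17,130 GPU cost / 2 cards
H100_GB = 80              # derived: 240 GB / 3 GPUs
HOST_SYSTEM_USD = 4_000   # the page's host system estimate


def gpus_needed(min_memory_gb: float, gpu_memory_gb: float) -> int:
    """Smallest whole number of GPUs whose combined memory clears the bar."""
    return math.ceil(min_memory_gb / gpu_memory_gb)


def workstation_price(min_memory_gb: float) -> tuple[int, int]:
    """GPU count and total price for an RTX PRO 6000 workstation build."""
    count = gpus_needed(min_memory_gb, RTX_PRO_6000_GB)
    return count, count * RTX_PRO_6000_USD + HOST_SYSTEM_USD


print(workstation_price(169.2))    # Mixtral 8x22B
print(workstation_price(282.0))    # Qwen3-235B-A22B
print(gpus_needed(169.2, H100_GB)) # H100 count for the Mixtral alternative
```

This only checks memory capacity. It does not account for interconnect, chassis limits, or how a serving framework splits a model across an odd number of GPUs.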

MIT License

DeepSeek-V3 — DeepSeek AI

The DeepSeek V3/V4 family leads arena.ai's open-weight leaderboard as of mid-2026, trading blows with closed frontier models.

Total parameters: 671B
Every parameter is a number the model stores and must load into GPU memory. More parameters generally mean a more capable model, but also more hardware to run it.
Active parameters / token: 37B
How many parameters actually do math for each token generated. This is lower than the total because the model is a "mixture of experts", but it does NOT reduce the memory needed.
License: MIT License
Fully permissive: one of the most capable open-weight models, under one of the least restrictive common licenses.
Minimum memory needed: 805.2 GB
Math shown: 671B × 1 GB = 671 GB, + 20% working room = 805.2 GB.

2x DGX H100 nodes, networked as a cluster

GPU / system: NVIDIA DGX H100 (8x H100 node) × 2
Combined memory: 1,280 GB
Total price: $700,000
Sustained draw: 17,000 W
🏠 = 14.17x an average home (~1,200W continuous)
🔋 = drains a 90 kWh EV battery in 5.29 hrs
No single GPU holds 805 GB, and one DGX H100 node tops out at 640 GB, so this model genuinely requires multiple machines working together. See the Clusters page.

Datacenter-grade alternative: 11x H100 SXM (raw GPU count, for comparison) — 880GB, $334,000. 11 GPUs don't fit in one 8-GPU DGX node, so this still means 2+ physical machines in practice.
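
The "2+ physical machines" conclusion can be checked two ways: by GPU count per node and by memory per node. The Python sketch below does both, assuming the page's DGX H100 figures (8 GPUs and 640 GB per node); the function names are illustrative.

```python
import math

GPUS_PER_DGX_H100 = 8  # GPUs in one DGX H100 node, per the page
DGX_H100_GB = 640      # combined GPU memory of one node, per the page


def nodes_for_gpus(gpu_count: int) -> int:
    """Nodes needed to physically house a given number of GPUs."""
    return math.ceil(gpu_count / GPUS_PER_DGX_H100)


def nodes_for_memory(min_memory_gb: float) -> int:
    """Nodes needed for combined GPU memory to reach the minimum."""
    return math.ceil(min_memory_gb / DGX_H100_GB)


print(nodes_for_gpus(11))       # 11 raw H100s for DeepSeek-V3
print(nodes_for_memory(805.2))  # DeepSeek-V3 minimum memory
print(nodes_for_memory(282.0))  # Qwen3-235B-A22B, for contrast
```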