Open-Source Models We Advise On

Three models ranked on arena.ai's agent leaderboard, filtered to fully open licenses (MIT or Apache 2.0), each with 100B or more total parameters.

The memory math: minimum GPU memory (GB) = total parameters (in billions) × 1 GB × 1.2. The 1 GB per billion parameters corresponds to storing each weight in 8 bits (1 byte); 16-bit weights would need twice that. The extra 20% is working room for context, attention cache, and overhead. A mixture-of-experts model only computes with a fraction of its parameters per token, but every parameter still has to sit in memory, so the total parameter count, not the "active" count, determines the minimum hardware.
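The rule of thumb above can be sketched as a small function. This is a minimal illustration, not a sizing tool: the function name `min_memory_gb` and its parameter names are hypothetical, and the defaults simply encode the page's 1 GB per billion parameters and 20% working room.

```python
def min_memory_gb(total_params_b: float,
                  gb_per_billion: float = 1.0,
                  working_room: float = 0.20) -> float:
    """Minimum GPU memory in GB for a model with total_params_b billion parameters.

    gb_per_billion and working_room default to the page's rule of thumb
    (1 GB per billion parameters, plus 20%); both are assumptions you can change.
    """
    weights_gb = total_params_b * gb_per_billion
    # Round to one decimal to match how the figures are quoted on this page.
    return round(weights_gb * (1 + working_room), 1)


# Total (not active) parameter counts, in billions, for the three models listed.
for name, params_b in [("Mixtral 8x22B", 141),
                       ("Qwen3-235B-A22B", 235),
                       ("DeepSeek-V3", 671)]:
    print(f"{name}: {min_memory_gb(params_b)} GB")
```

Passing the active parameter count instead of the total would understate the requirement, which is why only the total is used here.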

Apache 2.0

Mixtral 8x22B — Mistral AI

Mixture-of-experts: only 39B of the 141B total parameters activate per token, but all 141B must still be resident in memory.

Total parameters: 141B

Every parameter is a number the model stores and must load into GPU memory — more parameters generally means more capable, but also more hardware to run it.

Active parameters / token: 39B

How many parameters actually do math for each word generated. Lower than the total because this is a "mixture of experts" model — but it does NOT reduce the memory needed.

License: Apache 2.0

Permissive: allows commercial use, modification, and redistribution, provided the license text and attribution notices are kept and modified files are marked as changed.

Source: https://mistral.ai/news/mixtral-8x22b/

Minimum memory needed: 169.2 GB

Math shown: 141 billion parameters × 1 GB per billion = 141 GB; adding 20% working room gives 169.2 GB.

2x RTX PRO 6000 Blackwell (cheapest option that clears the bar)

GPU / system: NVIDIA RTX PRO 6000 Blackwell (Workstation Edition) × 2

Combined memory: 192 GB

GPU cost: $17,130

Host system (est.): $4,000

Total price: $21,130
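The build above follows from rounding the memory requirement up to a whole number of GPUs and adding the host estimate. A minimal sketch, assuming identical GPUs: the 96 GB and $8,565 per-GPU defaults are derived by halving the 192 GB and $17,130 figures on this page, and the function name `cheapest_build` is hypothetical.

```python
import math


def cheapest_build(required_gb: float,
                   gb_per_gpu: int = 96,
                   price_per_gpu: int = 8565,
                   host_price: int = 4000) -> dict:
    """Smallest count of identical GPUs whose combined memory covers required_gb.

    The per-GPU defaults are derived from this page's two-GPU totals and are
    assumptions, not vendor list prices.
    """
    gpus = math.ceil(required_gb / gb_per_gpu)  # partial GPUs do not exist
    gpu_cost = gpus * price_per_gpu
    return {
        "gpus": gpus,
        "combined_gb": gpus * gb_per_gpu,
        "gpu_cost": gpu_cost,
        "total_price": gpu_cost + host_price,
    }


print(cheapest_build(169.2))  # the 169.2 GB minimum computed for Mixtral 8x22B
```

The same call with 282.0 GB yields the three-GPU build listed for Qwen3-235B-A22B.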

⚡ 1,450 W sustained draw

🏠 = 1.21x an average home (~1,200 W continuous)

🔋 = drains a 90 kWh EV battery in 62.07 hrs
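The two power comparisons are simple ratios, sketched below. The helper names `home_multiple` and `hours_to_drain` are hypothetical; the 1,200 W average home and 90 kWh battery are the reference values this page uses.

```python
def home_multiple(draw_watts: float, home_watts: float = 1200) -> float:
    """Sustained draw expressed as a multiple of an average home's continuous draw."""
    return round(draw_watts / home_watts, 2)


def hours_to_drain(draw_watts: float, battery_kwh: float = 90) -> float:
    """Hours for a sustained draw to use up a battery of the given capacity."""
    draw_kw = draw_watts / 1000  # convert W to kW so the units match kWh
    return round(battery_kwh / draw_kw, 2)


# Sustained draw figures quoted for the three builds on this page.
for watts in (1450, 2050, 17000):
    print(f"{watts} W: {home_multiple(watts)}x a home, "
          f"{hours_to_drain(watts)} h to drain 90 kWh")
```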

Datacenter-grade alternative: 3x H100 SXM, 240 GB, $94,000. More expensive, and it still needs a server chassis with NVLink support. Shown for comparison only.

Apache 2.0

Qwen3-235B-A22B — Alibaba

Ranked on arena.ai's open-weight leaderboard; mixture-of-experts with 22B active parameters per token.

Total parameters: 235B

Active parameters / token: 22B

License: Apache 2.0

Permissive: allows commercial use, modification, and redistribution, provided the license text and attribution notices are kept and modified files are marked as changed.

Source: https://huggingface.co/Qwen/Qwen3-235B-A22B/blob/main/LICENSE

Minimum memory needed: 282.0 GB

Math shown: 235 billion parameters × 1 GB per billion = 235 GB; adding 20% working room gives 282.0 GB.

3x RTX PRO 6000 Blackwell (cheapest option that clears the bar)

GPU / system: NVIDIA RTX PRO 6000 Blackwell (Workstation Edition) × 3

Combined memory: 288 GB

GPU cost: $25,695

Host system (est.): $4,000

Total price: $29,695

⚡ 2,050 W sustained draw

🏠 = 1.71x an average home (~1,200 W continuous)

🔋 = drains a 90 kWh EV battery in 43.9 hrs

Datacenter-grade alternative: 1x DGX H100 node, 640 GB, $350,000. Far more expensive, but it adds substantial concurrency headroom for serving many users at once.
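To put a number on that headroom, one can subtract the 282.0 GB minimum from each option's combined memory. A rough sketch using only figures from this page; the helper name `spare_gb` is hypothetical, and how much concurrency the spare memory actually buys depends on the serving stack and context lengths.

```python
def spare_gb(combined_gb: float, required_gb: float) -> float:
    """Memory left over once the model's minimum requirement is met."""
    return round(combined_gb - required_gb, 1)


REQUIRED_GB = 282.0  # minimum computed above for Qwen3-235B-A22B

options = {
    "3x RTX PRO 6000 Blackwell": 288,
    "1x DGX H100 node": 640,
}
for label, combined in options.items():
    print(f"{label}: {spare_gb(combined, REQUIRED_GB)} GB spare")
```

The three-GPU workstation build clears the minimum with only a few gigabytes to spare, while the DGX node leaves more spare memory than the model itself requires.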

MIT License

DeepSeek-V3 — DeepSeek AI

The DeepSeek V3/V4 family leads arena.ai's open-weight leaderboard as of mid-2026, trading blows with closed frontier models.

Total parameters: 671B

Active parameters / token: 37B

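License: MIT License

Permissive: one of the most capable open-weight models under one of the least restrictive common licenses. In the original DeepSeek-V3 release, the MIT License covers the code repository, while the weights carry a separate DeepSeek Model License that also permits commercial use; the later V3-0324 checkpoint is MIT-licensed for both code and weights.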

Source: https://github.com/deepseek-ai/DeepSeek-V3

Minimum memory needed: 805.2 GB

Math shown: 671 billion parameters × 1 GB per billion = 671 GB; adding 20% working room gives 805.2 GB.

2x DGX H100 nodes, networked as a cluster

GPU / system: NVIDIA DGX H100 (8x H100 node) × 2

Combined memory: 1,280 GB

Total price: $700,000

⚡ 17,000 W sustained draw

🏠 = 14.17x an average home (~1,200 W continuous)

🔋 = drains a 90 kWh EV battery in 5.29 hrs

No single GPU holds 805 GB, and one 8x H100 DGX node tops out at 640 GB, so on H100-class hardware this genuinely requires multiple machines working together. See the Clusters page.

Datacenter-grade alternative: 11x H100 SXM (raw GPU count, for comparison), 880 GB, $334,000. Eleven GPUs don't fit in one 8-GPU DGX node, so in practice this still means at least two physical machines.
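The GPU and node counts for this model come from two rounding-up steps, sketched below. The 80 GB per H100 and 8 GPUs per DGX node follow from the 880 GB and 640 GB figures on this page; the function names are hypothetical.

```python
import math


def gpus_needed(required_gb: float, gb_per_gpu: int = 80) -> int:
    """Raw GPU count whose combined memory covers required_gb."""
    return math.ceil(required_gb / gb_per_gpu)


def nodes_needed(gpu_count: int, gpus_per_node: int = 8) -> int:
    """Physical machines needed to host gpu_count GPUs."""
    return math.ceil(gpu_count / gpus_per_node)


REQUIRED_GB = 805.2  # minimum computed above for DeepSeek-V3

gpus = gpus_needed(REQUIRED_GB)
nodes = nodes_needed(gpus)
print(f"{gpus} GPUs ({gpus * 80} GB) across {nodes} nodes")
```

Eleven GPUs cover the memory on paper, but because nodes come in units of eight, the practical footprint is the same two machines as the DGX cluster option.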