Mistral
Mixtral VRAM Requirements: 8x7B and 8x22B at Every Quantization Level
Mixtral 8x7B has 46.7B parameters, but only 12.9B are active per token. You still need enough VRAM to hold all 46.7B. Exact VRAM figures for every quant from Q2 to FP16.
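As a rough back-of-the-envelope check (a sketch, not the article's measured tables): weight memory scales with total parameter count times effective bits per weight, plus some overhead for the KV cache and runtime buffers. The bits-per-weight and overhead figures below are assumptions, not values from the article.

```python
# Rough VRAM estimate for Mixtral 8x7B at common quantization levels.
# Back-of-the-envelope sketch: real GGUF files mix bit widths per tensor,
# so actual sizes differ by a few GB. All experts must be resident in VRAM,
# not just the 12.9B parameters active per token.

TOTAL_PARAMS_B = 46.7  # total parameters, in billions

# Approximate effective bits per weight for each quant (assumed values).
BITS_PER_WEIGHT = {
    "Q2_K": 2.6,
    "Q4_K_M": 4.8,
    "Q5_K_M": 5.7,
    "Q8_0": 8.5,
    "FP16": 16.0,
}

OVERHEAD_GB = 2.0  # assumed allowance for KV cache + buffers at modest context

for quant, bits in BITS_PER_WEIGHT.items():
    weights_gb = TOTAL_PARAMS_B * bits / 8  # billions of params x bytes per param
    print(f"{quant:>7}: ~{weights_gb + OVERHEAD_GB:.0f} GB VRAM")
```

At Q4-class quants this lands around 30 GB, consistent with the 26-32GB range quoted for Mixtral 8x7B below.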
Mistral & Mixtral Guide: Every Model Worth Running Locally
Mistral Nemo 12B, with its 128K context, fits in 8GB of VRAM at Q4. Mistral 7B runs on 4GB. Mixtral 8x7B needs 26-32GB and has now been outpaced by Qwen 3. What's still worth running, and what's been superseded.
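To put those fit claims on one scale, here is a minimal sketch of the Q4 weight footprint for the models compared in these guides. The ~4.5 effective bits per weight and the parameter counts are approximations I'm assuming here, not figures from the articles; KV cache and runtime buffers come on top.

```python
# Approximate Q4 weight footprint per model (assumed ~4.5 effective bits/weight).
# Parameter counts are total resident weights in billions, rounded.

MODELS_B = {
    "Mistral 7B": 7.2,
    "Mistral Nemo 12B": 12.2,
    "Mixtral 8x7B": 46.7,
    "Mixtral 8x22B": 141.0,
}

for name, params_b in MODELS_B.items():
    weights_gb = params_b * 4.5 / 8  # billions of params x bytes per param
    print(f"{name:<17} ~{weights_gb:.1f} GB of weights at Q4")
```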