Budget GPU

What Can You Actually Run on 4GB VRAM?
Models in the 1B-3B range run at 18-55 tok/s. Qwen 2.5 3B at Q4 quantization is the sweet spot for chat and simple coding; 7B models don't fit in 4GB. Here's what works on the GTX 1050 Ti and GTX 1650, and when to upgrade.
Jan 30, 2026
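
The "7B doesn't fit" claim comes down to simple arithmetic, which the rough sketch below illustrates. It is a back-of-the-envelope estimate, not a measurement: the 4.5 bits per weight figure is an assumption (roughly what a basic 4-bit block quantization costs once per-block scales are counted), the 0.8 GiB overhead for KV cache and runtime buffers is a hypothetical placeholder, and the function names are made up for illustration. Real usage depends on context length, the exact quantization variant, and how much VRAM the desktop already occupies.

```python
GIB = 2**30  # bytes per GiB


def weight_gib(params_billion: float, bits_per_weight: float = 4.5) -> float:
    """Approximate size of the quantized weights in GiB.

    bits_per_weight=4.5 is an assumed average for a 4-bit quantization
    that stores a small scale factor alongside each block of weights.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / GIB


def fits(params_billion: float,
         vram_gib: float = 4.0,
         bits_per_weight: float = 4.5,
         overhead_gib: float = 0.8) -> bool:
    """Return True if weights plus an assumed overhead fit in VRAM.

    overhead_gib is a hypothetical allowance for the KV cache and
    runtime buffers; tune it for your context length and backend.
    """
    return weight_gib(params_billion, bits_per_weight) + overhead_gib <= vram_gib


if __name__ == "__main__":
    for size in (1, 3, 7):
        need = weight_gib(size)
        verdict = "fits" if fits(size) else "does not fit"
        print(f"{size}B at ~4.5 bpw: {need:.2f} GiB weights, {verdict} in 4 GiB")
```

Under these assumptions a 3B model needs well under 2 GiB for weights and leaves room for context, while a 7B model's weights alone take most of the card before any overhead is added.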