RTX 3090
Wicked Fast Gemma 4 vs Qwen 3.6 on RTX 3090: 3.10x Tested
Same RTX 3090, same llama.cpp build, same bench. Gemma 4 26B-A4B Q4_K_XL: 128 tok/s mean. Qwen 3.6-27B Q4_K_M: 41 tok/s. 3.10x faster, firsthand.
DFlash vs MTP on RTX 3090: I Tested Both Locally
Firsthand head-to-head bench of DFlash + DDTree against MTP (PR #22673) on a single RTX 3090, same Qwen 3.6-27B target. Real numbers, both backends.
How to Fix Slow Qwen 3.6 27B on RTX 3090 (10-80 tok/s)
Qwen 3.6-27B at 12 tok/s on a 3090 when others report 35? The 8-step diagnostic checklist for offload, quants, templates, power limits, and backend choice.
How to Get 2.5x Faster Qwen on RTX 3090 (Free)
I built DFlash on my RTX 3090 and ran the full bench. Real 2.5x speedup on Qwen 3.5 and 3.6 — below the 3.43x README claim, still huge. Here's how.
Best Way to Run Qwen 3.6 35B MoE Locally: VRAM, Speed, Setup
Qwen 3.6-35B-A3B has 35B total params but only 3B active per token. Real tok/s on RTX 3090, 4090, 5070 Ti, dual 5060 Ti, and M3 Ultra. Quants and setup.
Best Way to Get 2x Token Output on RTX 3090: Qwen 3.6 + DFlash
Luce DFlash + DDTree pushes Qwen 3.6-27B Q4_K_M from 35 tok/s to 69 tok/s on a single RTX 3090. Real benchmarks, setup, and honest limits.
RTX 4090 vs Used RTX 3090 for Local AI: Which to Buy in 2026
Both have 24GB VRAM. One costs 2-3x more. RTX 4090 vs used RTX 3090 — real benchmarks, real prices, and who should buy which for local LLM inference and image generation.
RTX 3090 vs 4070 Ti Super for Local LLMs
Head-to-head comparison of the RTX 3090 and RTX 4070 Ti Super for running LLMs locally. Covers VRAM, speed, power, price, and which to buy for your use case.
Best Used GPUs for Local AI: 2026 Buying Guide
RTX 3090 at $700-850 for 24GB, RTX 3060 12GB at $170-220, RTX 3080 at $350-400. Tier rankings, fair prices, what to avoid (skip the 8GB 3070), and where to buy safely.
Used GPU Buying Guide for Local AI: How to Buy Smart
RTX 3060 12GB for ~$200, RTX 3090 24GB for ~$750. Used GPUs offer 2-3x the VRAM per dollar vs new. Fair prices, scam red flags, and where to buy safely.
What Can You Actually Run on 24GB VRAM?
Qwen 3.5 27B at Q4 fits in 17GB with 64K+ context. 70B at Q3 with limited context. Flux at full FP16. RTX 3090 at $700 vs 4090 at $1,800—every model that fits and which GPU to buy.
Used RTX 3090 Buying Guide for Local AI
24GB VRAM for $650-750—half the cost of an RTX 4090 with the same capacity. Fair prices, eBay red flags, PSU requirements (850W minimum), and how to test before your return window closes.