Three speculative-decoding backends benchmarked head to head on a single RTX 3090. The VRAM calculator finally caught up. And a 120-article audit found stale Qwen 2.5 recommendations.