ExLlamaV2 hits 57 tok/s where llama.cpp hits 31 on the same model and the same GPU. Benchmarks at every VRAM tier, an EXL2 vs GGUF quality comparison, and setup guides.