The best local models for retrieval-augmented generation by VRAM tier. Qwen 3, Command R 35B, embedding models, and RAG stacks with real failure modes.