llama.cpp won't build, or misbehaves at runtime? Covers CMake and CUDA build failures, Gemma 4 thinking mode, Qwen 3.6 kwargs, and num_ctx VRAM overflow, with specific fixes for each platform.