Real llama.cpp benchmarks across the RTX 5090, DGX Spark, and AMD AI395 with ROCm and Vulkan. Covers tokens-per-second throughput, VRAM usage, and which hardware wins for local AI.