DFlash
Best 24GB Backend Shootout: ik_llama vs BeeLlama vs llama.cpp
ik_llama and BeeLlama both finish in 22-23s on the am17an 9-prompt harness vs mainline llama.cpp's 37s: 1.66x and 1.62x speedups via opposite strategies.
DFlash vs MTP on RTX 3090: I Tested Both Locally
Firsthand head-to-head bench of DFlash + DDTree against MTP (PR #22673) on a single RTX 3090, same Qwen 3.6-27B target. Real numbers, both backends.
This Week in Local AI: I Built DFlash and Audited Lightning
I built DFlash from source on a real RTX 3090 and benched both Qwens. Then audited my stack after PyPI's `lightning` package shipped malware that abuses Claude Code hooks.
How to Get 2.5x Faster Qwen on RTX 3090 (Free)
I built DFlash on my RTX 3090 and ran the full bench. Real 2.5x speedup on Qwen 3.5 and 3.6, below the 3.43x README claim but still huge. Here's how.