Qwen 3.6
Wicked Fast Gemma 4 vs Qwen 3.6 on RTX 3090: 3.10x Tested
Same RTX 3090, same llama.cpp build, same bench. Gemma 4 26B-A4B Q4_K_XL: 128 tok/s mean. Qwen 3.6-27B Q4_K_M: 41 tok/s. Gemma 4 came out 3.10x faster in firsthand testing.
DFlash vs MTP on RTX 3090: I Tested Both Locally
Firsthand head-to-head bench of DFlash + DDTree against MTP (PR #22673) on a single RTX 3090, same Qwen 3.6-27B target. Real numbers, both backends.
How to Get 2.5x Faster Qwen on RTX 3090 (Free)
I built DFlash on my RTX 3090 and ran the full bench. A real 2.5x speedup on Qwen 3.5 and 3.6: below the 3.43x README claim, but still huge. Here's how.
This Week in Local AI — DeepSeek V4 Took #1 on Vibe Code
DeepSeek V4-Flash hit #1 on Vibe Code Benchmark. Qwen 3.6 dropped both variants. FP4 landed in llama.cpp. Anthropic admitted they quietly downgraded Claude Code on March 4.
Qwen 3.6 Complete Guide: 27B Dense, 35B-A3B MoE, and Which to Use
Qwen 3.6 landed in two open-weight flavors: 27B dense and 35B-A3B MoE. Benchmarks, hardware fit, and which variant to run on your GPU.
Qwen 3.5 Locally — 27B vs 35B-A3B vs 122B, Which Model Fits Your GPU
Qwen 3.5 and 3.6 on local hardware. 27B dense vs 35B-A3B MoE vs 122B compared. VRAM tables, community tok/s on RTX 3090, and which to pick for your card.
Best Local LLMs for Mac in 2026 — M1, M2, M3, M4 Tested
The best models to run on every Mac tier. Specific picks for 8GB M1 through 192GB M3 Ultra, with real tok/s numbers. Qwen 3.6, DeepSeek V4, MLX vs Ollama, updated April 2026.
Best Local Models for OpenClaw 2026: Qwen 3.6 + DeepSeek V4
Qwen 3.6-27B dense ties Sonnet 4.6 on agentic coding; 3.6-35B-A3B runs OpenClaw on 16GB VRAM. Plus DeepSeek V4-Flash, sampling tips, VRAM tiers.
Best Local Coding Models Ranked: Every VRAM Tier, Every Benchmark (2026)
The best local LLMs for coding in 2026, ranked by VRAM tier. Qwen 3.6-27B, 3.6-35B-A3B, DeepSeek V4-Flash, benchmarks, editor setup, and Claude Code alternatives.