Qwen

Qwen 3.7 Preview Scored 57 AAI: 27B/35B Open Weights Next
Qwen 3.7 Max scored 57 on Artificial Analysis (+5 over 3.6 Max), ranking #1 of 218 models. Open 27B/35B weights are coming. InsiderLLM will benchmark them within 24 hours of release.
May 20, 2026
This Week in Local AI — I Built DFlash and Audited Lightning
I built DFlash from source on a real RTX 3090 and benched both Qwens. Then audited my stack after PyPI's `lightning` package shipped malware that abuses Claude Code hooks.
May 3, 2026
How to Fix Slow Qwen 3.6 27B on RTX 3090 (10-80 tok/s)
Qwen 3.6-27B at 12 tok/s on a 3090 when others report 35? The 8-step diagnostic checklist for offload, quants, templates, power limits, and backend choice.
May 1, 2026
Best Way to Run Qwen 3.6 35B MoE Locally: VRAM, Speed, Setup
Qwen 3.6-35B-A3B has 35B total params but only 3B active per token. Real tok/s on RTX 3090, 4090, 5070 Ti, dual 5060 Ti, and M3 Ultra. Quants and setup.
Apr 28, 2026
Best Way to Get 2x Token Output on RTX 3090: Qwen 3.6 + DFlash
Luce DFlash + DDTree pushes Qwen 3.6-27B Q4_K_M from 35 tok/s to 69 tok/s on a single RTX 3090. Real benchmarks, setup, and honest limits.
Apr 27, 2026
Qwen 3.6 Complete Guide: 27B Dense, 35B-A3B MoE, and Which to Use
Qwen 3.6 landed in two open-weight flavors: 27B dense and 35B-A3B MoE. Benchmarks, hardware fit, and which variant to run on your GPU.
Apr 24, 2026
Best Way to Run 31B Models on a Laptop? Treat Them Like Databases
LARQL decompiles transformer weights into a queryable graph called a vindex. The project pitches a new shape for local inference: walk a subgraph, patch facts, stream from disk. Here's what's real, what's claimed, and what's still research.
Apr 21, 2026
Qwen's Architect Just Walked Out the Door
Junyang Lin, the technical lead and public face of Qwen, has left Alibaba. Two other senior team members left with him. What this means for the model family that runs on half the local AI setups in the world.
Mar 5, 2026
OpenClaw Model Combinations: What to Pair for Each Task
Stop running one model for everything in OpenClaw. Pair Qwen 2.5 Coder 32B for autocomplete, Qwen 3.5 27B for planning, and Qwen3-Coder-Next for agentic coding. Combos by VRAM tier.
Mar 5, 2026
Local AI for Small Business: Email, Invoicing, and Customer Support Without Monthly Subscriptions
A 5-person team spends $1,500-3,000/year on AI subscriptions. A $600 mini PC running Ollama replaces all of them. Here's the setup, the workflows, and the math.
Mar 4, 2026
Best Local Models for PI Agent: Qwen 3.6, Gemma 4 (2026 Setup)
PI Agent runs any model locally via Ollama. May 2026 picks: Qwen 3.6 27B / 35B-A3B MoE, Gemma 4 26B-A4B. Setup, model comparisons, honest limits.
Feb 28, 2026
Qwen 3.5 Locally — 27B vs 35B-A3B vs 122B, Which Model Fits Your GPU
Qwen 3.5 and 3.6 on local hardware. 27B dense vs 35B-A3B MoE vs 122B compared. VRAM tables, community tok/s on RTX 3090, and which to pick for your card.
Feb 26, 2026
Best Qwen 3.5 Setup: Which Model Fits Your GPU (Complete Cheat Sheet)
Pick the right Qwen 3.5 or 3.6 model for your hardware. Covers 0.8B through 397B with VRAM requirements, quant recommendations, and benchmarks for every GPU tier. Updated April 2026 with Qwen 3.6-35B-A3B coverage.
Feb 25, 2026
Qwen vs Llama vs Mistral: Which Model Family Should You Build On?
Qwen supports 201 languages and has a model for every task. Llama has the biggest community. Mistral pioneered efficient MoE. Decision framework for choosing your model family in 2026.
Feb 21, 2026
Running 70B Models Locally — Exact VRAM by Quantization
Llama 3.3 70B needs 43GB at Q4, 75GB at Q8, 141GB at FP16. Here's every quant level, which GPUs fit, real speeds, and when 32B is the smarter choice.
Feb 14, 2026
Local LLMs vs Claude: When Each Actually Wins
Qwen 3 32B matches Claude on daily tasks at zero marginal cost. Claude still wins on 200K-token documents and multi-step debugging. Benchmarks, pricing, and when to use each.
Feb 3, 2026
Best Qwen Models Ranked: Which to Run Locally
Complete Qwen models guide covering Qwen 3.5, Qwen 3, Qwen 2.5 Coder, and Qwen-VL. VRAM requirements, Ollama setup, Gated DeltaNet architecture, and benchmarks vs Llama and DeepSeek.
Feb 2, 2026
Best Local LLMs for Math & Reasoning: What Actually Works
The best local LLMs for math and reasoning tasks, ranked by VRAM tier. AIME and MATH benchmarks for DeepSeek R1, Qwen 3 thinking, and Phi-4-reasoning.
Feb 2, 2026