Edge-Ai

Qwen 3.5 Small Models: The 9B Beats Last-Gen 30B — Here's What Matters for Local AI
Alibaba's Qwen 3.5 drops 4 small models (0.8B to 9B) — all natively multimodal, 262K context, Apache 2.0. The 9B beats Qwen3-30B on reasoning and destroys GPT-5-Nano on vision. VRAM tables and what to run.
Mar 2, 2026
LiquidAI LFM2: The First Hybrid Model Built for Your Hardware
LFM2-24B-A2B runs at 112 tok/s on CPU with only 2.3B active params. Not a transformer. GGUF files from 13.5GB, Ollama and llama.cpp setup, and where it beats Qwen.
Feb 26, 2026
RWKV-7: Infinite Context, Zero KV Cache — The Local-First Architecture
RWKV-7 uses O(1) memory per token. Context length doesn't increase VRAM. At all. 16 tok/s on a Raspberry Pi. Here's why it matters for local AI and how to run it.
Feb 25, 2026