Multimodal
Gemma 4 Just Dropped: What Local AI Builders Need to Know
Google's Gemma 4 is here: dense and MoE variants, Apache 2.0, multimodal with vision and audio. Covers VRAM requirements, benchmarks, and how it compares to Qwen 3.5.
Qwen 3.5 Small Models: The 9B Beats Last-Gen 30B — Here's What Matters for Local AI
Alibaba's Qwen 3.5 drops 4 small models (0.8B to 9B) — all natively multimodal, 262K context, Apache 2.0. The 9B beats Qwen3-30B on reasoning and destroys GPT-5-Nano on vision. VRAM tables and what to run.
Best 8GB GPU Model: How to Set Up Qwen 3.5 9B (Step by Step)
Qwen 3.5 9B fits in 6.6GB and beats Qwen 3-class models 3x its size. Covers setup on Ollama and llama.cpp, a quant table, and where the 9B still fits in the May 2026 lineup.
DeepSeek V4: Everything We Know Before It Drops
DeepSeek V4 launches next week with native image and video generation, 1M context, and rumored 1T MoE params with only 32B active. Here's what local AI builders need to know and how to prepare.
Qwen2.5-VL Not Loading in LM Studio? Fix mmproj and Vision Errors
Fix every Qwen2.5-VL error in LM Studio: missing mmproj, 'model type not supported', no eye icon, vision crashes. Exact fixes with file paths.
Llama 4 Guide: Running Scout and Maverick Locally (2026)
Complete Llama 4 Scout (109B MoE) and Maverick guide for local AI. VRAM, Ollama and vLLM setup, hardware reality, and how it stacks against Qwen 3.6.
Run Qwen2.5-VL Vision in LM Studio (Setup)
Get Qwen2.5-VL running in LM Studio in 5 minutes. Covers the mmproj file most people miss, correct download links, and how to analyze images and PDFs locally.
Best Vision Models You Can Run Locally: Every Model, Every GPU Tier
Qwen 3.6 and Gemma 4 are the new local vision SOTA picks. Full VRAM table, Ollama commands, setup for every GPU from 4GB to 48GB+. Updated May 2026.