Gemma 4

Ollama's quiet Mac shift, the Qwen refresh, and the closed-weight drift
Ollama 0.30 quietly changed how Apple Silicon runs models, auto-routing safetensors to MLX and GGUF to llama.cpp Metal. The open workhorses are now Qwen 3.5 9B and Qwen 3.6 27B. Plus: Qwen's last two flagship models shipped closed.
Jun 8, 2026
Ollama 0.30.0: What's New, What's Faster, What Breaks on Upgrade
Ollama 0.30.0: llama.cpp integration, flash-attention default for Qwen/Gemma, broader model support. Firsthand upgrade notes, known issues to watch.
Jun 2, 2026
Power week in local AI: Mythos, MiroThinker, real Qwen 3.6 builds
Two researchers cracked Apple's flagship defense in a week. An open-source agent beat closed-source on real benchmarks. Multi-GPU stopped being theoretical.
May 18, 2026
Wicked Fast Gemma 4 vs Qwen 3.6 on RTX 3090: 3.10x Tested
Same RTX 3090, same llama.cpp build, same bench. Gemma 4 26B-A4B Q4_K_XL: 128 tok/s mean. Qwen 3.6-27B Q4_K_M: 41 tok/s mean. 3.10x the throughput, tested firsthand.
May 8, 2026
Your RTX 3090 Doesn't Send Policy Change Emails
Anthropic cuts OpenClaw from Claude subscriptions. Gemma 4's first week in review. 12 architecture patterns from the Claude Code leak, ranked for local AI.
Apr 6, 2026
Gemma 4 Just Dropped: What Local AI Builders Need to Know
Google's Gemma 4 is here: dense and MoE variants, Apache 2.0, multimodal with vision and audio. VRAM requirements, benchmarks, and how it compares to Qwen 3.5.
Apr 2, 2026
Best Local Models for PI Agent: Qwen 3.6, Gemma 4 (2026 Setup)
PI Agent runs any model locally via Ollama. May 2026 picks: Qwen 3.6 27B / 35B-A3B MoE, Gemma 4 26B-A4B. Setup, model comparisons, honest limits.
Feb 28, 2026
llama.cpp Build Errors: Common Fixes for Every Platform
llama.cpp won't build or runs wrong? CMake, CUDA, Gemma 4 thinking-mode, Qwen 3.6 kwargs, num_ctx VRAM overflow. Exact fixes for every platform.
Feb 18, 2026
Best Ways to Fix OpenClaw Tool Call Failures: 2026 Guide
Your OpenClaw agent silently fails, loops, or corrupts its session. Six debug paths plus May 2026 gotchas: Qwen 3.6 whitespace kwargs, Gemma 4 thinking mode.
Feb 14, 2026
Best Local LLMs for Function Calling: Qwen 3.6, Gemma 4
Function calling with local LLMs on Ollama and llama.cpp. Current lineup: Qwen 3.6, Gemma 4, DeepSeek V4. Common failures, agentic loop patterns. May 2026.
Feb 11, 2026
Best Uncensored Local LLMs by VRAM Tier (2026)
Qwen 3.6 abliterated, Gemma 4 Heretic, Dolphin 3.0 — the current uncensored picks by VRAM tier. The Llama 3.1 / Qwen 2.5 era is mostly superseded. HuggingFace repos for every pick.
Feb 10, 2026
Best Local LLMs for Structured Output: Qwen 3.6, Gemma 4
JSON schema, grammar constraints, and Outlines compared. Current model picks: Qwen 3.6, Gemma 4, DeepSeek V4. Common failures + working code. May 2026.
Feb 10, 2026
Best Ways to Manage Multiple Ollama Models: 2026 Workflows
Manage multiple Ollama models in 2026: disk cleanup, switching, tagging. Qwen 3.6, Gemma 4, DeepSeek V4 (cloud-only) — practical workflows.
Feb 8, 2026
Best Vision Models You Can Run Locally: Every Model, Every GPU Tier
Qwen 3.6 and Gemma 4 are the new local vision SOTA picks. Full VRAM table, Ollama commands, setup for every GPU from 4GB to 48GB+. Updated May 2026.
Feb 6, 2026