MLX

Ollama's quiet Mac shift, the Qwen refresh, and the closed-weight drift
Ollama 0.30 quietly changed how Apple Silicon runs models, auto-routing safetensors to MLX and GGUF to llama.cpp Metal. The open workhorses are now Qwen 3.5 9B and Qwen 3.6 27B. Plus: Qwen's last two flagship models shipped closed.
Jun 8, 2026
Ollama 0.30.0: What's New, What's Faster, What Breaks on Upgrade
Ollama 0.30.0: llama.cpp integration, flash attention on by default for Qwen/Gemma, broader model support. Firsthand upgrade notes and known issues to watch.
Jun 2, 2026
Apple Neural Engine for LLM Inference: What Actually Works
Apple Silicon has a dedicated Neural Engine that most LLM tools ignore. Here's what it can do for inference, what it can't, and whether ANE-based tools like ANEMLL are worth trying today.
Mar 5, 2026
Best Apple M5 Pro and Max for Local AI (2026)
M5 Pro at 307GB/s, M5 Max at 614GB/s, up to 128GB unified memory. What works, what doesn't, and the May 2026 picks for Qwen 3.6 and Llama 3.3 70B.
Mar 3, 2026
Stable Diffusion on Mac: Image Generation with MLX and Draw Things
Draw Things generates SD 1.5 images in 8-15 seconds on an M2 Pro. ComfyUI takes 3x longer. MLX is fastest but code-only. Complete Mac image generation guide with speed tests.
Feb 26, 2026
LM Studio vs Ollama on Mac: Which Should You Use?
LM Studio's MLX backend is 20-30% faster and uses half the memory. Ollama is lighter, always-on, and better for APIs. Mac-specific benchmarks and when to use each.
Feb 26, 2026
Fine-Tuning on Mac: LoRA & QLoRA with MLX
Fine-tune Llama, Qwen, and Mistral on Apple Silicon using mlx-lm. Real memory numbers, step-by-step commands, and how to deploy your model with Ollama.
Feb 26, 2026
Best Way to Run Qwen 3.5 on Mac: MLX vs Ollama Speed Test
MLX runs Qwen 3.5 up to 2x faster than Ollama on Apple Silicon. Head-to-head benchmarks on M1 through M4, with setup instructions for both.
Feb 26, 2026
Best Local LLMs for Mac in 2026 — M1 through M5 Tested
The best models to run on every Mac tier. Specific picks for 8GB M1 through 192GB M3 Ultra, with real tok/s numbers. Qwen 3.6, Llama 4 Scout, DeepSeek V4, MLX vs Ollama, updated May 2026.
Feb 5, 2026
Running LLMs on Mac M-Series: Setup, Tools, Troubleshooting
Foundational how-to for Apple Silicon local AI: unified memory, MLX vs Ollama vs llama.cpp Metal, verification, and the headless Mac Mini AI server.
Feb 3, 2026