Mac
What Can You Run on 8GB Apple Silicon? Local AI on a Budget Mac
Llama 3.2 3B runs at 30 tok/s. Phi-4 Mini fits with room to spare. 7B models technically load but swap to disk. Honest benchmarks and real limits for 8GB M1/M2/M3/M4 Macs.
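The fits-or-swaps line in that article comes down to simple arithmetic. A minimal sketch of the sizing logic, assuming roughly 0.62 bytes per parameter for Q4-quantized weights and about 5 GB of an 8GB Mac left usable by macOS (both rule-of-thumb figures, not measurements):

```python
# Rough fit check for an 8GB Mac. The bytes/param figure and the
# usable-RAM estimate below are assumptions, not measurements.
def q4_size_gb(params_billion: float) -> float:
    """Approximate in-memory size of a 4-bit quantized model, in GB."""
    return params_billion * 0.62  # ~0.62 bytes per parameter at Q4

USABLE_GB = 5.0   # what an 8GB Mac realistically has free for a model
RUNTIME_GB = 1.5  # KV cache, context buffers, runtime overhead

for name, size_b in [("Llama 3.2 3B", 3.2), ("Phi-4 Mini", 3.8), ("7B class", 7.0)]:
    need = q4_size_gb(size_b) + RUNTIME_GB
    verdict = "fits" if need <= USABLE_GB else "will swap to disk"
    print(f"{name}: ~{need:.1f} GB needed -> {verdict}")
```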
Stable Diffusion on Mac: Image Generation with MLX and Draw Things
Draw Things generates SD 1.5 images in 8-15 seconds on an M2 Pro. ComfyUI takes 3x longer. MLX is fastest but code-only. Complete Mac image gen guide with speed tests.
Ollama on Mac: Setup and Optimization Guide (2026)
Install Ollama on Apple Silicon, verify Metal GPU is active, and tune it for your Mac's RAM. Config for M1 through M3 Ultra with model picks per memory tier.
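For a programmatic version of the Metal check, Ollama exposes the data behind `ollama ps` as JSON via its documented /api/ps endpoint. A quick sketch:

```python
# Check how much of the loaded model sits in GPU (Metal) memory.
# /api/ps is the JSON endpoint behind `ollama ps`; size_vram == size
# means the model is fully GPU-resident.
import requests

resp = requests.get("http://localhost:11434/api/ps", timeout=5)
resp.raise_for_status()
for m in resp.json().get("models", []):
    pct = 100 * m["size_vram"] / m["size"] if m["size"] else 0
    print(f"{m['name']}: {pct:.0f}% in GPU memory")
```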
Ollama on Mac Not Working? Fix Metal, Memory Pressure, and Slow Performance
ollama ps says CPU? Generation crawling at 2 tok/s? macOS killed your model mid-sentence? Every Mac-specific Ollama problem diagnosed and fixed with exact commands.
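To put a number on "crawling," you can read generation speed straight from Ollama's response metadata rather than eyeballing it. A sketch using the /api/generate endpoint, which reports eval_duration in nanoseconds:

```python
# Measure generation speed from /api/generate metadata.
import requests

r = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3.2", "prompt": "Explain unified memory briefly.",
          "stream": False},
    timeout=300,
)
r.raise_for_status()
d = r.json()
tok_s = d["eval_count"] / (d["eval_duration"] / 1e9)  # eval_duration is ns
print(f"{tok_s:.1f} tok/s")  # single digits on Apple Silicon suggests CPU fallback
```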
Mac Studio for Local AI: Is It Worth the Price?
Mac Studio M4 Max (128GB) and M3 Ultra (up to 512GB) tested for local LLMs. Real tok/s numbers, cost comparison vs dual RTX 3090, and who should actually buy one.
LM Studio vs Ollama on Mac: Which Should You Use?
LM Studio's MLX backend is 20-30% faster and uses half the memory. Ollama is lighter, always-on, and better for APIs. Mac-specific benchmarks and when to use each.
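One practical upshot of that comparison: both tools expose OpenAI-compatible servers (LM Studio on port 1234, Ollama on 11434 by default), so switching backends is a one-line change. A sketch with the openai Python client; the model name is whatever your backend has loaded:

```python
# The same client code targets either backend; only base_url changes.
from openai import OpenAI

BACKENDS = {
    "lmstudio": "http://localhost:1234/v1",  # LM Studio's default server port
    "ollama": "http://localhost:11434/v1",   # Ollama's OpenAI-compatible route
}
client = OpenAI(base_url=BACKENDS["ollama"], api_key="unused")  # local servers ignore the key

reply = client.chat.completions.create(
    model="llama3.2",  # use the model name your backend shows
    messages=[{"role": "user", "content": "One sentence on MLX vs llama.cpp."}],
)
print(reply.choices[0].message.content)
```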
Fine-Tuning on Mac: LoRA & QLoRA with MLX
Fine-tune Llama, Qwen, and Mistral on Apple Silicon using mlx-lm. Real memory numbers, step-by-step commands, and how to deploy your model with Ollama.
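Once an adapter is trained, mlx-lm's Python API loads it on top of the base model. A minimal sketch; the base repo and adapter directory here are placeholders for your own:

```python
# Load a base model plus LoRA adapters with mlx-lm and generate.
from mlx_lm import load, generate

model, tokenizer = load(
    "mlx-community/Llama-3.2-3B-Instruct-4bit",  # example base; swap in yours
    adapter_path="./adapters",                   # where mlx_lm.lora wrote weights
)
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Summarize LoRA in two sentences."}],
    add_generation_prompt=True,
    tokenize=False,
)
print(generate(model, tokenizer, prompt=prompt, max_tokens=128))
```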
Mac Mini M4 for Local AI: Which Config to Buy and What It Actually Runs
Mac Mini M4 Pro 48GB runs Qwen3-32B at 15-22 tok/s, draws 40W under load, and costs $25/year in electricity. Which config to buy and what each runs.
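The electricity figure is easy to sanity-check with back-of-envelope math. In this sketch, the duty cycle and the $0.17/kWh rate are assumptions; plug in your own numbers:

```python
# Annual electricity cost for an always-on Mac Mini AI server.
IDLE_W, LOAD_W = 5, 40  # rough Apple Silicon Mini power draw
LOAD_HOURS = 8          # assumed hours/day spent generating
RATE = 0.17             # assumed $/kWh

daily_kwh = (LOAD_W * LOAD_HOURS + IDLE_W * (24 - LOAD_HOURS)) / 1000
print(f"~${daily_kwh * 365 * RATE:.0f}/year")  # ~$25 with these assumptions
```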
Best Local LLMs for Mac in 2026 — M1, M2, M3, M4 Tested
The best models to run on every Mac tier. Specific picks for 8GB M1 through 128GB M4 Max, with real tok/s numbers. MLX vs Ollama vs LM Studio compared.
Running LLMs on Mac: M1 Through M4 Guide
An M1 with 8GB squeezes in quantized 7B models. An M4 Max with 128GB loads 70B+ models that would need multiple GPUs on a PC. Unified memory sizing, MLX vs Ollama speeds, and Mac Mini as an always-on AI server.