Software
Text Generation WebUI (Oobabooga) Guide
A practical setup guide: install text-generation-webui in 10 minutes, then configure GPU offloading, GGUF/GPTQ/EXL2 model loading, extensions, and the settings most guides skip.
LM Studio Tips & Tricks: Hidden Features
Speculative decoding for 20-50% faster output, MLX that's 21-87% faster on Mac, a built-in OpenAI-compatible API, and the GPU offload settings most users miss.
Ollama vs LM Studio: Which Should You Use for Local AI?
Ollama gives you a CLI with 100+ models and an OpenAI-compatible API. LM Studio gives you a GUI with one-click model downloads. Most power users run both; here's when to use each.