Hardware
Used Server GPUs for Local AI: Tesla P40, V100, A100, and the eBay Goldmine
A Tesla P40 has 24GB VRAM for $175. A V100 has 32GB for $350. Server GPUs offer insane VRAM per dollar for local AI — if you can handle the quirks. Full breakdown with prices, benchmarks, and cooling fixes.
Mac Studio for Local AI: M4 Max vs M3 Ultra, Every Config Ranked
Mac Studio M4 Max 128GB runs Llama 70B at 11 tok/s, draws 80W, fits on a shelf. M3 Ultra 512GB runs DeepSeek-V3 671B. Here's which config to buy and what it actually runs.
Intel Arc GPUs for Local AI: The Underdog Option That Actually Works
The Arc A770 16GB gives you 16GB of VRAM for ~$250 used. Software support through IPEX-LLM and llama.cpp SYCL is real but rough. Honest benchmarks, what works, and what doesn't.
Used Tesla P40 for Local AI: The $200 Budget Beast
24GB VRAM for $150-$200 on eBay. Pascal architecture, no display output, passive cooling. Full benchmarks, setup guide, and honest comparison to the RTX 3060 and 3090.
RTX 5090 for Local AI: Worth the Upgrade?
32GB GDDR7, 1,792 GB/s bandwidth, 67% faster than 4090 — but $3,500+ street price. Full benchmarks, value analysis, and who should actually buy one.
RTX 4090 vs Used RTX 3090 for Local AI: Which to Buy in 2026
Both have 24GB VRAM. One costs 2-3x more. RTX 4090 vs used RTX 3090 — real benchmarks, real prices, and who should buy which for local LLM inference and image generation.
M4 Max and M3 Ultra for Local LLMs: Apple Silicon in 2026
No M4 Ultra exists. Apple's Mac Studio lineup offers the M4 Max (128GB, 546 GB/s) alongside the M3 Ultra (up to 512GB, 819 GB/s). Real benchmarks, pricing, and who should buy which for local AI.
Best Mini PCs for Local AI Under $300 in 2026
A $200 refurbished ThinkCentre runs 7B models at 5-8 tok/s. A $350 AMD Ryzen box hits 10-15 tok/s. Specific picks, real benchmarks, and what's worth buying.
Mac Mini M4 for Local AI: Which Config to Buy and What It Actually Runs
Mac Mini M4 Pro 48GB runs Qwen3-32B at 15-22 tok/s, draws 40W under load, and costs $25/year in electricity. Which config to buy and what each runs.
Running 70B Models Locally — Exact VRAM by Quantization
Llama 3.3 70B needs 43GB at Q4, 75GB at Q8, 141GB at FP16. Here's every quant level, which GPUs fit, real speeds, and when 32B is the smarter choice.
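The teaser's numbers follow from a simple rule of thumb: weight memory ≈ parameter count × bits per weight ÷ 8. A quick sketch (the 70.6B parameter count and the bits-per-weight figures for the Q4_K_M and Q8_0 GGUF quants are assumptions, and KV cache and runtime overhead are ignored):

```python
# Rough VRAM estimate for model weights: params * bits-per-weight / 8 bytes.
# Bits-per-weight values are approximate GGUF figures (assumed); the estimate
# ignores KV cache, activations, and runtime overhead.
QUANTS = {"Q4_K_M": 4.85, "Q8_0": 8.5, "FP16": 16.0}

def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

for name, bpw in QUANTS.items():
    print(f"Llama 3.3 70B @ {name}: {weight_gb(70.6, bpw):.0f} GB")
# → Q4_K_M: 43 GB, Q8_0: 75 GB, FP16: 141 GB
```

Plugging in 70.6B parameters reproduces the 43/75/141 GB figures above; real usage adds a few GB for context.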
Rescued Hardware, Rescued Bees — Building Tech From What Others Throw Away
A beekeeper who rescues wild colonies from demolition sites builds an AI lab from discarded hardware. The philosophy connecting East Bay Bees, Tai Chi, and mycoSwarm.
Free Local AI vs Paid Cloud APIs: Real Cost Comparison
An RTX 3090 pays for itself in 2 weeks of moderate API usage. Full break-even math for local vs OpenAI, Anthropic, and Google APIs with current 2026 pricing.
RTX 3060 vs 3060 Ti vs 3070 for Local AI
The RTX 3060 has 12GB VRAM, the 3060 Ti and 3070 only have 8GB. For LLMs, the cheapest card wins — it runs 14B models the others can't fit. Speeds, prices, and when the 3070 still makes sense.
Multi-GPU Setups for Local AI: Worth It?
Dual RTX 3090s cost $1,600+ and need a 1,200W PSU — but a single 3090 at $800 runs every model under 32B. When two GPUs actually beat one bigger card, and when they don't.
Razer AIKit Guide: Multi-GPU Local AI on Your Desktop
Open-source Docker stack bundling vLLM, Ray, LlamaFactory, and Grafana into a single container. Auto-detects GPUs, supports 280K+ HuggingFace models, and handles multi-GPU parallelism.
Multi-GPU Local AI: Run Models Across Multiple GPUs
Dual RTX 3090s give you 48GB VRAM and run 70B models at 16-21 tok/s—vs 1 tok/s with CPU offloading. Tensor vs pipeline parallelism, setup guides, and real scaling numbers.
GB10 Boxes Compared: DGX Spark vs Dell vs ASUS vs MSI
DGX Spark, Dell Pro Max, ASUS Ascent GX10, and MSI EdgeXpert compared with real benchmarks, 45-minute thermal tests, and pricing. Same chip, different chassis.
Best Local LLMs for Mac in 2026 — M1, M2, M3, M4 Tested
The best models to run on every Mac tier. Specific picks for 8GB M1 through 128GB M4 Max, with real tok/s numbers. MLX vs Ollama vs LM Studio compared.
RTX 3090 vs 4070 Ti Super for Local LLMs
Head-to-head comparison of the RTX 3090 and RTX 4070 Ti Super for running LLMs locally. Covers VRAM, speed, power, price, and which to buy for your use case.
How Much Does It Cost to Run LLMs Locally?
$200-800 for hardware, $5-15/month in electricity, and a 3-6 month breakeven vs ChatGPT Plus at $240/year. Full cost breakdown with real numbers.
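The break-even claim is simple division: months to pay off = hardware cost ÷ (monthly cloud spend − monthly electricity). A minimal sketch with placeholder figures (the $400 GPU, $100/month API spend, and $10/month electricity are illustrative assumptions, not the article's exact numbers):

```python
def breakeven_months(hardware_cost: float,
                     cloud_monthly: float,
                     electricity_monthly: float) -> float:
    """Months until local hardware pays for itself vs cloud spend."""
    monthly_savings = cloud_monthly - electricity_monthly
    if monthly_savings <= 0:
        return float("inf")  # local never breaks even at this usage level
    return hardware_cost / monthly_savings

# Assumed figures: $400 used GPU, $100/mo in API usage, $10/mo electricity.
print(f"{breakeven_months(400, 100, 10):.1f} months")  # → 4.4 months
```

At light usage (a $20/month subscription), the same formula stretches break-even past a year, which is why the payoff window depends so heavily on how much cloud spend you are actually replacing.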
Best Used GPUs for Local AI: 2026 Buying Guide
RTX 3090 at $700-850 for 24GB, RTX 3060 12GB at $170-220, RTX 3080 at $350-400. Tier rankings, fair prices, what to avoid (skip the 8GB 3070), and where to buy safely.
Best GPU Under $500 for Local AI (2026 Picks)
Find the best GPU under $500 for running local AI in 2026. RTX 4060 Ti 16GB, used RTX 3080, RTX 3060 12GB, and RX 7700 XT compared with real benchmarks.
Best GPU Under $300 for Local AI (2026 Picks)
Find the best GPU under $300 for local AI. We compare the RTX 3060 12GB, RX 7600, and Intel Arc B580 with VRAM analysis, LLM benchmarks, and real pricing.
Running LLMs on Mac: M1 Through M4 Guide
M1 with 8GB runs 7B models. M4 Max with 128GB loads 70B+ models that would need multiple GPUs on a PC. Unified memory sizing, MLX vs Ollama speeds, and the Mac Mini as an always-on AI server.
Laptop vs Desktop for Local AI: Which Should You Buy?
A $750 used RTX 3090 in a desktop gives you 24GB of VRAM; the same money in a gaming laptop gets you 8GB. MacBooks break the rules with 48GB+ unified memory for 70B models.
What Can You Actually Run on 4GB VRAM?
1B-3B models run at 18-55 tok/s. Qwen 2.5 3B at Q4 is the sweet spot for chat and simple coding. 7B models don't fit. What works on GTX 1050 Ti and 1650, and when to upgrade.
What Can You Actually Run on 16GB VRAM?
13B-14B models hit 22-53 tok/s at Q4-Q6, Flux runs at FP8, and 20B models squeeze in with short context. Where 16GB beats 12GB, where it trails 24GB, and the best cards at this tier.
Used GPU Buying Guide for Local AI: How to Buy Smart
RTX 3060 12GB for ~$200, RTX 3090 24GB for ~$750—used GPUs offer 2-3x the VRAM per dollar vs new. Fair prices, scam red flags, and where to buy safely.
Mac vs PC for Local AI: Which Should You Choose?
RTX 3090 runs 7B-14B models 2-3x faster than M4 Pro. M4 Max with 128GB loads 70B models a PC can't touch. Real benchmarks, prices, and which platform fits your use case.
What Can You Actually Run on 24GB VRAM?
32B models at 25-38 tok/s, 70B at Q3 with limited context, Flux at full FP16, and LoRA fine-tuning. RTX 3090 at $700 vs 4090 at $1,800—every model that fits and which GPU to buy.
CPU-Only LLMs: What Actually Works
What actually works without a GPU: best model picks, real speed benchmarks, and a budget dual-Xeon server build for 70B models.
What Can You Actually Run on 8GB VRAM?
7B-8B models hit 35-42 tok/s at Q4, SD 1.5 runs great, SDXL is tight but doable. Nothing above 13B fits. Every model that works on RTX 4060 and 3060 Ti, plus the best upgrade path.
What Can You Actually Run on 12GB VRAM?
14B models at Q4 hit 25-32 tok/s, 7B-8B run at near-lossless Q6-Q8, and SDXL generates without workarounds. Every model that fits on an RTX 3060 12GB and the best upgrade path.
RTX 5060 Ti 16GB Killed? Local AI Alternatives
The RTX 5060 Ti 16GB faces production cuts from GDDR7 shortages. What's actually happening, and the best alternative GPUs for local AI in 2026.
Used RTX 3090 Buying Guide for Local AI
24GB VRAM for $650-750—half the cost of an RTX 4090 with the same capacity. Fair prices, eBay red flags, PSU requirements (850W minimum), and how to test before your return window closes.
NVIDIA GPU Prices Are Rising: What to Do Now
GPU prices are spiking due to GDDR7 shortages and AI datacenter demand. Here's what's happening, which cards are affected, and strategies for local AI builders.
How Much VRAM Do You Need for Local LLMs?
3B models need 2GB. 7B needs 5GB. 70B needs 40GB. Exact VRAM requirements for every model size at Q4 through FP16, plus which GPU to buy at every budget.
Build a Local AI PC for Under $500
A used Dell Optiplex + RTX 3060 12GB runs 14B LLMs and Stable Diffusion for under $450. Full parts list, real speed benchmarks, and what to skip to save money.
AMD vs NVIDIA for Local AI: Is ROCm Finally Ready?
RX 7900 XTX delivers 85-95% of RTX 4090 performance with 24GB VRAM at $700-950. ROCm 6.x finally works on Linux. Honest benchmarks and the real compatibility gaps.
GPU Buying Guide for Local AI: Pick the Right Card
The complete GPU buying guide for local AI. Covers RTX 3060 through 4090 with VRAM analysis, performance benchmarks, prices, and used vs new buying advice.