Hardware
ROCm vs CUDA for Local AI in 2026: The Software Gap Nobody Talks About
AMD GPUs have the bandwidth. They have the VRAM. They still lose by 2x on inference speed. Here's why, what actually works on ROCm 7.2, and whether RDNA 4 fixes anything.
The 8GB VRAM Trap: What 'Runs on 8GB' Actually Means
Every local AI tutorial says 'runs on 8GB!' — and technically it does. What they don't tell you about quantization cliffs, tiny context windows, and why a $275 used GPU changes everything.
Intel Arc GPUs for Local AI: The Underdog Option That Actually Works
The Arc A770 delivers 16GB of VRAM for ~$250 used. Software support through IPEX-LLM and llama.cpp SYCL is real but rough. Honest benchmarks, what works, and what doesn't.
Used Tesla P40 for Local AI: The $200 Budget Beast
24GB VRAM for $150-$200 on eBay. Pascal architecture, no display output, passive cooling. Full benchmarks, setup guide, and honest comparison to the RTX 3060 and 3090.
RTX 5090 for Local AI: Worth the Upgrade?
32GB GDDR7, 1,792 GB/s bandwidth, 67% faster than the 4090 — but a $3,500+ street price. Full benchmarks, value analysis, and who should actually buy one.
RTX 4090 vs Used RTX 3090 for Local AI: Which to Buy in 2026
Both have 24GB VRAM. One costs 2-3x more. RTX 4090 vs used RTX 3090 — real benchmarks, real prices, and who should buy which for local LLM inference and image generation.
M4 Max and M3 Ultra for Local LLMs: Apple Silicon in 2026
No M4 Ultra exists. Apple's Mac Studio lineup offers either the M4 Max (128GB, 546 GB/s) or the M3 Ultra (192GB, 800 GB/s). Real benchmarks, pricing, and who should buy which for local AI.
Best Mini PCs for Local AI Under $300 in 2026
A $200 refurbished ThinkCentre runs 7B models at 5-8 tok/s; stretch the budget to a $350 AMD Ryzen box and you hit 10-15 tok/s. Specific picks, real benchmarks, and what's worth buying.
Mac Mini M4 for Local AI: Which Config to Buy and What It Actually Runs
Mac Mini M4 Pro 48GB runs Qwen3-32B at 15-22 tok/s, draws 40W under load, and costs $25/year in electricity. Which config to buy and what each runs.
Distributed Wisdom: Running a Thinking Network on $200 Hardware
Five nodes, zero cloud, real AI — how mycoSwarm coordinates cheap hardware into a cognitive system with memory, intent routing, and self-correcting retrieval.
Running 70B Models Locally — Exact VRAM by Quantization
Llama 3.3 70B needs 43GB at Q4, 75GB at Q8, 141GB at FP16. Here's every quant level, which GPUs fit, real speeds, and when 32B is the smarter choice.
Best Hardware for Running OpenClaw — Mac Mini vs VPS vs Your Old PC
OpenClaw runs 24/7. A Mac Mini M4 draws 4 watts idle. An Oracle free-tier VPS costs nothing. A used ThinkCentre costs $85. Here's which one to pick.
Rescued Hardware, Rescued Bees — Building Tech From What Others Throw Away
A beekeeper who rescues wild colonies from demolition sites builds an AI lab from discarded hardware. The philosophy connecting East Bay Bees, Tai Chi, and mycoSwarm.
Building a Distributed AI Swarm for Under $1,100
A complete bill of materials for a three-node distributed AI cluster: RTX 3090 workstation, ThinkCentre M710Q for light inference, Raspberry Pi 5 coordinator. Every part sourced used or cheap, total cost under $1,100.
RTX 3060 vs 3060 Ti vs 3070 for Local AI
The RTX 3060 has 12GB VRAM, the 3060 Ti and 3070 only have 8GB. For LLMs, the cheapest card wins — it runs 14B models the others can't fit. Speeds, prices, and when the 3070 still makes sense.
Mixtral 8x7B & 8x22B VRAM Requirements
Mixtral 8x7B and 8x22B VRAM requirements at every quantization level. Exact numbers from Q2 to FP16, GPU recommendations, and KV cache impact explained.
RTX 3090 vs 4070 Ti Super for Local LLMs
Head-to-head comparison of the RTX 3090 and RTX 4070 Ti Super for running LLMs locally. Covers VRAM, speed, power, price, and which to buy for your use case.
How Much Does It Cost to Run LLMs Locally?
$200-800 for hardware, $5-15/month in electricity, and a 3-6 month breakeven vs ChatGPT Plus at $240/year. Full cost breakdown with real numbers.
Best GPU Under $500 for Local AI (2026 Picks)
Find the best GPU under $500 for running local AI in 2026. RTX 4060 Ti 16GB, used RTX 3080, RTX 3060 12GB, and RX 7700 XT compared with real benchmarks.
Best GPU Under $300 for Local AI (2026 Picks)
Find the best GPU under $300 for local AI. We compare the RTX 3060 12GB, RX 7600, and Intel Arc B580 with VRAM analysis, LLM benchmarks, and real pricing.
Laptop vs Desktop for Local AI: Which Should You Buy?
A $750 used RTX 3090 in a desktop gives you 24GB of VRAM; the same money in a gaming laptop gets you 8GB. MacBooks break the rules with 48GB+ unified memory for 70B models.
Mac vs PC for Local AI: Which Should You Choose?
RTX 3090 runs 7B-14B models 2-3x faster than M4 Pro. M4 Max with 128GB loads 70B models a PC can't touch. Real benchmarks, prices, and which platform fits your use case.
Used RTX 3090 Buying Guide for Local AI
24GB VRAM for $650-$750 — half the cost of an RTX 4090 with the same capacity. Fair prices, eBay red flags, PSU requirements (850W minimum), and how to test before your return window closes.
Used Optiplex + RTX 3060 = Local AI for Under $450 (Full Build)
$100 used Optiplex, $180 RTX 3060 12GB, done. Runs 14B LLMs at 25 tokens/sec and Stable Diffusion out of the box. Complete parts list, where to buy cheap, assembly photos, and first benchmarks.
Best VRAM Cheat Sheet for Local LLMs: Every Model, Every Quant
Exact VRAM for Qwen 3.5, Llama, Mistral, and DeepSeek at Q3 through FP16. Lookup tables for 7B, 9B, 13B, 27B, 32B, 70B, and 120B models with real measurements and GPU recommendations. Updated March 2026.
AMD vs NVIDIA for Local AI: Is ROCm Finally Ready?
RX 7900 XTX delivers 85-95% of RTX 4090 performance with 24GB VRAM at $700-950. ROCm 6.x finally works on Linux. Honest benchmarks and the real compatibility gaps.
GPU Buying Guide for Local AI: Pick the Right Card
The complete GPU buying guide for local AI. Covers RTX 3060 through 4090 with VRAM analysis, performance benchmarks, prices, and used vs new buying advice.