LLM Tools
nanollama: Train Your Own Llama 3 From Scratch on Custom Data
Pretrain Llama 3-architecture models from raw text, export them to GGUF, and run them with llama.cpp. Forked from Karpathy's nanochat. 46M to 7B parameters.
Crane + Qwen3-TTS: Run Voice Cloning Locally with Rust
Clone any voice from 3 seconds of audio using Qwen3-TTS through Crane's pure-Rust inference engine. ~4GB VRAM, faster than real-time, Apache 2.0.
Best Local Alternatives to Claude Code in 2026
Aider, Continue.dev, Cline, OpenCode, Void, and Tabby compared. Which open-source coding tools work best with local models on your own GPU?
SmarterRouter: A VRAM-Aware LLM Gateway for Your Local AI Lab
Intelligent router that profiles your models, manages VRAM, caches responses semantically, and auto-picks the best model per prompt. Works with Ollama and llama.cpp.
LocalAgent: A Local-First Agent Runtime That Actually Cares About Safety
Rust CLI for AI agents with deny-by-default permissions, approval workflows, and deterministic replay. Works with LM Studio, Ollama, and llama.cpp.