Pretrain Llama 3 architecture models from raw text, export them to GGUF, and run them with llama.cpp. Forked from Karpathy's nanochat. Model sizes range from 46M to 7B parameters.
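
To get a feel for how a Llama-style configuration maps to a parameter count in that range, here is a minimal sketch. It assumes the standard Llama layout (grouped-query attention, a three-matrix SwiGLU MLP, two RMSNorm weights per block, no biases). `LlamaConfig` and `count_params` are hypothetical names used for illustration only, not this project's API, and the example values are the publicly documented Llama 3 8B shape rather than one of this project's presets.

```python
from dataclasses import dataclass


@dataclass
class LlamaConfig:
    # Hypothetical config container; field names are illustrative only.
    vocab_size: int
    dim: int              # model (hidden) width
    n_layers: int
    n_heads: int          # query heads
    n_kv_heads: int       # key/value heads (grouped-query attention)
    ffn_dim: int          # SwiGLU inner width
    tie_embeddings: bool = False


def count_params(cfg: LlamaConfig) -> int:
    """Estimate the parameter count of a Llama-style decoder (no biases)."""
    head_dim = cfg.dim // cfg.n_heads
    kv_dim = head_dim * cfg.n_kv_heads

    # Attention: Wq and Wo are dim x dim; Wk and Wv are dim x kv_dim.
    attn = 2 * cfg.dim * cfg.dim + 2 * cfg.dim * kv_dim
    # SwiGLU MLP: gate, up, and down projections.
    mlp = 3 * cfg.dim * cfg.ffn_dim
    # Two RMSNorm weight vectors per block.
    norms = 2 * cfg.dim
    per_layer = attn + mlp + norms

    embed = cfg.vocab_size * cfg.dim
    # The output head is a separate matrix unless embeddings are tied.
    head = 0 if cfg.tie_embeddings else embed
    final_norm = cfg.dim

    return embed + cfg.n_layers * per_layer + final_norm + head


# Publicly documented Llama 3 8B shape, used here only as a sanity check.
llama3_8b = LlamaConfig(
    vocab_size=128256, dim=4096, n_layers=32,
    n_heads=32, n_kv_heads=8, ffn_dim=14336,
)
print(f"{count_params(llama3_8b):,}")  # 8,030,261,248
```

At small scales the embedding table dominates the total, so the vocabulary size and whether the embeddings are tied matter far more for a 46M model than for a 7B one.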