LLM Journey

Building LLMs From Scratch

From 150 MB to 12 GB datasets. GPT-2 Small trained on 2.8B tokens. 134M parameters. 7x speedup from optimization. Every lesson documented.
GPUburnout
Will Code for Tokens
134M Params
2.8B Tokens
7x Speedup