"Large model": Train a small 26M-parameter GPT from scratch in just 3 hours, with both training and inference runnable on a personal GPU!