Staged Training for Transformer Language Models