Official implementation of the paper: "ZClip: Adaptive Spike Mitigation for LLM Pre-Training".