Sparse Backpropagation for Mixture-of-Expert Training
Stars: 24
Rank: 714889