PyTorch implementation of "Oscillation-Reduced MXFP4 Training for Vision Transformers" on DeiT model pre-training.
Stars: 36
Rank: 625886