PyTorch implementation of "Oscillation-Reduced MXFP4 Training for Vision Transformers" for DeiT model pre-training
Stars: 33
Rank: 658552