A human-readable PyTorch implementation of "Self-attention Does Not Need O(n^2) Memory" (Rabe & Staats, 2021).
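The core idea of the paper can be illustrated for a single query: instead of materializing all attention scores and then applying softmax, keys and values are consumed one at a time while a running maximum, a running normalizer, and a running weighted sum are kept. The sketch below is not taken from this repository. It uses plain Python lists instead of PyTorch tensors so it runs without dependencies, it omits the usual 1/sqrt(d) score scaling, and the function names `attention_single_query` and `attention_reference` are hypothetical, chosen only for illustration.

```python
import math


def attention_single_query(q, keys, values):
    """Softmax attention for one query, visiting each key/value pair once.

    Only a running max, a running normalizer, and a running weighted sum
    are stored, so extra memory does not grow with the number of keys.
    """
    running_max = float("-inf")   # largest score seen so far
    denom = 0.0                   # sum of exp(score - running_max)
    acc = [0.0] * len(values[0])  # sum of exp(score - running_max) * value

    for k, v in zip(keys, values):
        score = sum(qi * ki for qi, ki in zip(q, k))
        new_max = max(running_max, score)
        # Rescale what was accumulated so far to the new reference maximum.
        scale = math.exp(running_max - new_max)
        weight = math.exp(score - new_max)
        denom = denom * scale + weight
        acc = [a * scale + weight * vi for a, vi in zip(acc, v)]
        running_max = new_max

    return [a / denom for a in acc]


def attention_reference(q, keys, values):
    """Standard attention for one query: all scores are stored first."""
    scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) / total
            for j in range(dim)]
```

Both functions should agree up to floating-point rounding. A full implementation would process queries and keys in chunks of tensors rather than one scalar score at a time, but the rescaling step is the same.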