Implementation of Long-Short Transformer, which combines local and global inductive biases for attention over long sequences, in PyTorch
Stars: 118
Rank: 224065