Transformer based on an attention variant whose complexity is linear in sequence length - View it on GitHub
Stars: 804
Rank: 47446