Transformer based on a variant of attention with linear complexity with respect to sequence length
Stars: 697
Rank: 48336