Transformer based on a variant of attention whose complexity is linear in sequence length
Stars: 719
Rank: 47910