Transformer based on an attention variant whose complexity is linear in sequence length - View it on GitHub
Stars: 660
Rank: 50159