Transformer based on a variant of attention that has linear complexity with respect to sequence length - View it on GitHub
Stars: 601
Rank: 51971