A variant of Transformer-XL where the memory is updated not with a queue, but with attention.
Stars: 45
Rank: 428546
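
To make the contrast in the description concrete, here is a minimal NumPy sketch of the two memory-update styles. It is an illustration under stated assumptions, not the repository's implementation: the function names `queue_update` and `attention_update` are hypothetical, and the attention variant uses plain scaled dot-product attention with no learned projections, gating, or multiple heads.

```python
import numpy as np


def queue_update(memory, hidden):
    """FIFO update in the style of Transformer-XL.

    Appends the new hidden states and keeps only the newest rows,
    so the oldest entries fall off the front. Assumes mem_len >= 1.
    """
    mem_len = memory.shape[0]
    return np.concatenate([memory, hidden], axis=0)[-mem_len:]


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention_update(memory, hidden):
    """Hypothetical attention-based update (an assumption, not the repo's code).

    Each memory slot acts as a query over the old memory plus the new
    hidden states, and is replaced by the attention-weighted average.
    """
    context = np.concatenate([memory, hidden], axis=0)       # (mem_len + seq_len, dim)
    scores = memory @ context.T / np.sqrt(memory.shape[1])   # (mem_len, mem_len + seq_len)
    weights = softmax(scores, axis=-1)                       # each row sums to 1
    return weights @ context                                 # (mem_len, dim)


rng = np.random.default_rng(0)
memory = rng.normal(size=(4, 8))   # 4 memory slots, model dimension 8
hidden = rng.normal(size=(3, 8))   # 3 new hidden states from the current segment

fifo_memory = queue_update(memory, hidden)
attn_memory = attention_update(memory, hidden)
```

Both functions keep the memory at a fixed size. The queue discards the oldest rows outright, while the attention variant lets every slot blend information from the old memory and the new segment.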