Implementation of a Transformer using ReLA (Rectified Linear Attention) from https://arxiv.org/abs/2104.07012
Stars: 49
Rank: 402655
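
For orientation, here is a minimal sketch of the idea behind ReLA as described in the linked paper: the softmax over attention scores is replaced with a ReLU, so negative scores become exactly zero, and the output is then normalized because the weights no longer sum to one. This is not the repository's code. The function name `rela_attention`, the single-head plain-list layout, and the gain-free RMS normalization are assumptions made for illustration, and the actual implementation may differ (for example by using learned gains or a gate).

```python
import math


def rela_attention(queries, keys, values, eps=1e-8):
    """Single-head rectified linear attention on plain Python lists.

    Hypothetical helper for illustration only.
    queries: list of vectors of size d
    keys:    list of vectors of size d
    values:  list of vectors of size d_v (same count as keys)
    """
    d = len(queries[0])
    d_v = len(values[0])
    scale = 1.0 / math.sqrt(d)
    outputs = []
    for q in queries:
        # Scaled dot-product scores, as in standard attention.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        # ReLU in place of softmax: negative scores get zero weight.
        weights = [max(0.0, s) for s in scores]
        # Weighted sum of the value vectors.
        mixed = [
            sum(w * v[j] for w, v in zip(weights, values)) for j in range(d_v)
        ]
        # Assumed gain-free RMS normalization; the weights are unnormalized,
        # so the output scale would otherwise vary with the scores.
        rms = math.sqrt(sum(x * x for x in mixed) / d_v + eps)
        outputs.append([x / rms for x in mixed])
    return outputs


if __name__ == "__main__":
    q = [[1.0, 0.0]]
    k = [[1.0, 0.0], [-1.0, 0.0]]
    v = [[2.0, 0.0], [0.0, 5.0]]
    # The second key scores negatively, so its value is ignored entirely.
    print(rela_attention(q, k, v))
```

Unlike softmax, which assigns every key a strictly positive weight, the ReLU can drop keys completely, which is the source of the sparsity the paper studies.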