Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena.