Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena
Stars: 204
Rank: 144189
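
To give a feel for the "multi-headed EMA" part of the name, here is a minimal sketch of a damped exponential moving average run over several heads and then mixed back together. It is not taken from this repository: the function names `damped_ema` and `multi_head_ema`, the scalar input sequence, and the fixed `alphas`, `deltas`, and `etas` are all assumptions made for illustration. The actual architecture, as far as the description suggests, learns these parameters per feature dimension and combines the result with single-head attention, which this sketch leaves out.

```python
def damped_ema(xs, alpha, delta):
    """Damped EMA over a list of floats (illustrative helper, not the repo's API).

    Recurrence: y_t = alpha * x_t + (1 - alpha * delta) * y_{t-1}, with y_0 = 0.
    """
    y = 0.0
    out = []
    for x in xs:
        y = alpha * x + (1.0 - alpha * delta) * y
        out.append(y)
    return out


def multi_head_ema(xs, alphas, deltas, etas):
    """Run one damped EMA per head, then mix the heads with weights `etas`.

    Each head has its own decay (alpha) and damping (delta), so different
    heads can track the sequence at different time scales.
    """
    heads = [damped_ema(xs, a, d) for a, d in zip(alphas, deltas)]
    return [
        sum(e * head[t] for e, head in zip(etas, heads))
        for t in range(len(xs))
    ]


if __name__ == "__main__":
    # An impulse at t = 0: the fast head forgets it at once, the slow head decays.
    print(multi_head_ema([1.0, 0.0, 0.0], [0.5, 1.0], [1.0, 1.0], [0.5, 0.5]))
```

Feeding an impulse through the two heads shows the design idea: a head with `alpha = 1.0` passes only the current input, while a head with `alpha = 0.5` keeps a decaying memory of earlier inputs, and the mixing weights decide how much of each reaches the output.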