Alleviating Forgetfulness of Linear Attention by Hybrid Sparse Attention and Contextualized Learnable Token Eviction.
Stars: 4
Rank: 2827316