Implementation of the sparse attention pattern proposed by the Deepseek team in their "Native Sparse Attention" paper - View it on GitHub
Star
605
Rank
59931