Implementation of the sparse attention pattern proposed by the Deepseek team in their "Native Sparse Attention" paper - View it on GitHub
Star
791
Rank
49144