Implementation of the sparse attention pattern proposed by the DeepSeek team in their "Native Sparse Attention" paper
Stars: 780
Rank: 49077