Implementation of the sparse attention pattern proposed by the Deepseek team in their "Native Sparse Attention" paper - View it on GitHub
Star
10
Rank
1393427