Gitstar Ranking
Fetched on 2025/12/10 14:42
thu-ml / SageAttention
[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized attention that achieves a 2-5x speedup over FlashAttention without losing end-to-end metrics across language, image, and video models.
Paper: https://arxiv.org/abs/2410.02367
Stars: 2814
Rank: 14138