Gitstar Ranking
Fetched on 2026/03/01 18:37
thu-ml / SageAttention
[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized attention that achieves a 2-5x speedup over FlashAttention without losing end-to-end metrics across language, image, and video models.
https://arxiv.org/abs/2410.02367
Stars: 3184
Rank: 12690
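The description above refers to quantized attention: quantizing the Q and K matrices to INT8 before the score matmul, then dequantizing with the quantization scales. The following is a minimal NumPy sketch of that idea only, not the actual SageAttention kernel (which, per the paper, uses finer-grained per-block quantization and a K-smoothing step); the function names and the symmetric per-tensor scheme here are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def quantize_int8(x):
    """Symmetric per-tensor INT8 quantization: map max |value| to 127."""
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def attention(Q, K, V):
    """Reference full-precision attention."""
    d = Q.shape[-1]
    return softmax((Q @ K.T) / np.sqrt(d)) @ V

def quantized_attention(Q, K, V):
    """Attention with Q and K quantized to INT8 for the QK^T matmul.

    V stays in full precision here (a simplification of this sketch;
    the real kernel handles the PV matmul with its own low-precision path).
    """
    d = Q.shape[-1]
    q_q, s_q = quantize_int8(Q)
    k_q, s_k = quantize_int8(K)
    # INT8 matmul accumulated in int32, then dequantized via the scales.
    scores = (q_q.astype(np.int32) @ k_q.astype(np.int32).T) * (s_q * s_k)
    return softmax(scores / np.sqrt(d)) @ V
```

On GPU tensor cores the INT8 matmul is what buys the speedup; in this NumPy sketch the point is only that the dequantized scores stay close to the full-precision ones, so the attention output is nearly unchanged.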