Gitstar Ranking
Fetched on 2026/03/14 10:06
sayakpaul / SageAttention
[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.
https://arxiv.org/abs/2410.02367
Star: 0
Rank: 13942919