gmh5225/turboquant-kv - Gitstar Ranking

gmh5225

Fetched on 2026/07/13 21:13

Open-source PyTorch implementation of Google TurboQuant (ICLR 2026): extreme KV-cache quantization to ~3 bits with zero accuracy loss. 6x less memory, up to 8x faster inference.

https://pypi.org/project/turboquant-kv

Star

Rank

14120501
