gmh5225/turboquant - Gitstar Ranking

gmh5225

Fetched on 2026/07/13 21:13

First open-source implementation of Google TurboQuant (ICLR 2026) -- near-optimal KV cache compression for LLM inference. 5x compression with near-zero quality loss.

Star

Rank

14120501


gmh5225 / turboquant