Gitstar Ranking
Users
Organizations
Repositories
Rankings
Users
Organizations
Repositories
Sign in with GitHub
gmh5225
Fetched on 2026/05/08 11:55
gmh5225
/
turboquant-2
TurboQuant: Near-optimal KV cache quantization for LLM inference (3-bit keys, 2-bit values) with Triton kernels + vLLM integration -
View it on GitHub
Star
0
Rank
13993518