An innovative library for efficient LLM inference via low-bit quantization
Stars: 352
Rank: 91138