A library for efficient LLM inference via low-bit quantization. Available on GitHub.
Stars: 342
Rank: 91692