Advanced quantization toolkit for LLMs and VLMs. Support for WOQ, MXFP4, NVFP4, GGUF, Adaptive Schemes and seamless integration with Transformers, vLLM, SGLang, and llm-compressor - View it on GitHub
Star
779
Rank
49967