High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.
Stars: 3364
Rank: 12162