High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.
Stars: 1247
Rank: 31432