A fast inference library for running LLMs locally on modern consumer-class GPUs
Stars: 2935
Rank: 10914