A fast inference library for running LLMs locally on modern consumer-class GPUs
Stars: 3
Rank: 3279731