A fast inference library for running LLMs locally on modern consumer-class GPUs.
Stars: 2
Rank: 3413805