A fast inference library for running LLMs locally on modern consumer-class GPUs