A highly optimized LLM inference acceleration engine for Llama and its variants.
Stars: 904
Rank: 43136