A highly optimized LLM inference acceleration engine for Llama and its variants.
Stars: 688
Rank: 50153