A highly optimized LLM inference acceleration engine for Llama and its variants.
Stars: 906
Rank: 46154