Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads