Speedup for lm-evaluation-harness. Supports tensor-parallel and data-parallel inference, as well as GPTQ, bitsandbytes, PEFT, and ExLlamaV2.
Stars: 0
Rank: 12125876