Evaluating LLMs on the MixEval dataset using W&B Weave