A tool for evaluating LLMs on the MATH and GSM8K datasets. - View it on GitHub
Stars: 6
Rank: 1831432