A simple framework for training and evaluating math reasoning agents using local models, GRPO, and vLLM.
Stars: 23
Rank: 878080