A test and eval runner for coding agents. Define repeatable evals and run them against multiple coding agents and their configurations.