A dataset for training and evaluating LLMs on deciding "when (not) to call" functions.