A dataset for training and evaluating LLMs on deciding "when (not) to call" functions.
Stars: 50
Rank: 482932