A dataset for training and evaluating LLMs on deciding "when (not) to call" functions