A dataset for training and evaluating LLMs on deciding "when (not) to call" functions.
Stars: 32, Rank: 650603