A dataset for training and evaluating LLMs on deciding "when (not) to call" functions.
Stars: 55
Rank: 469153