[ICLR 2026] VitaBench: Benchmarking LLM Agents with Versatile Interactive Tasks in Real-world Applications