AIRS-Bench: an AI Research Science benchmark for quantifying the end-to-end AI research abilities of LLM agents