AIRS-Bench: an AI Research Science benchmark for quantifying the end-to-end AI research abilities of LLM agents