Benchmarking the capabilities of LLM agents across the scientific research lifecycle: from replication to peer review and research design.
Stars: 4
Rank: 2802857