WorldSense benchmark for grounded reasoning in language models