google-research-datasets/GeniL

google-research-datasets

Fetched on 2026/03/01 20:06

The GeniL dataset is an effort to detect various types of generalization in language. This multilingual dataset covers sentences in EN, FR, ES, PT, AR, HI, BN, MS, and ID, annotated by native speakers of each language. Each sentence is collected from public language corpora and contains at least one identity group name and an attribute.

