Gitstar Ranking
Fetched on 2025/01/09 12:01
google-research-datasets/common-crawl-domain-names
Corpus of domain names scraped from Common Crawl and manually annotated to add word boundaries (e.g. "commoncrawl" to "common crawl").
Stars: 17
Rank: 979417