The RedPajama-Data repository contains code for preparing large datasets for training large language models.