Vary-tiny codebase built upon LAVIS (for training from scratch), together with a PDF image-text pair dataset (about 600k pairs, in English and Chinese)