Implementation of MaMMUT, a simple vision-encoder text-decoder architecture for multimodal tasks from Google, in Pytorch - View it on GitHub
Star
99
Rank
259452