A Python implementation for extracting multimodal features (visual and textual).
Stars: 5
Rank: 1972273