COOT: Cooperative Hierarchical Transformer for Video-Text Representation Learning
Stars: 0
Rank: 11399557