A unified framework to jointly model images, text, and human attention traces.
Stars: 77
Rank: 304237