A unified framework to jointly model images, text, and human attention traces.
Stars: 78
Rank: 325634