AutoGaze automatically removes redundant patches from a video, reducing the number of tokens processed by a ViT or MLLM by 4x-100x.
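To make the idea of dropping redundant video patches concrete, here is a minimal sketch in Python. It is not AutoGaze's actual method, which the description above does not specify; it swaps in a simple frame-difference heuristic instead. The function names `patchify` and `keep_mask`, the patch size, the threshold, and the assumption that pixel values lie in [0, 1] are all hypothetical choices made for illustration.

```python
import numpy as np


def patchify(frame, patch):
    """Split an (H, W, C) frame into rows of shape (num_patches, patch*patch*C)."""
    h, w, c = frame.shape
    gh, gw = h // patch, w // patch
    # Crop to a multiple of the patch size, then group pixels by patch.
    x = frame[:gh * patch, :gw * patch].reshape(gh, patch, gw, patch, c)
    return x.transpose(0, 2, 1, 3, 4).reshape(gh * gw, patch * patch * c)


def keep_mask(video, patch=16, threshold=0.02):
    """Return a boolean mask of shape (T, num_patches); True means "keep".

    Illustrative heuristic only (an assumption, not AutoGaze's algorithm):
    every patch of the first frame is kept, and a later patch is kept only
    if its mean absolute difference from the last kept version of the same
    patch location exceeds `threshold`. Pixel values are assumed in [0, 1].
    """
    video = video.astype(np.float32)
    ref = patchify(video[0], patch)          # last kept content per location
    mask = np.ones((len(video), ref.shape[0]), dtype=bool)
    for t in range(1, len(video)):
        cur = patchify(video[t], patch)
        changed = np.abs(cur - ref).mean(axis=1) > threshold
        mask[t] = changed
        ref[changed] = cur[changed]          # refresh reference where kept
    return mask


if __name__ == "__main__":
    # Hypothetical input: 8 static frames of 64x64 RGB, one patch changes at t=4.
    video = np.zeros((8, 64, 64, 3), dtype=np.float32)
    video[4:, :16, :16] = 1.0
    mask = keep_mask(video)
    print("tokens kept:", int(mask.sum()), "of", mask.size)
```

Only the patches marked `True` would be passed to the vision encoder as tokens, so the achievable reduction in this sketch depends entirely on how much of the video is static.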