AutoGaze automatically removes redundant patches from a video, reducing the number of tokens a ViT or MLLM must process by 4x to 100x.
Stars: 237
Rank: 153888