Official Implementation for LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models
Star: 0
Rank: 11272351