VILA - a multi-image visual language model with training, inference, and evaluation recipes, deployable from cloud to edge (Jetson Orin and laptops)
Stars: 1879
Rank: 18398