VILA - a multi-image visual language model with training, inference, and evaluation recipes, deployable from cloud to edge (Jetson Orin and laptops)
Stars: 6
Rank: 2170219