OmniVinci is an omni-modal LLM for joint understanding of vision, audio, and language.
Stars: 472
Rank: 77851