OmniVinci is an omni-modal LLM for joint understanding of vision, audio, and language.