Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models. TMLR 2025.