Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models

Victor Weixin Liang · Lili Yu · Liang Luo · Srini Iyer · Ning Dong · Chunting Zhou · Gargi Ghosh · Mike Lewis · Scott Yih · Luke Zettlemoyer · Victoria Lin

Abstract
