Oral in Workshop: Modular, Collaborative and Decentralized Deep Learning
Soup-of-Experts: Pretraining Specialist Models via Parameters Averaging
Pierre Ablin · Angelos Katharopoulos · Skyler Seto · David Grangier