Training Mixture-of-Experts: A Focus on Expert-Token Matching
Masoumeh Zareapoor
2024 poster
in
Affinity Event: Tiny Papers Poster Session 8
Abstract
Recent advances in sparse Mixture-of-Experts (MoE) models, particularly the Vision MoE (VMoE) framework, have demonstrated promising results on vision tasks. However, a key challenge persists: routing tokens (such as image patches) to the right experts without incurring excessive computational cost. To address this, we apply regularized optimal transport, computed with the Sinkhorn algorithm, to the VMoE framework in order to improve the token-expert matching process. The resulting model, Sinkhorn-VMoE (SVMoE), represents a meaningful step toward improving the efficiency and effectiveness of sparsely-gated MoE models.