
poster
in
Affinity Workshop: Tiny Papers Poster Session 8

Training Mixture-of-Experts: A Focus on Expert-Token Matching

Masoumeh Zareapoor

#269
Fri 10 May 7:30 a.m. PDT — 9:30 a.m. PDT

Abstract:

Recent advances in sparse Mixture-of-Experts (MoE) models, particularly the Vision MoE (VMoE) framework, have shown promising results on vision tasks. A key challenge, however, is routing tokens (such as image patches) to the right experts without incurring excessive computational cost. To address this, we apply regularized optimal transport, computed with the Sinkhorn algorithm, to the VMoE framework to improve the token-expert matching process. The resulting model, Sinkhorn-VMoE (SVMoE), is a meaningful step toward improving the efficiency and effectiveness of sparsely-gated MoE models.
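As an illustration of the idea, here is a minimal sketch of Sinkhorn-based token-expert matching. This is an assumption about the general technique, not the paper's actual implementation: the function name `sinkhorn_routing` and its parameters (`n_iters`, `epsilon`, the matrix shapes) are hypothetical. Starting from a matrix of token-expert affinity logits, Sinkhorn iterations alternately normalize the expert (column) and token (row) marginals of the entropic-regularized transport plan, producing a balanced soft assignment of tokens to experts.

```python
import numpy as np

def sinkhorn_routing(logits, n_iters=100, epsilon=0.5):
    """Turn token-expert affinity logits into a balanced soft assignment
    via Sinkhorn iterations (entropic-regularized optimal transport).

    Rows are tokens, columns are experts. After convergence, each token's
    assignment sums to 1 and each expert receives ~n_tokens/n_experts mass.
    """
    n_tokens, n_experts = logits.shape
    P = np.exp(logits / epsilon)  # Gibbs kernel of the regularized OT problem
    for _ in range(n_iters):
        # Normalize expert marginals: each expert targets an equal token share.
        P = P / P.sum(axis=0, keepdims=True) * (n_tokens / n_experts)
        # Normalize token marginals: each token distributes one unit of mass.
        P = P / P.sum(axis=1, keepdims=True)
    return P

# Hypothetical usage: route 8 patch tokens among 4 experts.
rng = np.random.default_rng(0)
logits = rng.normal(size=(8, 4))
plan = sinkhorn_routing(logits)
expert_for_token = plan.argmax(axis=1)  # balanced top-1 dispatch
```

Compared with a plain softmax router, which can collapse onto a few popular experts, the Sinkhorn normalization enforces (approximately) balanced expert load directly in the assignment, at the cost of a few extra matrix normalizations per routing step.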