A Near-Optimal Best-of-Both-Worlds Algorithm for Federated Bandits
Zicheng Hu · Zihao Wang · Cheng Chen
Abstract
This paper studies federated multi-armed bandit (MAB) problems, in which multiple agents work together to solve a common MAB problem over a communication network. We focus on the heterogeneous setting, where no single agent can identify the globally best arm from its local, biased observations alone; different agents may select the same arm at the same time step yet receive different rewards. We propose a novel algorithm, \textsc{FedFTRL}, which is the first to achieve near-optimal regret guarantees in both stochastic and adversarial environments. Notably, in the adversarial regime our algorithm attains $O(\sqrt{T})$ regret, a significant improvement over the state-of-the-art bound of $O(T^{\frac{2}{3}})$ \citep{yi2023doubly}. We also provide numerical evaluations comparing our algorithm with baseline methods on both synthetic and real-world datasets, demonstrating the effectiveness of our approach.
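For concreteness, a common way to state the objective in this heterogeneous setting is regret against the globally best arm, where the global mean of an arm is the average of the agents' local means. The notation below is illustrative rather than the paper's own: $M$ denotes the number of agents, $\mu_{k,m}$ agent $m$'s local mean reward for arm $k$, and $a_t^m$ the arm agent $m$ pulls at round $t$:
\[
\mu_k = \frac{1}{M}\sum_{m=1}^{M}\mu_{k,m},
\qquad
\mathrm{Reg}(T) = T M \max_{k}\mu_k \;-\; \mathbb{E}\Big[\sum_{t=1}^{T}\sum_{m=1}^{M}\mu_{a_t^m,\,m}\Big].
\]
The $O(\sqrt{T})$ and $O(T^{\frac{2}{3}})$ bounds quoted above refer to regret of this general form (in the adversarial regime, fixed means are replaced by adversarially chosen reward sequences).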