Journal Track Poster Sat, Apr 25, 2026 • 6:30 AM – 9:00 AM PDT Pavilion 4 P4-#4615

Multi-Bellman operator for convergence of Q-learning with linear function approximation

Diogo S. Carvalho · Pedro A. Santos · Francisco S. Melo

Abstract

We investigate the convergence of $Q$-learning with linear function approximation and introduce the multi-Bellman operator, an extension of the traditional Bellman operator. By analyzing the properties of this operator, we identify conditions under which the projected multi-Bellman operator is a contraction, which yields stronger fixed-point guarantees than those available for the original Bellman operator. Building on these insights, we propose the multi-$Q$-learning algorithm, which converges and approximates the optimal solution to arbitrary precision. Traditional $Q$-learning, by contrast, lacks such convergence guarantees. Finally, we validate our theoretical results empirically.
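The abstract does not define the multi-Bellman operator, so the following is only an illustrative sketch. It assumes the operator is the $n$-fold composition of the standard Bellman optimality operator on a small tabular MDP. The names `bellman` and `multi_bellman`, the random MDP, and all parameter values are hypothetical and not taken from the paper. The sketch checks only the well-known fact that composing a $\gamma$-contraction (in the sup norm) $n$ times gives a $\gamma^n$-contraction. It does not implement the projection onto linear features or the multi-$Q$-learning algorithm itself.

```python
import numpy as np


def bellman(q, P, r, gamma):
    """Standard Bellman optimality operator on a tabular Q-function.

    q: (S, A) action values, P: (S, A, S) transition probabilities,
    r: (S, A) expected rewards, gamma: discount factor in [0, 1).
    """
    v = q.max(axis=1)            # greedy state values, shape (S,)
    return r + gamma * (P @ v)   # expected one-step backup, shape (S, A)


def multi_bellman(q, P, r, gamma, n):
    """ASSUMPTION: multi-Bellman operator taken as n applications of `bellman`."""
    for _ in range(n):
        q = bellman(q, P, r, gamma)
    return q


# Hypothetical random MDP used purely for illustration.
rng = np.random.default_rng(0)
S, A, gamma, n = 5, 3, 0.9, 4
P = rng.random((S, A, S))
P /= P.sum(axis=2, keepdims=True)   # normalize rows into distributions
r = rng.random((S, A))

q1 = rng.normal(size=(S, A))
q2 = rng.normal(size=(S, A))

dist_before = np.abs(q1 - q2).max()
dist_after = np.abs(
    multi_bellman(q1, P, r, gamma, n) - multi_bellman(q2, P, r, gamma, n)
).max()

# Sup-norm distance shrinks by at least a factor gamma**n.
print(dist_after <= gamma**n * dist_before + 1e-12)
```

Under this assumption, a larger `n` gives a smaller sup-norm contraction modulus `gamma**n`, which is one intuition for why a projected version of the operator might become a contraction when the projected one-step operator is not. The precise conditions are those established in the paper, not in this sketch.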
