

Poster

Energy-Weighted Flow Matching for Offline Reinforcement Learning

Shiyuan Zhang · Weitong Zhang · Quanquan Gu

Hall 3 + Hall 2B #387
Fri 25 Apr, 12:00 a.m. – 2:30 a.m. PDT

Abstract: This paper investigates energy guidance in generative modeling, where the target distribution is defined as q(x) ∝ p(x) exp(−βE(x)), with p(x) the data distribution and E(x) the energy function. To sample under energy guidance, existing methods often require auxiliary procedures that learn an intermediate guidance signal throughout the diffusion process. To overcome this limitation, we study energy-guided flow matching, a generalization of the diffusion process. We introduce energy-weighted flow matching (EFM), a method that learns the energy-guided flow directly, without auxiliary models. Theoretical analysis shows that EFM accurately captures the guided flow. Additionally, we extend this methodology to energy-weighted diffusion models and apply it to offline reinforcement learning (RL) by proposing Q-weighted Iterative Policy Optimization (QIPO). Empirically, we demonstrate that QIPO improves performance on offline RL tasks. Notably, our algorithm is the first energy-guided diffusion model that operates independently of auxiliary models and the first exact energy-guided flow matching model in the literature.
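To make the idea concrete, below is a minimal PyTorch sketch of one plausible form of an energy-weighted flow matching loss: a standard conditional flow matching objective whose per-sample terms are reweighted by exp(−βE(x₁)) evaluated at the data endpoint. The linear probability path, the self-normalized weights, and the names `v_theta`, `energy`, and `beta` are illustrative assumptions for this sketch, not the paper's exact construction.

```python
# Minimal sketch of an energy-weighted conditional flow matching loss.
# Assumptions (not from the paper): a linear probability path between
# Gaussian noise x0 and data x1, and self-normalized weights; the names
# v_theta, energy, and beta are illustrative.
import torch

def efm_loss(v_theta, x1, energy, beta=1.0):
    """One training step: reweight the per-sample flow matching error by
    exp(-beta * E(x1)) so the learned flow targets q ∝ p · exp(-beta · E)."""
    b = x1.shape[0]
    t = torch.rand(b, 1, device=x1.device)   # time t ~ U[0, 1]
    x0 = torch.randn_like(x1)                # noise endpoint
    xt = (1 - t) * x0 + t * x1               # linear interpolation path
    ut = x1 - x0                             # conditional target velocity d(xt)/dt

    with torch.no_grad():                    # weights carry no gradient
        w = torch.exp(-beta * energy(x1))    # energy weight at the data endpoint
        w = w / (w.mean() + 1e-8)            # self-normalize for stability

    per_sample = ((v_theta(xt, t) - ut) ** 2).sum(dim=-1)
    return (w * per_sample).mean()
```

For the offline RL instantiation the abstract describes, one would take the energy of an action to be the negative of its Q-value, so that high-value actions receive larger weight; the iterative aspect of QIPO is beyond this sketch.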
