

Poster

HyPoGen: Optimization-Biased Hypernetworks for Generalizable Policy Generation

Hanxiang Ren · Li Sun · Xulong Wang · Pei Zhou · Zewen Wu · Siyan Dong · Difan Zou · Youyi Zheng · Yanchao Yang

Hall 3 + Hall 2B #31
Sat 26 Apr midnight PDT — 2:30 a.m. PDT

Abstract:

Policy learning through behavior cloning poses significant challenges, particularly when demonstration data is limited. In this work, we present HyPoGen, a novel optimization-biased hypernetwork for policy generation. The hypernetwork learns to synthesize optimal policy parameters solely from task specifications -- without accessing training data -- by modeling policy generation as an approximation of an optimization process executed over a finite number of steps, under the assumption that these specifications are a sufficient representation of the demonstration data. By incorporating structural designs that bias the hypernetwork towards optimization, we improve its generalization capability while training only on source-task demonstrations. During the feed-forward prediction pass, the hypernetwork effectively performs optimization in a latent (compressed) policy space, whose result is then decoded into policy parameters for action prediction. Experimental results on locomotion and manipulation benchmarks show that HyPoGen significantly outperforms state-of-the-art methods in generating policies for unseen target tasks without any demonstrations, achieving higher success rates and underscoring the potential of optimization-biased hypernetworks for generalizable policy generation. Our code and data are available at: https://github.com/ReNginx/HyPoGen.
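To make the mechanism concrete, here is a minimal sketch of the idea the abstract describes: a hypernetwork whose forward pass mimics a fixed number of optimization steps in a compressed latent policy space before decoding to policy parameters. This assumes a PyTorch implementation; the class name, layer sizes, number of steps, and the residual update blocks are hypothetical illustrations, not the authors' architecture (see the repository linked above for the actual code).

```python
# Illustrative sketch only: an "optimization-biased" hypernetwork whose
# forward pass unrolls K update steps in a latent policy space conditioned
# on a task specification, then decodes the result into flat policy
# parameters. All names and dimensions here are assumptions.
import torch
import torch.nn as nn


class OptimizationBiasedHypernetwork(nn.Module):
    def __init__(self, task_dim: int, latent_dim: int,
                 policy_param_dim: int, num_steps: int = 4):
        super().__init__()
        # Learned starting point in the latent (compressed) policy space.
        self.z_init = nn.Parameter(torch.zeros(latent_dim))
        # Each block predicts an update direction from the current latent
        # policy and the task spec, so the forward pass resembles a
        # finite-step optimization rather than a one-shot mapping.
        self.step_blocks = nn.ModuleList(
            nn.Sequential(
                nn.Linear(latent_dim + task_dim, 256),
                nn.ReLU(),
                nn.Linear(256, latent_dim),
            )
            for _ in range(num_steps)
        )
        # Decoder from the latent policy to flat policy parameters.
        self.decoder = nn.Linear(latent_dim, policy_param_dim)

    def forward(self, task_spec: torch.Tensor) -> torch.Tensor:
        # Broadcast the learned initial latent over the batch.
        z = self.z_init.expand(task_spec.shape[0], -1)
        for block in self.step_blocks:
            # Residual update: z_{k+1} = z_k + delta(z_k, task_spec).
            z = z + block(torch.cat([z, task_spec], dim=-1))
        # Decode to parameters of the downstream action-prediction policy.
        return self.decoder(z)


# Usage: generate policy parameters for a batch of task specifications,
# with no demonstrations needed at inference time.
hypernet = OptimizationBiasedHypernetwork(task_dim=32, latent_dim=64,
                                          policy_param_dim=1024)
params = hypernet(torch.randn(8, 32))  # shape (8, 1024)
```

The residual per-step structure is one simple way to realize the "bias towards optimization" the abstract mentions; the decoded flat parameter vector would then be reshaped into the weights of the policy network.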
