Oral in Workshop: 7th Robot Learning Workshop: Towards Robots with Human-Level Abilities

RecFlow Policy: Fast and Accurate Visuomotor Policy Learning via Rectified Action Flow

Rong Xue · Jiageng Mao · Mingtong Zhang · Yue Wang

 
presentation: 7th Robot Learning Workshop: Towards Robots with Human-Level Abilities
Sat 26 Apr 5:55 p.m. PDT — 3 a.m. PDT

Abstract:

We introduce RecFlow Policy, a fast, accurate, and scalable policy for robot learning that bridges the gap between generative modeling techniques and real-world robotic applications. Diffusion models have seen rapid adoption in robotic imitation learning, enabling autonomous execution of complex dexterous tasks. However, their dependence on multi-step iterative denoising makes action synthesis computationally expensive and slow, limiting their effectiveness in fast-reacting policies. RecFlow Policy replaces the diffusion process with a novel rectified flow parameterization, significantly improving both computational speed and policy accuracy. It learns a deterministic coupling to achieve rapid policy inference. This deterministic nature allows for precise visuomotor control with minimal inference time, making it well suited to real-time robotic applications. Unlike conventional iterative training methods, our approach selectively refines the rectification process using expert demonstrations to reduce accumulated errors. Leveraging nearly straight flows, RecFlow Policy achieves high accuracy with just a single denoising step. To evaluate the effectiveness of RecFlow Policy, we conducted extensive experiments across both simulated and real-world tasks. Results show that our method matches or surpasses the performance of state-of-the-art diffusion-based methods while offering greater simplicity and computational efficiency. Compared to Diffusion Policy, which involves numerous iterative steps and incurs significant computational overhead, our approach offers a streamlined and scalable solution for real-time visuomotor policy learning.
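
To illustrate why nearly straight flows allow single-step inference, the following is a minimal sketch of the generic rectified-flow recipe in NumPy. It is not the authors' implementation: the function names, the toy velocity field, and the 2-D "action" are assumptions made for illustration, and the selective refinement with expert demonstrations described above is not shown. A sample is moved from noise (t = 0) to an action (t = 1) along a straight line, the velocity field is regressed onto the constant direction `action - noise`, and sampling integrates that field with forward Euler.

```python
import numpy as np


def interpolate(noise, action, t):
    """Point on the straight line from noise (t=0) to action (t=1)."""
    return (1.0 - t) * noise + t * action


def target_velocity(noise, action):
    """Regression target for the velocity field along a straight path."""
    return action - noise


def flow_matching_loss(velocity_fn, noise, action, t):
    """Mean squared error between predicted and target velocity at time t."""
    x_t = interpolate(noise, action, t)
    pred = velocity_fn(x_t, t)
    return float(np.mean((pred - target_velocity(noise, action)) ** 2))


def euler_sample(velocity_fn, noise, num_steps=1):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with forward Euler."""
    x = np.array(noise, dtype=float)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        x = x + dt * velocity_fn(x, k * dt)
    return x


# Hypothetical stand-in for a trained network: a perfectly straight flow
# that carries every starting point to one fixed expert action.
EXPERT_ACTION = np.array([0.5, -0.2])


def toy_velocity(x, t):
    """Velocity of the straight-line flow toward EXPERT_ACTION (valid for t < 1)."""
    return (EXPERT_ACTION - x) / (1.0 - t)


if __name__ == "__main__":
    noise = np.array([1.0, 2.0])
    one_step = euler_sample(toy_velocity, noise, num_steps=1)
    ten_steps = euler_sample(toy_velocity, noise, num_steps=10)
    print("1 step :", one_step)
    print("10 steps:", ten_steps)
```

Because the toy flow is exactly straight, the velocity is constant along each trajectory, so one Euler step and ten Euler steps land on the same point up to floating-point error. A learned flow is only approximately straight, which is why the abstract speaks of "nearly straight" flows and of reducing accumulated errors.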
