DOPPLER: Dual-Policy Learning for Device Assignment in Asynchronous Dataflow Graphs
Xinyu Yao ⋅ Daniel Bourgeois ⋅ Abhinav Jain ⋅ Yuxin Tang ⋅ Jiawen Yao ⋅ Zhimin Ding ⋅ Arlei Silva ⋅ Chris Jermaine
Abstract
We study the problem of assigning the operations of a dataflow graph to devices so as to minimize execution time in a work-conserving system, with an emphasis on complex machine learning workloads. Prior learning-based approaches have three limitations: (1) they rely on bulk-synchronous frameworks that under-utilize devices, (2) they learn a single placement policy without modeling the system dynamics, and (3) they depend solely on reinforcement learning during pre-training and ignore optimization during deployment. We propose Doppler, a three-stage framework with two policies: $\mathsf{SEL}$ for selecting operations and $\mathsf{PLC}$ for placing them on devices. Doppler consistently outperforms baselines, reducing execution time and improving sampling efficiency through faster per-episode training. Our results show that Doppler achieves up to 52.7% lower execution time than the best baseline. The code is available at https://github.com/xinyuyao/Doppler.
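To make the division of labor between the two policies concrete, the following is a minimal sketch of a select-then-place assignment loop. It is not the authors' implementation: the learned $\mathsf{SEL}$ and $\mathsf{PLC}$ policies are replaced here by simple hand-written heuristics, and every name in it (`assign_devices`, `sel_longest_first`, `plc_least_loaded`, the toy graph and its costs) is a hypothetical chosen for illustration. It also assumes, as a simplification, that an operation becomes a candidate only once all of its predecessors have been assigned.

```python
from collections import defaultdict


def sel_longest_first(candidates, costs):
    # Stand-in for SEL (assumption): pick the costliest candidate,
    # breaking ties by name so the result is deterministic.
    return max(sorted(candidates), key=lambda op: costs[op])


def plc_least_loaded(op, assignment, load, preds):
    # Stand-in for PLC (assumption): pick the device with the least
    # accumulated cost, preferring the lowest index on ties.
    return min(range(len(load)), key=lambda d: load[d])


def assign_devices(graph, costs, num_devices, sel, plc):
    """Assign every operation of a DAG to a device.

    graph: dict mapping each op to a list of its successor ops
           (every op must appear as a key).
    costs: dict mapping each op to an estimated run time.
    """
    # Build the predecessor sets from the successor lists.
    preds = defaultdict(set)
    for u, succs in graph.items():
        for v in succs:
            preds[v].add(u)

    assignment = {}
    load = [0.0] * num_devices
    remaining = set(graph)
    while remaining:
        # Candidates are ops whose predecessors are all assigned.
        candidates = [op for op in remaining
                      if preds[op].issubset(assignment)]
        op = sel(candidates, costs)             # first policy: which op
        dev = plc(op, assignment, load, preds)  # second policy: which device
        assignment[op] = dev
        load[dev] += costs[op]
        remaining.remove(op)
    return assignment, load


# Toy diamond-shaped graph with hypothetical costs.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
costs = {"a": 1, "b": 3, "c": 2, "d": 1}
assignment, load = assign_devices(
    graph, costs, num_devices=2,
    sel=sel_longest_first, plc=plc_least_loaded,
)
print(assignment, load)
```

Passing the two policies as separate callables mirrors the split described in the abstract: either heuristic can be swapped for a learned model without touching the loop. The accumulated `load` is only a crude proxy and does not model execution time in an asynchronous, work-conserving system.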