Non-Local Data Attribution for On-policy Reinforcement Learning
Abstract
Data attribution has become an important tool for understanding and improving model training, but it remains understudied in reinforcement learning (RL). Prior work has shown that local data attribution, computed within a single rollout, provides useful signals for data selection and thereby helps accelerate training. In this work, we move beyond local attribution and introduce non-local data attribution for on-policy RL, in which attribution targets are defined using future rollouts generated by a better-performing policy. We formalize this setting via a replay-based leave-one-out objective (replay-LOO) that isolates optimization effects under fixed rollout buffers. Building on well-established training data attribution methods from supervised learning, we account for training dynamics when estimating data influence. We show that non-local attribution correlates strongly with ground-truth LOO retraining effects in RL. Based on this property, we further demonstrate that non-local attribution enables effective data selection by reusing rollout buffers, improving sample efficiency without additional environment interaction. Overall, our results highlight non-local attribution as a promising tool for data-centric reinforcement learning.
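To make the leave-one-out idea concrete, the sketch below may help. It is a toy illustration under strong assumptions, not the method of this paper: a linear regression model stands in for the policy, a fixed array of records stands in for a rollout buffer, a second fixed array stands in for the target defined on future rollouts, and training is a single gradient step. All names (`replay_loo`, `first_order_attribution`, the buffer shapes, the learning rate) are hypothetical. The sketch compares ground-truth leave-one-out retraining on the fixed buffer against a first-order, gradient-similarity estimate of the same quantity.

```python
import numpy as np


def per_sample_grads(theta, X, y):
    """Gradient of 0.5 * (x . theta - y)^2 for each record, shape (n, d)."""
    residual = X @ theta - y
    return residual[:, None] * X


def target_loss(theta, X_t, y_t):
    """Mean loss on the fixed target buffer (stand-in for future rollouts)."""
    return 0.5 * np.mean((X_t @ theta - y_t) ** 2)


def replay_loo(theta, X, y, X_t, y_t, lr):
    """Ground-truth leave-one-out effect of each record under a fixed buffer.

    Replays one gradient step with record i removed and reports the change
    in target loss relative to the step that uses the full buffer.
    """
    grads = per_sample_grads(theta, X, y)
    n = len(y)
    total = grads.sum(axis=0)
    theta_full = theta - lr * total / n
    base = target_loss(theta_full, X_t, y_t)
    effects = np.empty(n)
    for i in range(n):
        # Same step, but without the contribution of record i.
        theta_minus_i = theta - lr * (total - grads[i]) / n
        effects[i] = target_loss(theta_minus_i, X_t, y_t) - base
    return effects, theta_full


def first_order_attribution(theta, theta_full, X, y, X_t, y_t, lr):
    """First-order estimate of the same effect, with no retraining.

    Removing record i shifts the updated parameters by (lr / n) * g_i, so the
    change in target loss is approximately that shift dotted with the
    target-loss gradient at the updated parameters.
    """
    grads = per_sample_grads(theta, X, y)
    target_grad = per_sample_grads(theta_full, X_t, y_t).mean(axis=0)
    return (lr / len(y)) * (grads @ target_grad)


# Synthetic data; every value here is an assumption chosen for illustration.
rng = np.random.default_rng(0)
d, n = 5, 32
theta_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ theta_true + 0.1 * rng.normal(size=n)
X_t = rng.normal(size=(64, d))
y_t = X_t @ theta_true

theta0 = np.zeros(d)
effects, theta_full = replay_loo(theta0, X, y, X_t, y_t, lr=0.1)
scores = first_order_attribution(theta0, theta_full, X, y, X_t, y_t, lr=0.1)
corr = np.corrcoef(effects, scores)[0, 1]
print(f"correlation between LOO effects and attribution scores: {corr:.3f}")
```

In this toy setting a positive effect means that removing the record raises the target loss, so records with the largest scores are the ones a selection rule would keep. Because the buffers are fixed, both quantities are computed without drawing any new data, which is the sense in which the comparison isolates optimization effects.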