ICLR Poster Causal Contextual Bandits with Targeted Interventions

Poster

Causal Contextual Bandits with Targeted Interventions

Chandrasekar Subramanian · Balaraman Ravindran

Virtual

Keywords: [ causal inference ] [ contextual bandits ] [ bandits ] [ causality ]

Abstract:

We study a contextual bandit setting in which the learning agent, in addition to possessing qualitative causal side-information, can perform interventions on targeted subsets of the population. This novel formalism captures intricacies of real-world scenarios such as software product experimentation, where targeted experiments can be conducted. The ability to target interventions fundamentally changes the set of options available to the agent compared to standard contextual bandit settings, necessitating new techniques. This is also the first work to integrate causal side-information into a contextual bandit setting, where the agent aims to learn a policy that maps contexts to arms (as opposed to identifying a single best arm). We propose a new algorithm and show empirically that it outperforms baselines both in experiments on purely synthetic data and in real-world-inspired experiments. We also prove a regret bound that provides a theoretical guarantee on performance.
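
The distinction the abstract draws between learning a policy that maps contexts to arms and identifying one best arm can be illustrated with a small tabular sketch. This is not the authors' algorithm: the context names, arm names, reward values, and the helper `policy_value` are all hypothetical, chosen only to show how a context-dependent policy can achieve higher expected reward than any single arm, and how the gap between the two is a simple form of regret.

```python
# Toy illustration only: every name and number here is hypothetical,
# not taken from the paper.
# mean_reward[context][arm] = expected reward of pulling `arm` in `context`.
mean_reward = {
    "new_user": {"A": 0.2, "B": 0.6},
    "returning_user": {"A": 0.8, "B": 0.3},
}
# Probability with which each context occurs.
context_prob = {"new_user": 0.5, "returning_user": 0.5}
arm_names = ["A", "B"]


def policy_value(policy):
    """Expected reward of a policy given as a dict mapping context -> arm."""
    return sum(context_prob[c] * mean_reward[c][policy[c]] for c in context_prob)


# Best context-dependent policy: choose the best arm separately in each context.
best_policy = {c: max(arms, key=arms.get) for c, arms in mean_reward.items()}

# Best single arm: the same arm is played regardless of context.
best_single_arm = max(
    arm_names, key=lambda a: policy_value({c: a for c in context_prob})
)
single_arm_policy = {c: best_single_arm for c in context_prob}

# Gap in expected reward between the best policy and the best single arm.
regret = policy_value(best_policy) - policy_value(single_arm_policy)

print("best policy:", best_policy)
print("best single arm:", best_single_arm)
print("regret of single-arm choice:", round(regret, 3))
```

In this toy setup the mean rewards are known in advance; in the bandit setting the agent must estimate them from interactions, which is where the choice of interventions and any causal side-information would come into play.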
