Oral in Workshop: Workshop on Spurious Correlation and Shortcut Learning: Foundations and Solutions
BAYESIAN INVARIANCE MODELING OF MULTI-ENVIRONMENT DATA
Luhuan Wu · Mingzhang Yin · Yixin Wang · John Cunningham · David Blei
Keywords: [ invariance; multi-environment; bayesian modeling; variational inference ]
Identifying invariant features – those that stably predict the outcome across diverse environments – is crucial for improving model generalization and uncovering causal mechanisms. While previous methods primarily address this problem through hypothesis testing or regularized optimization, they often lack a principled characterization of the underlying data generative process and struggle with high-dimensional data. In this work, we develop a Bayesian model that encodes an invariance assumption in the generative process of multi-environment data. Within this framework, we perform posterior inference to estimate the invariant features and establish theoretical guarantees on posterior consistency and contraction rates. To address the challenges in high-dimensional settings, we design a scalable variational inference algorithm. We demonstrate the superior inference accuracy and scalability of our method compared to existing approaches in simulations and a gene-perturbation study.
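To make the idea of posterior inference over invariant features concrete, here is a minimal toy sketch. It is not the model or the variational algorithm from this work; it only illustrates the general idea under loud assumptions. The assumptions are: data are linear-Gaussian, the number of features is small enough to enumerate every subset, the prior over subsets is uniform, and BIC stands in for the log marginal likelihood. A subset scores well when one shared regression of the outcome on that subset explains all environments about as well as separate per-environment regressions. All function names (`gaussian_bic_score`, `subset_score`, `subset_posterior`, `simulate`) and the simulated data are hypothetical.

```python
import itertools
import numpy as np


def gaussian_bic_score(X, y):
    """BIC-penalized maximum log-likelihood of y ~ intercept + X (Gaussian noise)."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])      # X may have zero columns
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    sigma2 = resid @ resid / n                # maximum-likelihood noise variance
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1.0)
    k = A.shape[1] + 1                        # coefficients plus noise variance
    return loglik - 0.5 * k * np.log(n)


def subset_score(envs, subset):
    """Shared-parameter fit minus per-environment fits of y given X[:, subset]."""
    idx = np.array(subset, dtype=int)
    X_all = np.vstack([X[:, idx] for X, _ in envs])
    y_all = np.concatenate([y for _, y in envs])
    pooled = gaussian_bic_score(X_all, y_all)
    separate = sum(gaussian_bic_score(X[:, idx], y) for X, y in envs)
    return pooled - separate


def subset_posterior(envs, p):
    """Approximate posterior over feature subsets under a uniform prior (assumption)."""
    subsets = [s for r in range(p + 1) for s in itertools.combinations(range(p), r)]
    scores = np.array([subset_score(envs, s) for s in subsets])
    weights = np.exp(scores - scores.max())   # subtract max for numerical stability
    return dict(zip(subsets, weights / weights.sum()))


def simulate(n=500, seed=0):
    """Hypothetical data: features 0 and 1 cause y; feature 2 is an effect of y."""
    rng = np.random.default_rng(seed)
    envs = []
    # Each environment shifts the feature distributions and the y -> x3 mechanism,
    # but keeps the mechanism (x1, x2) -> y fixed.
    for mean1, sd2, a in [(0.0, 0.5, 0.5), (1.0, 1.0, 1.5), (-1.0, 2.0, -1.0)]:
        x1 = rng.normal(mean1, 1.0, n)
        x2 = rng.normal(0.0, sd2, n)
        y = 1.5 * x1 - 1.0 * x2 + rng.normal(0.0, 1.0, n)
        x3 = a * y + rng.normal(0.0, 1.0, n)
        envs.append((np.column_stack([x1, x2, x3]), y))
    return envs


envs = simulate()
posterior = subset_posterior(envs, p=3)
best = max(posterior, key=posterior.get)
print("most probable subset:", best, "posterior mass:", round(posterior[best], 3))
```

Enumerating subsets costs 2^p regressions, so this sketch only works for a handful of features. That exponential cost is one reason a scalable approximation, such as the variational inference the abstract mentions, becomes necessary in high dimensions.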