Poster
Robust Root Cause Diagnosis using In-Distribution Interventions
Lokesh Nagalapatti · Ashutosh Srivastava · Sunita Sarawagi · Amit Sharma
Hall 3 + Hall 2B #465
Diagnosing the root cause of an anomaly in a complex interconnected system is a pressing problem in today’s cloud services and industrial operations. We propose In-Distribution Interventions (IDI), a novel algorithm that predicts root cause as nodes that meet two criteria: 1) Anomaly: root cause nodes should take on anomalous values; 2) Fix: had the root cause nodes assumed usual values, the target node would not have been anomalous. Prior methods of assessing the fix condition rely on counterfactuals inferred from a Structural Causal Model (SCM) trained on historical data. But since anomalies are rare and fall outside the training distribution, the fitted SCMs yield unreliable counterfactual estimates. IDI overcomes this by relying on interventional estimates obtained by solely probing the fitted SCM at in-distribution inputs. We present a theoretical analysis comparing and bounding the errors in assessing the fix condition using interventional and counterfactual estimates. We then conduct experiments by systematically varying the SCM’s complexity to demonstrate the cases where IDI’s interventional approach outperforms the counterfactual approach and vice versa. Experiments on both synthetic and PetShop RCD benchmark datasets demonstrate that IDI consistently identifies true root causes more accurately and robustly than nine existing state-of-the-art RCD baselines. Code will be released at https://github.com/nlokeshiisc/IDI_release.
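The two criteria can be illustrated with a minimal sketch. This is not the authors' implementation: the toy chain x1 -> x2 -> y, the linear mechanisms, the z-score anomaly test with threshold 3, and the helper names (`is_anomalous`, `predict_target_under_fix`, `root_causes`) are all assumptions made for illustration. The fix condition is checked by setting a candidate node to its training mean, an in-distribution value, and propagating the fitted conditional means downstream, rather than by inferring noise terms from the anomalous sample.

```python
import numpy as np

# Hypothetical historical data from a toy chain x1 -> x2 -> y (assumed, not from the paper).
rng = np.random.default_rng(0)
n = 2000
x1 = rng.normal(0.0, 1.0, n)
x2 = 2.0 * x1 + rng.normal(0.0, 1.0, n)
y = 3.0 * x2 + rng.normal(0.0, 1.0, n)
train = {"x1": x1, "x2": x2, "y": y}

# "Fitted SCM": one least-squares line per edge (slope, intercept).
a2, b2 = np.polyfit(x1, x2, 1)
a3, b3 = np.polyfit(x2, y, 1)

# Marginal statistics of the training data, used to define "usual" values.
mean = {k: v.mean() for k, v in train.items()}
std = {k: v.std() for k, v in train.items()}


def is_anomalous(node, value, threshold=3.0):
    """Anomaly test (assumed): absolute z-score above the threshold."""
    return abs(value - mean[node]) / std[node] > threshold


def predict_target_under_fix(sample, node):
    """Set `node` to its training mean and propagate fitted means to y."""
    fixed = dict(sample)
    fixed[node] = mean[node]  # in-distribution input to the fitted SCM
    if node == "x1":
        fixed["x2"] = a2 * fixed["x1"] + b2  # x2 is downstream of x1
    fixed["y"] = a3 * fixed["x2"] + b3
    return fixed["y"]


def root_causes(sample):
    """Return candidate nodes that satisfy both the anomaly and fix criteria."""
    found = []
    for node in ("x1", "x2"):
        anomaly = is_anomalous(node, sample[node])
        fix = not is_anomalous("y", predict_target_under_fix(sample, node))
        if anomaly and fix:
            found.append(node)
    return found


# An anomalous sample produced by a large shock injected at x2.
sample = {"x1": 0.3, "x2": 10.6, "y": 32.0}
print(root_causes(sample))
```

In this sketch x1 fails the anomaly criterion because its value is typical, while x2 is anomalous and resetting it to a usual value brings the predicted target back into its normal range. The fitted mechanisms are only ever queried at values close to the training data, which is the property the abstract attributes to the interventional approach.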