Poster in Workshop: The Future of Machine Learning Data Practices and Repositories
The Landscape of Causal Discovery Data: Grounding Causal Discovery in Real-World Applications
Philippe Brouillard · Chandler Squires · Jonas Wahl · Konrad P Kording · Karen Sachs · Alexandre Drouin · Dhanya Sridhar
Causal discovery aims to automatically uncover causal relationships from data, a capability with significant potential across many scientific disciplines. However, its real-world applicability remains limited, mainly due to poor data practices, such as an overreliance on unrealistic datasets and inadequate evaluation metrics. This paper systematically reviews the recent causal discovery literature, highlighting the disconnect between current benchmarking practices and practical applications. We present applications from biology, neuroscience, and Earth sciences—fields where causal discovery holds promise for addressing key challenges. We catalog available simulated and real-world datasets from these domains and discuss common assumption violations that have spurred the development of new methods. Finally, we recommend that the causal discovery community adopt more adequate metrics and use a more diverse range of realistic datasets.