PGM 2018

Structure Learning Under Missing Data

Abstract

Causal discovery is the problem of learning the structure of a graphical causal model that approximates the true generating process that gave rise to observed data. In practical problems, including causal discovery problems, missing data is very common. In such cases, learning the true causal graph entails estimating the full data distribution, samples from which are not directly available. Applying existing structure learning algorithms instead to samples drawn from the observed data distribution, which contain systematically missing entries, may well result in incorrect inferences due to selection bias. In this paper we discuss adjustments that must be made to existing structure learning algorithms to properly account for missing data. We first give an algorithm for the simpler setting where the underlying graph is unknown, but the missing data model is known. We then discuss approaches to the much more difficult case where only the observed data is given, with no additional information about the missingness model. We validate our approach by simulations, showing that it outperforms standard structure learning algorithms in all of these settings.
