INCOLLECTION

Boosting nearest neighbors for the efficient estimation of posteriors

Machine Learning and Knowledge Discovery in Databases, pages 314-329, 2012

Authors

D'Ambrosio, Roberto and Nock, Richard and Ali, Wafa Bel Haj and Nielsen, Frank and Barlaud, Michel

Abstract

Constraint-based search methods, a major approach to learning Bayesian networks, are expected to be effective in causal discovery tasks. However, such methods often suffer from the impracticality of classical hypothesis testing for conditional independence when the sample size is not sufficiently large. We present a new conditional independence (CI) testing method designed to be effective for small samples. Our method uses the minimum free energy principle, which originates in thermodynamics, together with the "Data Temperature" assumption we recently proposed. This CI method incorporates the maximum entropy principle and converges to classical hypothesis tests in the asymptotic regime. In our experiments on repository datasets (Alarm/Insurance/Hailfinder/Barley/Mildew), the results show that our method improves the learning performance of the well-known PC algorithm in terms of edge-reversal errors as well as extra/missing-edge errors.
