A New Method for Vertical Parallelisation of TAN Learning Based on Balanced Incomplete Block Designs

Bibliographic Details
Main Authors: Madsen, Anders L.; Jensen, Frank; Salmerón Cerdán, Antonio; Karlsen, Martin; Langseth, Helge; Nielsen, Thomas D.
Format: info:eu-repo/semantics/article
Language: English
Published: 2017
Online Access: http://hdl.handle.net/10835/4857
Other Bibliographic Details
Summary: The framework of Bayesian networks is a widely popular formalism for performing belief update under uncertainty. Structure-restricted Bayesian network models such as the Naive Bayes Model and Tree-Augmented Naive Bayes (TAN) Model have shown impressive performance for solving classification tasks. However, if the number of variables or the amount of data is large, then learning a TAN model from data can be a time-consuming task. In this paper, we introduce a new method for parallel learning of a TAN model from large data sets. The method is based on computing the mutual information scores between pairs of variables given the class variable in parallel. The computations are organised in parallel using balanced incomplete block designs. The results of a preliminary empirical evaluation of the proposed method on large data sets show that a significant performance improvement is possible through parallelisation using the method presented in this paper.
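
The two ingredients named in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the in-memory column layout, and the choice of the (7, 3, 1) design (the Fano plane) are assumptions made for illustration only, and the record does not say which designs the paper actually uses. The first function estimates the conditional mutual information I(X; Y | C) from counts. The second shows why a balanced incomplete block design suits the task: every pair of variables falls in exactly one block, so blocks can be scored independently without duplicated work.

```python
from collections import Counter
from itertools import combinations
from math import log


def conditional_mutual_information(xs, ys, cs):
    """Empirical I(X; Y | C) in nats for equal-length sequences of discrete values."""
    n = len(cs)
    n_xyc = Counter(zip(xs, ys, cs))
    n_xc = Counter(zip(xs, cs))
    n_yc = Counter(zip(ys, cs))
    n_c = Counter(cs)
    total = 0.0
    for (x, y, c), count in n_xyc.items():
        # p(x,y,c) * log( p(x,y|c) / (p(x|c) * p(y|c)) ), written in terms of counts
        total += (count / n) * log(count * n_c[c] / (n_xc[(x, c)] * n_yc[(y, c)]))
    return total


# Assumed example design: the Fano plane, built by shifting the
# difference set {0, 1, 3} modulo 7. Each block holds 3 of the 7 variables.
FANO_BLOCKS = [
    tuple(sorted(((0 + i) % 7, (1 + i) % 7, (3 + i) % 7))) for i in range(7)
]


def score_all_pairs(columns, cs, blocks=FANO_BLOCKS):
    """Score every variable pair covered by the blocks.

    `columns[i]` is the data column of variable i. Each outer iteration
    touches only the variables of one block, so in a parallel setting it
    could be handed to a separate worker (hypothetical layout).
    """
    scores = {}
    for block in blocks:
        for i, j in combinations(block, 2):
            scores[(i, j)] = conditional_mutual_information(columns[i], columns[j], cs)
    return scores
```

In this sketch, a TAN learner would then build a maximum-weight spanning tree over the variables using the returned scores as edge weights; that step is omitted here.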