Abstract
This paper describes a parallel version of the PC algorithm for learning the structure of a Bayesian network from data. The PC algorithm is a constraint-based algorithm consisting of five steps: the first step performs a set of (conditional) independence tests, and the remaining four steps identify the structure of the Bayesian network from the results of those tests. We describe a new approach to parallelising the (conditional) independence testing, since experiments show that this is by far the most time-consuming step. The proposed parallel PC algorithm is evaluated on data sets generated at random from five different real-world Bayesian networks. The results demonstrate that the proposed algorithm can significantly reduce running time.
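As a rough illustration of why the independence-testing step lends itself to parallelisation, the following minimal sketch distributes marginal (order-0) independence tests over a pool of workers. It is not the approach proposed in the paper: the function names (`g2_statistic`, `check_pair`, `parallel_marginal_tests`), the restriction to binary variables, the G² statistic, and the fixed 5% critical value are all assumptions made for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations
import math

# 0.95 quantile of the chi-square distribution with 1 degree of freedom,
# which applies to a 2x2 table (assumption: all variables are binary).
CRITICAL_DF1 = 3.841


def g2_statistic(data, i, j):
    """G^2 statistic for marginal independence of binary columns i and j."""
    n = len(data)
    counts = [[0, 0], [0, 0]]
    for row in data:
        counts[row[i]][row[j]] += 1
    row_tot = [sum(counts[a]) for a in (0, 1)]
    col_tot = [counts[0][b] + counts[1][b] for b in (0, 1)]
    g2 = 0.0
    for a in (0, 1):
        for b in (0, 1):
            observed = counts[a][b]
            expected = row_tot[a] * col_tot[b] / n
            if observed > 0 and expected > 0:
                g2 += 2.0 * observed * math.log(observed / expected)
    return g2


def check_pair(task):
    """Return ((i, j), True) when independence of i and j is not rejected."""
    data, i, j = task
    return (i, j), g2_statistic(data, i, j) <= CRITICAL_DF1


def parallel_marginal_tests(data, n_vars, workers=4):
    """Run every pairwise test as an independent task on a worker pool."""
    tasks = [(data, i, j) for i, j in combinations(range(n_vars), 2)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(check_pair, tasks))
```

Each test reads the data but shares no mutable state with the others, so the tests can be scheduled in any order. In CPython a thread pool will not speed up this pure-Python arithmetic; `ProcessPoolExecutor` from the same module exposes the same `map` interface and could be swapped in for CPU-bound work.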
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Madsen, A.L., Jensen, F., Salmerón, A., Langseth, H., Nielsen, T.D. (2015). Parallelisation of the PC Algorithm. In: Puerta, J., et al. Advances in Artificial Intelligence. CAEPIA 2015. Lecture Notes in Computer Science, vol 9422. Springer, Cham. https://doi.org/10.1007/978-3-319-24598-0_2
DOI: https://doi.org/10.1007/978-3-319-24598-0_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24597-3
Online ISBN: 978-3-319-24598-0
eBook Packages: Computer Science (R0)