Task-Parallel FP-Growth on Cluster Computers

Özdogan, Gülistan Özdemir; Abul, Osman

doi:10.1007/978-90-481-9794-1_71

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 62)

Abstract

Frequent itemset mining (FIM) is one of the most deeply studied data mining tasks. A number of algorithms, employing different approaches and advanced data structures, have already been proposed to solve the task efficiently. Even the fastest serial FIM algorithms fail to scale up with the rapid growth of database sizes. Hence, parallel FIM algorithms are the only viable solutions in many domains, as serial solutions have almost reached the physical barriers. To this end, parallel versions of a few serial FIM algorithms, including FP-Growth, have already been developed. In this study, we develop three different parallel FP-Growth implementations for cluster computers. All three are MPI-based: (i) Static Parallel FP-Growth, (ii) Dynamic Parallel FP-Growth, and (iii) (Tree-Sharing) Dynamic Parallel FP-Growth. All three variants are task-parallel, i.e., not based on horizontal or vertical partitioning of the database. The algorithms are experimentally evaluated on a 16-node cluster computer. Our results demonstrate the utility of the algorithms.
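
The abstract distinguishes static from dynamic task-parallel variants without describing how tasks are scheduled. As a rough illustration of that general distinction only, and not the authors' MPI implementation, the sketch below compares a fixed up-front assignment of independent mining tasks with an on-demand assignment to whichever worker is idle. The function names, the round-robin rule, and the task costs are all hypothetical.

```python
import heapq


def static_makespan(task_costs, n_workers):
    """Assign task i to worker i % n_workers up front (assumed round-robin rule).

    Returns the finish time of the most heavily loaded worker.
    """
    loads = [0] * n_workers
    for i, cost in enumerate(task_costs):
        loads[i % n_workers] += cost
    return max(loads)


def dynamic_makespan(task_costs, n_workers):
    """Hand each task, in order, to the worker that becomes idle first.

    Returns the time at which the last worker finishes.
    """
    free_at = [0] * n_workers  # min-heap of times at which each worker is idle
    heapq.heapify(free_at)
    for cost in task_costs:
        start = heapq.heappop(free_at)         # earliest idle worker
        heapq.heappush(free_at, start + cost)  # busy until start + cost
    return max(free_at)


# Hypothetical, deliberately skewed task costs (e.g. one task per frequent item).
costs = [9, 1, 1, 1, 8, 1, 1, 1]
print(static_makespan(costs, 4), dynamic_makespan(costs, 4))
```

With these made-up costs the fixed assignment places both expensive tasks on the same worker, while the on-demand assignment spreads them out. Real FP-Growth task costs are not known in advance, which is the usual motivation for dynamic scheduling.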

Supported by TUBITAK, grant number 108E016

Author information

Authors and Affiliations

Department of Computer Engineering, TOBB University of Economics and Technology, Ankara, Turkey
Gülistan Özdemir Özdogan & Osman Abul

Corresponding author

Correspondence to Gülistan Özdemir Özdogan.

Editor information

Editors and Affiliations

EEE Dept., Imperial College, Exhibition Road, London, SW72BT, United Kingdom
Erol Gelenbe
EEE Dept., Imperial College, Exhibition Rd., London, SW72AZ, United Kingdom
Ricardo Lent
EEE Dept., Imperial College, Exhibition Rd., London, SW72AZ, United Kingdom
Georgia Sakellari
School of Biomedical Eng., Sci. and Heal, Drexel University, Bossone 702, 3120 Market Street, Philadelphia, 19104, Pennsylvania, USA
Ahmet Sacan
Dept. of Computer Engineering, Middle East Technical University, Ankara, 06531, Turkey
Hakki Toroslu
Fac. Engineering, Dept. Computer Engineering, Middle East Technical University - METU, Ankara, 06531, Turkey
Adnan Yazici

About this paper

Cite this paper

Özdogan, G.Ö., Abul, O. (2011). Task-Parallel FP-Growth on Cluster Computers. In: Gelenbe, E., Lent, R., Sakellari, G., Sacan, A., Toroslu, H., Yazici, A. (eds) Computer and Information Sciences. Lecture Notes in Electrical Engineering, vol 62. Springer, Dordrecht. https://doi.org/10.1007/978-90-481-9794-1_71

DOI: https://doi.org/10.1007/978-90-481-9794-1_71
Published: 18 August 2010
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-9793-4
Online ISBN: 978-90-481-9794-1
eBook Packages: Engineering, Engineering (R0)
