Comparative Analysis of Different Versions of Association Rule Mining Algorithm on AWS-EC2

Saabith, Ahamed Lebbe Sayeth; Sundararajan, Elankovan; Bakar, Azuraliza Abu

doi:10.1007/978-3-319-25939-0_6


Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9429)

Included in the following conference series:

International Visual Informatics Conference


Abstract

Data mining is an essential step in the knowledge discovery in databases (KDD) process: it analyzes huge amounts of data from different perspectives and summarizes them into potentially valuable, valid, novel, interesting, and previously unknown information. Because extracting knowledge from massive data repositories is so important, data mining has become an essential component in various fields. Association rule mining (ARM) is one of the most important and well-researched data mining techniques. It aims to extract essential relationships, frequent patterns, and associations among itemsets in transaction databases or other data repositories. Many algorithms have been proposed to find frequent itemsets efficiently. In this research, we chose four well-established frequent itemset mining methods, namely Apriori, Apriori TID, Eclat, and FP-Growth, and analyzed their performance in a cloud environment. Cloud computing is a new paradigm for analyzing big data efficiently and cost-effectively. In this study, we ran the algorithms on the Amazon Web Services (AWS) platform using the Elastic Compute Cloud (EC2) service. We then compared the four algorithms on their execution time while varying the minimum support (min_sup) value.
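To make the comparison described in the abstract concrete, the following is a minimal sketch of two of the four methods, Apriori (level-wise candidate generation) and Eclat (vertical transaction-id set intersection), together with a small timing helper that varies min_sup. It is not the implementation benchmarked in the paper. The toy transactions, the function names, and the use of an absolute support count for min_sup (rather than a percentage) are all assumptions made for illustration.

```python
import time
from itertools import combinations

# Toy market-basket data (made up for illustration, not from the paper).
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

def apriori(transactions, min_sup):
    """Level-wise search; returns {itemset: support count} with count >= min_sup."""
    rows = [frozenset(t) for t in transactions]
    counts = {}
    for row in rows:
        for item in row:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_sup}
    frequent = dict(current)
    k = 2
    while current:
        prev = set(current)
        # Join step: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        current = {}
        for cand in candidates:
            support = sum(1 for row in rows if cand <= row)  # one scan per candidate
            if support >= min_sup:
                current[cand] = support
        frequent.update(current)
        k += 1
    return frequent

def eclat(transactions, min_sup):
    """Depth-first search over vertical tidsets; same return shape as apriori."""
    tidsets = {}
    for tid, row in enumerate(transactions):
        for item in set(row):
            tidsets.setdefault(item, set()).add(tid)
    frequent = {}

    def extend(prefix, items):
        # items: (item, tidset) pairs already known to be frequent with prefix.
        for i, (item, tids) in enumerate(items):
            itemset = prefix | {item}
            frequent[itemset] = len(tids)
            suffix = []
            for other, other_tids in items[i + 1:]:
                shared = tids & other_tids  # support of itemset plus other
                if len(shared) >= min_sup:
                    suffix.append((other, shared))
            extend(itemset, suffix)

    start = sorted(((i, t) for i, t in tidsets.items() if len(t) >= min_sup),
                   key=lambda pair: pair[0])
    extend(frozenset(), start)
    return frequent

def time_runs(algorithms, transactions, min_sups):
    """Return {(name, min_sup): elapsed seconds} for each algorithm and threshold."""
    timings = {}
    for name, func in algorithms.items():
        for min_sup in min_sups:
            started = time.perf_counter()
            func(transactions, min_sup)
            timings[(name, min_sup)] = time.perf_counter() - started
    return timings
```

Calling `time_runs({"apriori": apriori, "eclat": eclat}, transactions, [2, 3, 4])` gives one timing per algorithm and threshold. On data this small the numbers are dominated by noise, so a meaningful comparison would need realistically sized datasets and repeated runs.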



Acknowledgment

We wish to thank Universiti Kebangsaan Malaysia (UKM) and the Ministry of Higher Education Malaysia for supporting this work through research grant ERGS/1/2013/ICT07/UKM/02/3.

Author information

Authors and Affiliations

Faculty of Information Science and Technology, Centre for Software Technology and Management, Universiti Kebangsaan Malaysia, UKM, 43600, Bangi, Selangor-DE, Malaysia
Ahamed Lebbe Sayeth Saabith & Elankovan Sundararajan
Faculty of Information Science and Technology, Center for Artificial Intelligence and Technology, Universiti Kebangsaan Malaysia, UKM, 43600, Bangi, Selangor-DE, Malaysia
Azuraliza Abu Bakar


Corresponding author

Correspondence to Ahamed Lebbe Sayeth Saabith.

Editor information

Editors and Affiliations

Fac Info Science and Techn, Universiti Kebangsaan Malaysia, Selangor, Malaysia
Halimah Badioze Zaman
University of Cambridge, Cambridge, United Kingdom
Peter Robinson
Center for Digital Video Process, Dublin 9, Ireland
Alan F. Smeaton
Computer Science and Information Enginee, National Central University, Jhongli City, Taiwan
Timothy K. Shih
Kingston University, Kingston upon Thames, United Kingdom
Sergio Velastin
Universiti Kebangsaan Malaysia, Institute of Visual Informatics, Bangi, Malaysia
Azizah Jaafar
Universiti Kebangsaan Malaysia, Institute of Visual Informatics, Bangi, Malaysia
Nazlena Mohamad Ali


About this paper

Cite this paper

Saabith, A.L.S., Sundararajan, E., Bakar, A.A. (2015). Comparative Analysis of Different Versions of Association Rule Mining Algorithm on AWS-EC2. In: Badioze Zaman, H., et al. Advances in Visual Informatics. IVIC 2015. Lecture Notes in Computer Science, vol 9429. Springer, Cham. https://doi.org/10.1007/978-3-319-25939-0_6


DOI: https://doi.org/10.1007/978-3-319-25939-0_6
Published: 27 October 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25938-3
Online ISBN: 978-3-319-25939-0
eBook Packages: Computer Science, Computer Science (R0)
