
Comparative Analysis of Different Versions of Association Rule Mining Algorithm on AWS-EC2

  • Conference paper
Advances in Visual Informatics (IVIC 2015)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9429)


Abstract

Data mining is an essential step in the knowledge discovery in databases (KDD) process: it analyzes huge amounts of data from different perspectives and summarizes them into potentially valuable, valid, novel, interesting, and previously unknown information. Because extracting knowledge from massive data repositories is so important, data mining has become an essential component in various fields. Association rule mining (ARM) is one of the most important and well-researched data mining techniques. It aims to extract essential relationships, frequent patterns, and associations among itemsets in transaction databases or other data repositories. Many algorithms have been proposed to find frequent itemsets efficiently. In this research, we chose four well-established frequent itemset mining methods, namely Apriori, Apriori TID, Eclat, and FP-Growth, and analyzed their performance in a cloud environment. Cloud computing is a new paradigm for analyzing big data efficiently and cost-effectively. In this study, we ran the algorithms on the Amazon Web Services (AWS) platform using the Elastic Compute Cloud (EC2) service. We then compared the four algorithms by execution time while varying the minimum support (min_sup) value.
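To make the experimental variable concrete, the following is a minimal Python sketch of Apriori-style frequent itemset mining, timed at several min_sup values. It is an illustration only, not the implementation evaluated in the paper. The function name `apriori`, the toy transaction database `TOY_DB`, and the chosen thresholds are all assumptions introduced here, and min_sup is taken to be a relative support (fraction of transactions).

```python
import time
from itertools import combinations


def apriori(transactions, min_sup):
    """Return {itemset: support_count} for itemsets whose relative support >= min_sup.

    Illustrative sketch; `min_sup` is assumed to be a fraction in (0, 1].
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    # Level 1: count individual items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c / n >= min_sup}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: unions of frequent (k-1)-itemsets that contain exactly k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Support counting: one scan over the transactions per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c / n >= min_sup}
        result.update(frequent)
        k += 1
    return result


# Hypothetical toy transaction database, used only for illustration.
TOY_DB = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

if __name__ == "__main__":
    # Lower thresholds admit more itemsets and therefore more work.
    for min_sup in (0.8, 0.6, 0.4):
        start = time.perf_counter()
        found = apriori(TOY_DB, min_sup)
        elapsed = time.perf_counter() - start
        print(f"min_sup={min_sup:.1f}: {len(found)} frequent itemsets in {elapsed:.6f}s")
```

On a dataset this small the timings are dominated by noise; the sketch only shows why lowering min_sup enlarges the candidate space, which is the effect an execution-time comparison of this kind measures.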

Acknowledgment

We wish to thank Universiti Kebangsaan Malaysia (UKM) and the Ministry of Higher Education Malaysia for supporting this work through research grant ERGS/1/2013/ICT07/UKM/02/3.

Author information

Corresponding author

Correspondence to Ahamed Lebbe Sayeth Saabith.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Saabith, A.L.S., Sundararajan, E., Bakar, A.A. (2015). Comparative Analysis of Different Versions of Association Rule Mining Algorithm on AWS-EC2. In: Badioze Zaman, H., et al. Advances in Visual Informatics. IVIC 2015. Lecture Notes in Computer Science, vol 9429. Springer, Cham. https://doi.org/10.1007/978-3-319-25939-0_6

  • DOI: https://doi.org/10.1007/978-3-319-25939-0_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25938-3

  • Online ISBN: 978-3-319-25939-0

  • eBook Packages: Computer Science, Computer Science (R0)
