Comparative Analysis of Different Versions of Association Rule Mining Algorithm on AWS-EC2

  • Ahamed Lebbe Sayeth Saabith (email author)
  • Elankovan Sundararajan
  • Azuraliza Abu Bakar
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9429)

Abstract

Data mining is an essential step of the knowledge discovery in databases (KDD) process: it analyzes large amounts of data from different perspectives and summarizes them into potentially valuable, valid, novel, interesting, and previously unknown information. Because of the importance of extracting knowledge from massive data repositories, data mining is an essential component in various fields. Association rule mining (ARM) is one of the most important and well-researched data mining techniques. It aims to extract essential relationships, frequent patterns, and associations among itemsets in transaction databases or other data repositories. Many algorithms have been proposed to find frequent itemsets efficiently. In this research, we chose four well-established frequent itemset mining methods, namely Apriori, Apriori TID, Eclat, and FP-Growth, and analyzed their performance in a cloud environment. Cloud computing is a new paradigm for analyzing big data efficiently and cost-effectively. In this study, we ran the algorithms on the Amazon Web Services (AWS) platform using the Elastic Compute Cloud (EC2) service. We then compared the four algorithms by execution time while varying the minimum support (min_sup) value.
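To make the experimental setup concrete, the following is a minimal sketch of the kind of measurement described above: a plain Apriori frequent-itemset miner timed at several min_sup values. It is not the authors' implementation and does not reproduce their datasets or EC2 configuration. The function name `apriori`, the toy `transactions` list, and the treatment of min_sup as a fraction of the number of transactions are all assumptions made for illustration.

```python
import time
from itertools import combinations


def apriori(transactions, min_sup):
    """Return {itemset: support count} for itemsets whose relative
    support is at least min_sup (a fraction between 0 and 1)."""
    tsets = [frozenset(t) for t in transactions]
    min_count = min_sup * len(tsets)

    # Pass 1: count single items and keep the frequent ones.
    counts = {}
    for t in tsets:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_count}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into size-k candidates.
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        # Count step: scan the transactions once per level.
        counts = {c: sum(1 for t in tsets if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_count}
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical toy dataset, for illustration only.
transactions = [
    ["bread", "milk"],
    ["bread", "diaper", "beer", "eggs"],
    ["milk", "diaper", "beer", "cola"],
    ["bread", "milk", "diaper", "beer"],
    ["bread", "milk", "diaper", "cola"],
]

# Time the miner at several minimum support thresholds.
for min_sup in (0.8, 0.6, 0.4):
    start = time.perf_counter()
    result = apriori(transactions, min_sup)
    elapsed = time.perf_counter() - start
    print(f"min_sup={min_sup:.1f}  itemsets={len(result)}  time={elapsed:.6f}s")
```

Lowering min_sup generally enlarges the set of frequent itemsets and the number of candidates to count, which is why execution time as a function of min_sup is a natural basis for comparing the four algorithms.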

Keywords

KDD, ARM, Cloud computing, AWS-EC2, Data mining

Notes

Acknowledgment

We wish to thank Universiti Kebangsaan Malaysia (UKM) and the Ministry of Higher Education Malaysia for supporting this work under research grant ERGS/1/2013/ICT07/UKM/02/3.

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Ahamed Lebbe Sayeth Saabith (1), email author
  • Elankovan Sundararajan (1)
  • Azuraliza Abu Bakar (2)
  1. Faculty of Information Science and Technology, Centre for Software Technology and Management, Universiti Kebangsaan Malaysia, UKM, Bangi, Malaysia
  2. Faculty of Information Science and Technology, Center for Artificial Intelligence and Technology, Universiti Kebangsaan Malaysia, UKM, Bangi, Malaysia