How to exploit high performance computing in population-based metaheuristics for solving association rule mining problem

Distributed and Parallel Databases

Abstract

This paper explores the application of population-based metaheuristics to the association rule mining (ARM) problem. It investigates the combination of GPU and cluster-based parallel computing to accelerate the extraction of correlations between items in sizeable data instances. We propose four parallel approaches that exploit cluster-intensive computing in the generation process and massive GPU threading in the evaluation process, where the association rules are evaluated in parallel on the GPU. To validate the proposed approaches, the most widely used population-based metaheuristics (GA, PSO, and BSO) were executed on a cluster of GPUs to solve benchmarks of large and big ARM instances. We used a 64-bit quad-core Intel Xeon E5520 processor coupled with an Nvidia Tesla C2075 GPU device. The results show that BSO outperforms GA and PSO. They also show that the proposed solution outperforms the HPC-based ARM approaches on the Webdocs instance (the largest instance available on the web). To our knowledge, this is the first work to combine GPU and cluster-based parallel computing with population-based metaheuristics in association rule mining.
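
The evaluation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a rule is scored by its support and confidence over a list of transactions, and the names `evaluate_rule` and `transactions` are hypothetical. Each loop iteration is independent of the others, which is the property that would let a GPU assign one transaction to one thread and then combine the counts with a reduction.

```python
from typing import FrozenSet, List, Tuple

Transaction = FrozenSet[str]


def evaluate_rule(
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
    transactions: List[Transaction],
) -> Tuple[float, float]:
    """Return (support, confidence) of the rule antecedent -> consequent.

    Hypothetical helper for illustration only; the scoring used in the
    paper is not specified in the abstract.
    """
    antecedent_hits = 0  # transactions containing the antecedent
    rule_hits = 0        # transactions containing antecedent and consequent

    # Every iteration reads one transaction and touches no shared state
    # except the two counters, so the loop is a natural candidate for a
    # per-transaction GPU thread followed by a sum reduction.
    for transaction in transactions:
        if antecedent <= transaction:
            antecedent_hits += 1
            if consequent <= transaction:
                rule_hits += 1

    total = len(transactions)
    support = rule_hits / total if total else 0.0
    confidence = rule_hits / antecedent_hits if antecedent_hits else 0.0
    return support, confidence


# Toy data, made up for this example.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

support, confidence = evaluate_rule(
    frozenset({"bread"}), frozenset({"milk"}), transactions
)
print(support, confidence)
```

In a population-based metaheuristic, a function of this kind would be called once per candidate rule in every iteration, which is why the evaluation tends to dominate the run time on large instances.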

Notes

  1. Ibnbadis is a cluster of CERIST research center, Algiers, Algeria.

  2. http://fimi.ua.ac.be/data/.

  3. https://sourceforge.net/projects/ibmquestdatagen/.

  4. http://fimi.ua.ac.be/data/webdocs.

  5. Ibnbadis is a cluster of CERIST research center, Algiers, Algeria.


Author information

Corresponding author

Correspondence to Youcef Djenouri.

Cite this article

Djenouri, Y., Djenouri, D., Habbas, Z. et al. How to exploit high performance computing in population-based metaheuristics for solving association rule mining problem. Distrib Parallel Databases 36, 369–397 (2018). https://doi.org/10.1007/s10619-018-7218-4

  • DOI: https://doi.org/10.1007/s10619-018-7218-4
