Abstract
This paper explores the application of population-based metaheuristics to the association rule mining (ARM) problem. It investigates the combination of GPU and cluster-based parallel computing to accelerate the extraction of correlations between items in large data instances. We propose four parallel approaches that exploit cluster computing in the rule generation process and massive GPU threading in the evaluation process, where the association rules are evaluated in parallel on the GPU. To validate the proposed approaches, the most widely used population-based metaheuristics, namely genetic algorithms (GA), particle swarm optimization (PSO), and bees swarm optimization (BSO), were executed on a cluster of GPUs to solve benchmarks of large and big ARM instances. The experiments used an Intel Xeon E5520 64-bit quad-core processor coupled with an Nvidia Tesla C2075 GPU device. The results show that BSO outperforms GA and PSO. They also show that the proposed solution outperforms HPC-based ARM approaches on the Webdocs instance (the largest instance available on the web). To our knowledge, this is the first work that explores the combination of GPU and cluster-based parallel computing with population-based metaheuristics in association rule mining.
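As a rough illustration of the evaluation step mentioned in the abstract, the sketch below scores each candidate rule independently against the transaction set, which is the property that makes the step easy to parallelize. It is not the authors' implementation: the use of support and confidence as measures, the function names `evaluate_rule` and `evaluate_population`, the toy transactions, and the CPU thread pool standing in for GPU threads are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate_rule(rule, transactions):
    """Return (support, confidence) of one rule, antecedent -> consequent.

    Hypothetical helper: support and confidence are the standard ARM
    measures, assumed here rather than taken from the paper.
    """
    antecedent, consequent = rule
    both = antecedent | consequent
    # Count transactions containing the antecedent, and the whole rule.
    n_antecedent = sum(1 for t in transactions if antecedent <= t)
    n_both = sum(1 for t in transactions if both <= t)
    support = n_both / len(transactions)
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


def evaluate_population(rules, transactions, workers=4):
    """Evaluate every rule independently; results keep the input order.

    A thread pool stands in for the GPU threads of the paper (assumption).
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda r: evaluate_rule(r, transactions), rules))


# Toy data, invented for this example only.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]
rules = [
    (frozenset({"bread"}), frozenset({"milk"})),
    (frozenset({"milk"}), frozenset({"butter"})),
]
results = evaluate_population(rules, transactions)
```

Because no rule's score depends on another's, the same structure maps onto one GPU thread or block per rule, while the metaheuristic that generates the rules can run elsewhere.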
Notes
Ibnbadis is a cluster of CERIST research center, Algiers, Algeria.
Cite this article
Djenouri, Y., Djenouri, D., Habbas, Z. et al. How to exploit high performance computing in population-based metaheuristics for solving association rule mining problem. Distrib Parallel Databases 36, 369–397 (2018). https://doi.org/10.1007/s10619-018-7218-4
DOI: https://doi.org/10.1007/s10619-018-7218-4