Harmony Search for Data Mining with Big Data

  • Jerzy Balicki
  • Piotr Dryja
  • Waldemar Korłub
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9842)


This paper proposes harmony search algorithms for data mining with big data. Three areas of big data processing are studied as applications of these metaheuristics. The first concerns the MapReduce architecture, which can be supported by a team of harmony search agents in a grid infrastructure. The second applies harmony search to the preprocessing of data series before data mining. The third studies harmony search as a classification algorithm. Finally, results of numerical experiments are presented.
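For readers unfamiliar with the metaheuristic, the following is a minimal sketch of the generic harmony search loop, not the specific variants proposed in the paper. It assumes a continuous minimization problem; the function name `harmony_search`, the parameter names (`hms` for harmony memory size, `hmcr` for harmony memory considering rate, `par` for pitch adjusting rate, `bw` for bandwidth), their default values, and the `sphere` test objective are all illustrative assumptions.

```python
import random


def harmony_search(objective, bounds, hms=10, hmcr=0.9, par=0.3, bw=0.05,
                   iterations=2000, seed=0):
    """Minimize `objective` over box `bounds` with a basic harmony search.

    All parameter names and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Harmony memory: `hms` random candidate solutions inside the bounds.
    memory = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(hms)]
    scores = [objective(h) for h in memory]

    for _ in range(iterations):
        new = []
        for j, (lo, hi) in enumerate(bounds):
            if rng.random() < hmcr:
                # Memory consideration: reuse a value stored for variable j.
                value = memory[rng.randrange(hms)][j]
                if rng.random() < par:
                    # Pitch adjustment: small perturbation scaled by the range.
                    value += rng.uniform(-1.0, 1.0) * bw * (hi - lo)
            else:
                # Random selection: draw a fresh value from the whole range.
                value = rng.uniform(lo, hi)
            new.append(min(hi, max(lo, value)))  # clamp to the bounds

        new_score = objective(new)
        worst = max(range(hms), key=scores.__getitem__)
        if new_score < scores[worst]:
            # The improvised harmony replaces the worst one in memory.
            memory[worst], scores[worst] = new, new_score

    best = min(range(hms), key=scores.__getitem__)
    return memory[best], scores[best]


def sphere(x):
    """Hypothetical test objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best, score = harmony_search(sphere, [(-5.0, 5.0)] * 3)
print(best, score)
```

Applying such a loop to the tasks named in the abstract (agent scheduling under MapReduce, preprocessing, classification) would require a problem-specific solution encoding and objective, which this sketch does not attempt to reproduce.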



Copyright information

© IFIP International Federation for Information Processing 2016

Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 2.5 International License, which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Authors and Affiliations

  1. Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
  2. Faculty of Telecommunications, Electronics and Informatics, Gdańsk University of Technology, Gdańsk, Poland
