Adapting a Multi-SOM Clustering Algorithm to Large Banking Data

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 745)

Abstract

In recent years, Big Data (BD) has attracted researchers in many domains as a new concept that offers opportunities to improve research applications, including business, science, and engineering. Big Data Analytics is becoming a practice that many researchers adopt to extract valuable information from BD. This paper presents BD technologies and shows how BD is useful in cluster analysis. It then studies a clustering approach named multi-SOM. To this end, a banking dataset is analyzed by integrating the R statistical tool with BD technologies that include the Hadoop Distributed File System, HBase, and MapReduce. The aim is to decrease the execution time of the multi-SOM clustering method when determining the number of clusters using R and Hadoop. Results show the performance of integrating R and Hadoop to handle big data with the multi-SOM clustering algorithm and to overcome the weaknesses of R.

Acknowledgement

We are grateful to Mohamed Rahal for his helpful comments and suggestions.

Author information

Corresponding author

Correspondence to Imèn Khanchouch.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Khanchouch, I., Limam, M. (2018). Adapting a Multi-SOM Clustering Algorithm to Large Banking Data. In: Rocha, Á., Adeli, H., Reis, L.P., Costanzo, S. (eds) Trends and Advances in Information Systems and Technologies. WorldCIST'18 2018. Advances in Intelligent Systems and Computing, vol 745. Springer, Cham. https://doi.org/10.1007/978-3-319-77703-0_17

  • DOI: https://doi.org/10.1007/978-3-319-77703-0_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-77702-3

  • Online ISBN: 978-3-319-77703-0

  • eBook Packages: Engineering, Engineering (R0)
