Adapting a Multi-SOM Clustering Algorithm to Large Banking Data
In recent years, Big Data (BD) has attracted researchers in many domains as a new concept that offers opportunities to improve research applications in business, science, and engineering. Big Data Analytics is becoming a common practice for extracting valuable information from BD. This paper presents BD technologies and shows how BD is useful in cluster analysis. A clustering approach named multi-SOM is then studied: a banking dataset is analyzed by integrating the R statistical tool with BD technologies, namely the Hadoop Distributed File System (HDFS), HBase, and MapReduce. The aim is to reduce the execution time of the multi-SOM clustering method when determining the number of clusters using R and Hadoop. Results show the performance of integrating R and Hadoop for handling big data with the multi-SOM clustering algorithm and for overcoming the weaknesses of R.
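The abstract does not show how the SOM training is distributed, so the following is only a minimal sketch of one common way to express a batch SOM update in map/reduce form. It assumes a Gaussian neighborhood and uses in-memory Python lists as stand-ins for HDFS splits. All function names (`bmu_index`, `map_partition`, `reduce_partials`, `train_batch_som`) and parameters are hypothetical and are not taken from the paper or from RHadoop. The multi-SOM layering and the selection of the number of clusters are not covered.

```python
import math

def bmu_index(weights, x):
    """Index of the neuron whose weight vector is closest to record x."""
    return min(range(len(weights)),
               key=lambda k: sum((w - v) ** 2 for w, v in zip(weights[k], x)))

def map_partition(partition, weights, coords, sigma):
    """Map step: neighborhood-weighted partial sums for one data split."""
    dim = len(weights[0])
    num = [[0.0] * dim for _ in weights]   # per-neuron weighted sum of records
    den = [0.0] * len(weights)             # per-neuron sum of neighborhood values
    for x in partition:
        br, bc = coords[bmu_index(weights, x)]
        for k, (r, c) in enumerate(coords):
            # Gaussian neighborhood on the map grid (an assumption of this sketch)
            h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2.0 * sigma ** 2))
            den[k] += h
            for j in range(dim):
                num[k][j] += h * x[j]
    return num, den

def reduce_partials(partials, weights):
    """Reduce step: merge the partial sums and compute the new codebook."""
    dim = len(weights[0])
    updated = []
    for k in range(len(weights)):
        den = sum(p[1][k] for p in partials)
        if den == 0.0:
            updated.append(list(weights[k]))  # neuron received no mass; keep it
        else:
            updated.append([sum(p[0][k][j] for p in partials) / den
                            for j in range(dim)])
    return updated

def train_batch_som(partitions, weights, coords,
                    epochs=10, sigma_start=1.0, sigma_end=0.3):
    """Run several map/reduce rounds while shrinking the neighborhood radius."""
    for epoch in range(epochs):
        frac = epoch / max(1, epochs - 1)
        sigma = sigma_start + frac * (sigma_end - sigma_start)
        partials = [map_partition(p, weights, coords, sigma) for p in partitions]
        weights = reduce_partials(partials, weights)
    return weights

# Toy usage: a 1x2 map trained on two splits of made-up 2-D records.
coords = [(0, 0), (0, 1)]
init = [[0.0, 0.0], [10.0, 10.0]]
splits = [[(0, 0), (0, 1), (10, 10)],
          [(1, 0), (1, 1), (10, 11), (11, 10), (11, 11)]]
codebook = train_batch_som(splits, init, coords)
```

Because each split contributes only sums, the reduce step gives the same codebook however the records are partitioned, up to floating-point rounding. This is the property that makes the batch formulation a natural fit for MapReduce.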
Keywords: Big data, Big data analytics, Clustering, multiSOM, RHadoop
We are grateful to Mohamed Rahal for his helpful comments and suggestions.