Determination of Customer Satisfaction using Improved K-means algorithm

Abstract

Effective management of customer knowledge leads to efficient Customer Relationship Management (CRM). Clustering, and K-means in particular, is one of the most important data mining techniques used in CRM marketing to predict customer behaviour accurately: it identifies customers' behavioural patterns so that marketing strategies can be aligned with customer preferences and customers can be retained. However, various studies of K-means clustering have observed that customers with different behavioural indicators may appear identical after clustering, which implies that behavioural indicators play no significant role in the resulting clusters. Consequently, when the level of customer participation depends on behavioural parameters such as satisfaction, the K-means clusters are adversely affected and the results are not acceptable. In this paper, a customer behavioural feature (the malicious feature) is taken into account in customer clustering, together with a method for finding the optimal number of clusters and the initial values of the cluster centres, in order to obtain more accurate results. Finally, because organizations need to extract knowledge from customers' views by ranking customers according to the factors that affect customer value, a method is proposed for modelling customer behaviour and extracting knowledge for customer relationship management. An evaluation on the customers of the Hamkaran System Company shows that the improved K-means method proposed in this paper outperforms standard K-means in terms of speed and accuracy.
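The abstract does not spell out how the number of clusters or the initial centres are chosen, so the following is only an illustrative sketch of the general idea, not the authors' improved method. It assumes two commonly used stand-ins: k-means++ style seeding for the initial centres and the mean silhouette coefficient for picking the number of clusters. The function names, the toy data and the two features (a satisfaction score and a purchase-frequency score) are hypothetical.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def seed_centres(points, k, rng):
    """k-means++ style seeding (an assumed stand-in): spread centres apart."""
    centres = [rng.choice(points)]
    while len(centres) < k:
        # Weight each point by its squared distance to the nearest centre.
        weights = [min(sq_dist(p, c) for c in centres) for p in points]
        if sum(weights) == 0:  # every point coincides with a centre
            centres.append(rng.choice(points))
        else:
            centres.append(rng.choices(points, weights=weights, k=1)[0])
    return centres


def kmeans(points, k, rng, max_iter=100):
    """Plain Lloyd iterations; returns (labels, centres)."""
    centres = seed_centres(points, k, rng)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centres[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old centre if a cluster empties
                centres[j] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return labels, centres


def silhouette(points, labels):
    """Mean silhouette coefficient; higher means better-separated clusters."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # singleton clusters contribute 0 by convention
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def choose_k(points, k_values, seed=0):
    """Run K-means for each candidate k and keep the best silhouette."""
    rng = random.Random(seed)
    best = None
    for k in k_values:
        labels, centres = kmeans(points, k, rng)
        score = silhouette(points, labels)
        if best is None or score > best[0]:
            best = (score, k, labels, centres)
    return best


# Hypothetical customers: (satisfaction score, purchase-frequency score).
points = [(1.0, 1.2), (1.2, 0.9), (0.8, 1.0), (1.1, 1.1),
          (5.0, 5.1), (5.2, 4.9), (4.9, 5.0), (5.1, 5.2),
          (9.0, 1.0), (9.2, 1.1), (8.9, 0.9), (9.1, 1.2)]

score, k, labels, centres = choose_k(points, range(2, 6))
print("chosen k:", k)
print("labels:", labels)
```

On real customer data the features would normally be standardised first, and the silhouette loop, which is quadratic in the number of customers, would need sampling or a cheaper validity index.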

Author information

Corresponding author

Correspondence to Sima Emadi.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Human and animal rights

This article does not contain any studies with human participants or animals performed by the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Communicated by V. Loia.

About this article

Cite this article

Zare, H., Emadi, S. Determination of Customer Satisfaction using Improved K-means algorithm. Soft Comput (2020). https://doi.org/10.1007/s00500-020-04988-4

Keywords

  • Customer relationship management
  • K-means
  • Customer life cycle value
  • Data mining
  • Customer satisfaction