Coherent Method for Determining the Initial Cluster Center

  • Bikram Keshari Mishra
  • Amiya Kumar Rath
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 711)

Abstract

Much current research on clustering focuses on finding near-optimal cluster centers and on obtaining the best possible clusters for the objects being grouped. A poor choice of cluster center may pull a data point far away from its actual cluster, resulting in deficient clustering. We therefore concentrate on determining near-optimal cluster centers and on placing the data in their true clusters. We explore three clustering techniques, namely K-Means, FEKM (Far Efficient K-Means)-based clustering, and TLBO (teaching-learning-based optimization)-based clustering, applied to several data sets. The analysis considers two factors: cluster validation and average quantization error. Dunn's index, the Davies–Bouldin index, the silhouette coefficient, and the C index were used for quantitative evaluation of the clustering results. As anticipated, almost all validity indices give better results for FEKM- and TLBO-based clustering than for K-Means, implying superior cluster formation. Further tests show that FEKM- and TLBO-based clustering yield a smaller quantization error than K-Means.
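To make the two evaluation factors concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes that the average quantization error is the mean Euclidean distance from each point to its nearest center, and that Dunn's index is the smallest distance between points in different clusters divided by the largest cluster diameter; the paper's exact definitions may differ. The function names, the toy data, and the two sets of centers are hypothetical.

```python
import math

def nearest_center(point, centers):
    """Return the index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda k: math.dist(point, centers[k]))

def average_quantization_error(points, centers):
    """Assumed definition: mean distance from each point to its nearest center."""
    total = sum(math.dist(p, centers[nearest_center(p, centers)]) for p in points)
    return total / len(points)

def dunn_index(points, labels):
    """Smallest between-cluster distance divided by the largest cluster diameter.

    Assumes at least two clusters and at least one cluster with two
    distinct points, so that the denominator is non-zero.
    """
    min_between = math.inf
    max_diameter = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            if labels[i] == labels[j]:
                max_diameter = max(max_diameter, d)
            else:
                min_between = min(min_between, d)
    return min_between / max_diameter

# Hypothetical toy data: two well-separated groups of two points each.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
good_centers = [(0.0, 0.5), (5.0, 0.5)]  # one center per group
bad_centers = [(0.0, 0.5), (0.0, 0.6)]   # both centers inside the same group

good_labels = [nearest_center(p, good_centers) for p in points]
bad_labels = [nearest_center(p, bad_centers) for p in points]

good_error = average_quantization_error(points, good_centers)
bad_error = average_quantization_error(points, bad_centers)
good_dunn = dunn_index(points, good_labels)
bad_dunn = dunn_index(points, bad_labels)
```

On this toy data the well-placed centers give a lower quantization error and a higher Dunn's index than the poorly placed ones, which is the direction of improvement the abstract reports for FEKM- and TLBO-based clustering over K-Means. The sketch only scores a fixed set of centers; it does not run any of the three clustering algorithms.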

Keywords

Optimal centroid, Cluster validation, K-Means, FEKM- and TLBO-based clustering

Notes

Acknowledgements

We are extremely thankful to Sagarika Swain, who provided expertise that greatly assisted the work. The authors also express their gratitude to the editors and the anonymous referees for their productive suggestions on the paper.

Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. Department of Computer Science and Engineering, Veer Surendra Sai University of Technology, Burla, India
