
Clustering of Economic Data with Modified K-Mean Technique

  • Trung T. Pham
Conference paper
Part of the Lecture Notes in Networks and Systems book series (LNNS, volume 69)

Abstract

This paper presents a modified K-Mean technique for clustering data that are not concentrated around a single center point. When clusters are elongated, the traditional K-Mean technique, which represents each cluster by a point, cannot yield meaningful results. Allowing a center to be a line segment makes it possible to extract elongated clusters for analysis. To support this, the distance function is modified to measure the distance between a point and a set (here, a line segment). The modified technique extends to multidimensional data, where the center takes the form of a hyperplane, and the clusters of data situated around the hyperplane can be extracted and described by a regression model. The technique is applied to economic data from Chile, where the clusters have irregular shapes and where data sets are commonly represented by regression models.
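
The abstract does not give the formulas, so the following is only a minimal sketch of the idea it describes: a K-Mean-style loop in which each center is a line segment and the distance is measured from a point to that segment. The function names (`point_segment_distance`, `assign`, `refit_segment`, `segment_kmeans`) are hypothetical, and the update step, which refits each segment along the principal axis of its cluster, is an assumption rather than the author's stated method.

```python
import numpy as np

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment with endpoints a and b."""
    p, a, b = (np.asarray(v, dtype=float) for v in (p, a, b))
    ab = b - a
    denom = ab @ ab
    if denom == 0.0:
        # Degenerate segment: it behaves like an ordinary point center.
        return float(np.linalg.norm(p - a))
    # Clamp the projection parameter to [0, 1] so the closest point
    # stays on the segment instead of on the infinite line.
    t = np.clip((p - a) @ ab / denom, 0.0, 1.0)
    return float(np.linalg.norm(p - (a + t * ab)))

def assign(points, segments):
    """Label each point with the index of its nearest segment center."""
    return np.array([
        int(np.argmin([point_segment_distance(p, a, b) for a, b in segments]))
        for p in points
    ])

def refit_segment(cluster):
    """Assumed update rule: a segment along the cluster's principal axis."""
    cluster = np.asarray(cluster, dtype=float)
    mean = cluster.mean(axis=0)
    # The first right-singular vector is the direction of largest variance.
    _, _, vt = np.linalg.svd(cluster - mean, full_matrices=False)
    direction = vt[0]
    t = (cluster - mean) @ direction
    # Endpoints are the extreme projections of the cluster onto that axis.
    return mean + t.min() * direction, mean + t.max() * direction

def segment_kmeans(points, segments, n_iter=20):
    """Alternate assignment and segment refitting for a fixed number of passes."""
    points = np.asarray(points, dtype=float)
    for _ in range(n_iter):
        labels = assign(points, segments)
        segments = [
            refit_segment(points[labels == k])
            if np.sum(labels == k) >= 2 else segments[k]  # keep old center if too few points
            for k in range(len(segments))
        ]
    return assign(points, segments), segments
```

Called as `labels, segments = segment_kmeans(points, initial_segments)`, where `initial_segments` is a list of endpoint pairs, this returns one label per point and the fitted endpoints. Replacing the segment with a hyperplane would change only the distance and refit functions; the alternating structure stays the same.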

Keywords

Clustering technique; Modified K-mean; Line segment center; Distance function; Regression model

Notes

Acknowledgment

Part of this study was supported by the Chilean R&D Agency CONICYT, under the research grant FONDEF IT15I10042 for the duration of 2016–2018. Economic data used in this paper were obtained from the Central Bank of Chile.


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Universidad de Talca, Talca, Chile
