Data clustering based on principal curves

  • Elson Claudio Correa Moraes
  • Danton Diego Ferreira
  • Giovani Bernardes Vitor
  • Bruno Henrique Groenner Barbosa
Regular Article


In this contribution we present a new method for data clustering based on principal curves. Principal curves are a nonlinear generalization of principal component analysis and may also be regarded as continuous versions of 1-D self-organizing maps. The proposed method uses the k-segment algorithm to extract the principal curves. It then divides the principal curves into two or more curves, according to the number of clusters defined by the user. Next, the distance between each data point and each generated curve is calculated, and each point is assigned to the curve at the smallest distance. The method was applied to nine databases with different dimensionality and number of classes. The results were compared with three clustering algorithms: the k-means algorithm and the 1-D and 2-D self-organizing map algorithms. Experiments show that the method is suitable for clusters with elongated and spherical shapes, and that on some data sets it achieved significantly better results than the other clustering algorithms used in this work.
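The final assignment step described in the abstract (compute the distance from each point to each curve, then pick the nearest curve) can be illustrated with a minimal sketch. It assumes that each curve is available as a polyline, i.e. an ordered array of vertices, which is a natural output of a segment-based fit but is an assumption here. The function names are hypothetical, and the k-segment extraction and the curve-splitting step are not shown. This is not the authors' implementation.

```python
import numpy as np


def point_to_segment_distance(p, a, b):
    """Euclidean distance from point p to the line segment with endpoints a and b."""
    ab = b - a
    denom = float(np.dot(ab, ab))
    if denom == 0.0:
        # Degenerate segment: both endpoints coincide.
        return float(np.linalg.norm(p - a))
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = np.clip(np.dot(p - a, ab) / denom, 0.0, 1.0)
    return float(np.linalg.norm(p - (a + t * ab)))


def distance_to_curve(p, vertices):
    """Distance from p to a polyline given as an (n_vertices, dim) array (assumed format)."""
    return min(
        point_to_segment_distance(p, vertices[i], vertices[i + 1])
        for i in range(len(vertices) - 1)
    )


def assign_to_nearest_curve(points, curves):
    """Label each point with the index of the closest curve."""
    dists = np.array([[distance_to_curve(p, c) for c in curves] for p in points])
    return dists.argmin(axis=1)


# Toy example: two horizontal polylines standing in for two cluster curves.
curves = [
    np.array([[0.0, 0.0], [1.0, 0.0]]),
    np.array([[0.0, 5.0], [1.0, 5.0]]),
]
points = np.array([[0.5, 0.2], [0.5, 4.0], [2.0, 0.0]])
labels = assign_to_nearest_curve(points, curves)
```

Clamping the projection parameter to the interval [0, 1] keeps the closest point on the segment itself rather than on its infinite extension, which matters for points lying beyond the ends of a curve.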


Principal curves · Clustering · Self-organizing maps · Segments

Mathematics Subject Classification

68Txx Artificial intelligence 




Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. Engineering Department, Federal University of Lavras (UFLA), Lavras, Brazil
  2. Computer Engineering, Federal University of Itajubá, Itabira, Brazil
