Clustering of Incomplete Data and Evaluation of Clustering Quality

  • Vladimir V. Ryazanov
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7441)


Two approaches to clustering data with gaps (missing attribute values) for a specified number of clusters are considered. The first approach restores the values of the unknown attributes and then clusters the resulting complete data. The second approach solves a finite set of clustering tasks, each on a complete sample description that corresponds to the incomplete data, and then constructs a collective decision from the results. For both approaches, clustering quality criteria are proposed as functions of the incomplete descriptions. Results of practical experiments are discussed.
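The two approaches can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: gaps are encoded as `None`, restoration is done by column mean, the finite set of completions is built by filling gaps with the column minimum, mean, and maximum, the clustering routine is plain k-means, and the collective decision is a per-object majority vote after aligning labels. The function names are hypothetical, and every column is assumed to contain at least one observed value.

```python
from collections import Counter
from itertools import permutations


def mean(values):
    return sum(values) / len(values)


def fill_gaps(rows, pick):
    """Replace each None with pick(observed values of the same column)."""
    cols = list(zip(*rows))
    fills = [pick([v for v in col if v is not None]) for col in cols]
    return [[fills[j] if v is None else v for j, v in enumerate(row)]
            for row in rows]


def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means with a deterministic farthest-point start."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(dist2(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist2(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [mean(col) for col in zip(*members)]
    return labels


def align(labels, reference, k):
    """Relabel `labels` with the permutation that best matches `reference`."""
    best = max(permutations(range(k)),
               key=lambda perm: sum(perm[l] == r
                                    for l, r in zip(labels, reference)))
    return [best[l] for l in labels]


def cluster_restored(rows, k):
    """Approach 1 (sketch): restore the gaps, then cluster the complete data."""
    return kmeans(fill_gaps(rows, mean), k)


def cluster_collective(rows, k):
    """Approach 2 (sketch): cluster several completions, then vote per object."""
    runs = [kmeans(fill_gaps(rows, pick), k) for pick in (min, mean, max)]
    aligned = [runs[0]] + [align(run, runs[0], k) for run in runs[1:]]
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*aligned)]


# Toy data (invented for this sketch): two groups, one gap in each.
rows = [
    [0.0, 0.1], [0.2, None], [0.1, 0.0],
    [5.0, 5.1], [None, 5.0], [5.2, 4.9],
]
restored_labels = cluster_restored(rows, 2)
collective_labels = cluster_collective(rows, 2)
```

On this toy data the minimum and maximum completions each misplace one object with a gap, while the vote across the three completions recovers the two groups. The permutation-based label alignment is only practical for a small number of clusters.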


clustering, missing data, gaps, clustering estimation



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Vladimir V. Ryazanov, Institution of Russian Academy of Sciences Dorodnicyn Computing Centre of RAS, Moscow, Russia
