
Sophisticated LVQ Classification Models - Beyond Accuracy Optimization

  • Thomas Villmann
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10087)

Abstract

Learning vector quantization (LVQ) models are among the most successful machine learning classifiers. LVQs are intuitively designed and generally allow an easy interpretation according to the class-dependent prototype principle. Originally, LVQs try to optimize the classification accuracy during adaptation, which can be misleading in the case of imbalanced data. Further, an application may require that other statistical classification evaluation measures be considered, e.g. sensitivity and specificity, as frequently demanded in bio-medical applications. In this article we present recent approaches for modifying LVQ to integrate those sophisticated evaluation measures as objectives to be optimized. In particular, we show that all differentiable functions built from contingency tables can be incorporated into an LVQ scheme, as can receiver operating characteristic curve optimization.
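
As a rough illustration of the idea stated in the abstract, the sketch below builds a smoothed contingency table from a GLVQ-style classifier function and evaluates an F-measure on it. The sigmoid relaxation, the steepness parameter `theta`, the function names and the toy data are all assumptions chosen for illustration; they are not taken from the paper and may differ from the author's formulation.

```python
import numpy as np

def glvq_mu(X, prototypes, proto_labels, y):
    """GLVQ-style classifier function mu = (d_plus - d_minus) / (d_plus + d_minus).

    d_plus  : squared distance to the closest prototype with the correct label
    d_minus : squared distance to the closest prototype with a wrong label
    mu lies in [-1, 1] and is negative for correctly classified samples.
    """
    # Pairwise squared Euclidean distances, shape (n_samples, n_prototypes)
    d = ((X[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
    same = proto_labels[None, :] == y[:, None]
    d_plus = np.where(same, d, np.inf).min(axis=1)
    d_minus = np.where(~same, d, np.inf).min(axis=1)
    return (d_plus - d_minus) / (d_plus + d_minus)

def soft_contingency(mu, y, positive=1, theta=10.0):
    """Smoothed TP, FN, FP, TN counts (assumed sigmoid relaxation)."""
    # Soft indicator of a correct decision: close to 1 when mu < 0
    correct = 1.0 / (1.0 + np.exp(theta * mu))
    pos = (y == positive)
    tp = correct[pos].sum()
    fn = (1.0 - correct[pos]).sum()
    tn = correct[~pos].sum()
    fp = (1.0 - correct[~pos]).sum()
    return tp, fn, fp, tn

def soft_f_measure(tp, fn, fp, beta=1.0):
    """F_beta computed from (soft) contingency table entries."""
    b2 = beta ** 2
    return (1.0 + b2) * tp / ((1.0 + b2) * tp + b2 * fn + fp)

# Hypothetical toy data: two classes, one prototype per class
X = np.array([[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [0.9, 1.0]])
y = np.array([0, 0, 1, 1])
prototypes = np.array([[0.0, 0.0], [1.0, 1.0]])
proto_labels = np.array([0, 1])

mu = glvq_mu(X, prototypes, proto_labels, y)
tp, fn, fp, tn = soft_contingency(mu, y)
f1 = soft_f_measure(tp, fn, fp)
print(tp, fn, fp, tn, f1)
```

Because every step is a smooth function of the prototype positions, a quantity such as `soft_f_measure` could in principle be maximized by gradient ascent on the prototypes in place of the usual accuracy-oriented cost. The gradient step itself is omitted here.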

Notes

Acknowledgement

The author thanks Marika Kaden (University of Applied Sciences Mittweida) for the numerical simulations and helpful discussions, as well as Michael Biehl (University of Groningen) for stimulating discussions regarding the ROC and AUC interpretation of classifiers with continuous discriminant functions for machine learning approaches.


Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. Computational Intelligence Group, University of Applied Sciences Mittweida, Mittweida, Germany
