Correntropy in Data Classification

  • Conference paper
  • In: Dynamics of Information Systems: Mathematical Foundations
  • Part of the book series: Springer Proceedings in Mathematics & Statistics (PROMS, volume 20)

Abstract

In this chapter, the usability of the correntropy-based similarity measure in the paradigm of statistical data classification is addressed. The central theme of the chapter is a comparison of the performance of the correntropic loss function with that of the conventional quadratic loss function. Moreover, the issues arising from the non-convexity of the correntropic loss function are considered in the design of new classification methods. The proposed methods incorporate the correntropic loss function via convolution smoothing and simulated annealing optimization algorithms. Two nonparametric classification methods based on the correntropic loss function are proposed and compared with conventional parametric and nonparametric methods. Specifically, the classification performance of the proposed artificial neural network-based methods is compared not only with that of their conventional counterparts but also with kernel-based soft-margin support vector machines. Experimental studies with Monte Carlo simulations demonstrate the validity of the proposed methods in data classification.
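
To make the comparison concrete, below is a minimal sketch of the bounded correntropy-induced loss (C-loss) against the quadratic loss, following the form given by Singh and Principe [29]; the function names and the kernel width sigma are illustrative choices, not the chapter's exact formulation.

```python
import numpy as np

def quadratic_loss(e):
    """Conventional quadratic (squared-error) loss: unbounded in e."""
    return e ** 2

def correntropic_loss(e, sigma=1.0):
    """Correntropy-induced loss (C-loss), a sketch after [22, 29]:
    L(e) = beta * (1 - exp(-e^2 / (2 * sigma^2))),
    with beta chosen so that L(1) = 1. The loss is bounded, hence
    robust to outliers, but non-convex in e."""
    beta = 1.0 / (1.0 - np.exp(-1.0 / (2.0 * sigma ** 2)))
    return beta * (1.0 - np.exp(-e ** 2 / (2.0 * sigma ** 2)))

# Large errors (e.g., from outliers or mislabeled points) dominate the
# quadratic loss but saturate under the correntropic loss:
errors = np.array([0.1, 0.5, 1.0, 3.0, 10.0])
print(quadratic_loss(errors))     # grows without bound: [0.01 ... 100.]
print(correntropic_loss(errors))  # saturates near beta ~ 2.54 for sigma = 1
```

The boundedness is what buys robustness to outliers, and the same saturation is what makes the loss non-convex, which is why the chapter turns to convolution smoothing and simulated annealing rather than plain gradient-based training.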

Notes

  1. This function has its roots in the correntropy function (see [22] for more details).

  2. A general approach for solving non-convex problems via convolution smoothing was proposed by Styblinski and Tang [30] in 1990; a minimal sketch of the idea follows these notes.

  3. http://archive.ics.uci.edu/ml/.
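
As promised in Note 2, here is a minimal sketch of convolution smoothing in the spirit of [26, 30]. Everything below (the toy objective, the annealing schedule betas, and the Monte Carlo gradient estimator in smoothed_grad) is a hypothetical illustration of the idea, not the algorithm proposed in the chapter.

```python
import numpy as np

rng = np.random.default_rng(0)

def smoothed_grad(f, x, beta, n_samples=100):
    """Monte Carlo gradient of the convolution-smoothed functional
    f_beta(x) = E[f(x + beta * eta)], eta ~ N(0, I), via the antithetic
    Gaussian-smoothing estimator (a sketch in the spirit of [26, 30])."""
    eta = rng.standard_normal((n_samples, x.size))
    diffs = np.array([f(x + beta * e) - f(x - beta * e) for e in eta])
    return (eta * diffs[:, None]).mean(axis=0) / (2.0 * beta)

def smoothed_minimize(f, x0, betas=(1.0, 0.5, 0.1, 0.05), steps=200, lr=0.05):
    """Gradient descent on the smoothed surrogate while shrinking the
    smoothing width: early stages see a near-convex blur of f, later
    stages refine within the basin reached so far."""
    x = np.asarray(x0, dtype=float)
    for beta in betas:
        for _ in range(steps):
            x -= lr * smoothed_grad(f, x, beta)
    return x

# Toy non-convex objective: cosine ripples create many local minima,
# with the global minimum at the origin.
f = lambda x: float(np.sum(x ** 2 + 2.0 * (1.0 - np.cos(3.0 * x))))
print(smoothed_minimize(f, x0=[2.0, -1.5]))  # lands near [0, 0]
```

Plain gradient descent from the same start would stall in a local minimum near the starting point; smoothing heavily first and then annealing the width toward zero lets the iterate cross the ripples, which is the same role simulated annealing plays for the correntropic loss in the chapter.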

References

  1. Alizamir, S., Rebennack, S., Pardalos, P.M.: Improving the neighborhood selection strategy in simulated annealing using the optimal stopping problem. In: Tan, C.M. (ed.) Simulated Annealing, pp. 63–382. Springer, New York (2008)

  2. Anthony, M., Bartlett, P.L.: Neural Network Learning: Theoretical Foundations. Cambridge University Press, UK (2009)

  3. Antonov, G.E., Katkovnik, V.J.: Generalization of the concept of statistical gradient. Avtomat. i Vycisl. Tehn. (Riga) 4, 25–30 (1972)

  4. Bazaraa, M.S., Sherali, H.D., Shetty, C.M.: Nonlinear Programming: Theory and Algorithms. Wiley, New York (2006)

  5. Boser, B.E., Guyon, I.M., Vapnik, V.N.: A training algorithm for optimal margin classifiers. In: Proceedings of the Fifth Annual Workshop on Computational Learning Theory, pp. 144–152. ACM (1992)

  6. Catoni, O.: Metropolis, simulated annealing and IET algorithms: Theory and experiments. J. Complex. 12, 595–623 (1996)

  7. Cortes, C., Vapnik, V.: Support-vector networks. Mach. Learn. 20(3), 273–297 (1995)

  8. Fan, R.E., Chen, P.H., Lin, C.J.: Working set selection using second order information for training support vector machines. J. Mach. Learn. Res. 6, 1889–1918 (2005)

  9. Gunn, S.R.: Support vector machines for classification and regression. ISIS Technical Report 14 (1998)

  10. Heisele, B., Ho, P., Poggio, T.: Face recognition with support vector machines: Global versus component-based approach. In: Proceedings of the Eighth IEEE International Conference on Computer Vision (ICCV 2001), vol. 2, pp. 688–694. IEEE (2001)

  11. Hornik, K., Stinchcombe, M., White, H.: Multilayer feedforward networks are universal approximators. Neural Netw. 2(5), 359–366 (1989)

  12. Kim, K.I., Jung, K., Park, S.H., Kim, H.J.: Support vector machines for texture classification. IEEE Trans. Pattern Anal. Mach. Intell. 24(11), 1542–1550 (2002)

  13. Kirkpatrick, S., Gelatt, C.D., Vecchi, M.P.: Optimization by simulated annealing. Science 220(4598), 671 (1983)

  14. Lundy, M., Mees, A.: Convergence of an annealing algorithm. Math. Program. 34(1), 111–124 (1986)

  15. McCulloch, W.S., Pitts, W.: A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biol. 5(4), 115–133 (1943)

  16. Mehrotra, K., Mohan, C.K., Ranka, S.: Elements of Artificial Neural Networks. MIT Press, Cambridge (1997)

  17. Michalewicz, Z., Fogel, D.B.: How to Solve It: Modern Heuristics. Springer, New York (2004)

  18. Michie, D., Spiegelhalter, D.J., Taylor, C.C.: Machine Learning, Neural and Statistical Classification. Ellis Horwood, New York (1994)

  19. Minsky, M., Papert, S.: Perceptrons. MIT Press, Cambridge (1969)

  20. Pardalos, P., Pitsoulis, L., Mavridou, T., Resende, M.: Parallel search for combinatorial optimization: Genetic algorithms, simulated annealing, tabu search and GRASP. In: Parallel Algorithms for Irregularly Structured Problems, pp. 317–331. Wiley, Hoboken (1995)

  21. Pardalos, P.M., Boginski, V.L., Vazacopoulos, A.: Data Mining in Biomedicine. Springer, New York (2007)

  22. Principe, J.C.: Information Theoretic Learning: Renyi’s Entropy and Kernel Perspectives. Springer, New York (2010)

  23. Reeves, C.R.: Modern Heuristic Techniques for Combinatorial Problems. Wiley, New York (1993)

  24. Robbins, H., Monro, S.: A stochastic approximation method. Ann. Math. Stat. 22, 400–407 (1951)

  25. Rosenblatt, F.: The perceptron: A probabilistic model for information storage and organization in the brain. Psychol. Rev. 65(6), 386 (1958)

  26. Rubinstein, R.Y.: Smoothed functionals in stochastic optimization. Math. Oper. Res., 26–33 (1983)

  27. Santamaría, I., Pokharel, P.P., Principe, J.C.: Generalized correlation function: Definition, properties, and application to blind equalization. IEEE Trans. Signal Process. 54(6), 2187–2197 (2006)

  28. Schölkopf, B., Burges, C., Vapnik, V.: Extracting support data for a given task. In: Proceedings of the First International Conference on Knowledge Discovery & Data Mining, pp. 252–257. AAAI Press, Menlo Park (1995)

  29. Singh, A., Principe, J.C.: A loss function for classification based on a robust similarity metric. In: The 2010 International Joint Conference on Neural Networks (IJCNN), pp. 1–6. IEEE (2010)

  30. Styblinski, M.A., Tang, T.S.: Experiments in nonconvex optimization: Stochastic approximation with function smoothing and simulated annealing. Neural Netw. 3(4), 467–483 (1990)

  31. Syed, M.N., Pardalos, P.M.: Neural network models in combinatorial optimization. In: Handbook of Combinatorial Optimization (in press)

  32. Tong, S., Koller, D.: Support vector machine active learning with applications to text classification. J. Mach. Learn. Res. 2, 45–66 (2002)

  33. Vapnik, V., Golowich, S.E., Smola, A.: Support vector method for function approximation, regression estimation, and signal processing. Adv. Neural Inf. Process. Syst. 9, 281–287 (1996)

  34. Vapnik, V.N.: An overview of statistical learning theory. IEEE Trans. Neural Netw. 10(5), 988–999 (1999)

  35. Vapnik, V.N.: The Nature of Statistical Learning Theory. Springer, New York (2000)

  36. Weston, J., Watkins, C.: Multi-class support vector machines. Technical Report CSD-TR-98-04, Department of Computer Science, Royal Holloway, University of London (1998)

  37. Zhang, J., Xanthopoulos, P., Chien, J., Tomaino, V., Pardalos, P.M.: Minimum prediction error models and causal relations between multiple time series. In: Cochran, J.J. (ed.) Wiley Encyclopedia of Operations Research and Management Science 3, 1843–1850 (2011)

Acknowledgements

This work is partially supported by DTRA and NSF grants.

Author information

Correspondence to Mujahid N. Syed.

Copyright information

© 2012 Springer Science+Business Media New York

About this paper

Cite this paper

Syed, M.N., Principe, J.C., Pardalos, P.M. (2012). Correntropy in Data Classification. In: Sorokin, A., Murphey, R., Thai, M., Pardalos, P. (eds) Dynamics of Information Systems: Mathematical Foundations. Springer Proceedings in Mathematics & Statistics, vol 20. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-3906-6_5
