Abstract
This chapter addresses the usability of the correntropy-based similarity measure in statistical data classification. Its central theme is a comparison of the correntropic loss function with the conventional quadratic loss function. In addition, the non-convexity of the correntropic loss function is taken into account in the design of new classification methods, which incorporate the correntropic loss via convolution smoothing and simulated annealing optimization. Two nonparametric classification methods based on the correntropic loss function are proposed and compared with conventional parametric and nonparametric methods. In particular, the classification performance of the proposed artificial neural network-based methods is compared not only with their conventional counterparts but also with kernel-based soft-margin support vector machines. Monte Carlo simulation studies demonstrate the validity of the proposed methods for data classification.
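The contrast between the two losses, and the convolution-smoothing idea used to cope with non-convexity, can be sketched as follows. This is a minimal illustration, not the chapter's implementation: the kernel width `sigma`, the smoothing dispersion `beta`, and the Monte Carlo sample count are assumed values chosen for demonstration, and the correntropic (C-)loss is written in the common Gaussian-kernel form 1 − exp(−e²/2σ²).

```python
import math
import random

def quadratic_loss(e):
    """Conventional squared-error loss: unbounded, so outliers dominate."""
    return e ** 2

def correntropic_loss(e, sigma=1.0):
    """Correntropy-induced loss with a Gaussian kernel of width sigma.
    Bounded in [0, 1): large errors saturate instead of dominating,
    which gives robustness but makes the loss non-convex in e."""
    return 1.0 - math.exp(-e ** 2 / (2.0 * sigma ** 2))

def smoothed_loss(loss, e, beta=0.5, n_samples=2000, seed=0):
    """Convolution smoothing (illustrative): average the loss over
    Gaussian perturbations of the error, approximating the convolution
    of the loss with a Gaussian of dispersion beta. The smoothed
    surrogate is easier to minimize than the raw non-convex loss."""
    rng = random.Random(seed)
    return sum(loss(e + beta * rng.gauss(0.0, 1.0))
               for _ in range(n_samples)) / n_samples

# Small errors: both losses are similar; large errors: the C-loss saturates.
for e in (0.1, 0.5, 1.0, 5.0):
    print(f"e={e}: quadratic={quadratic_loss(e):.4f}, "
          f"correntropic={correntropic_loss(e):.4f}, "
          f"smoothed C-loss={smoothed_loss(correntropic_loss, e):.4f}")
```

The saturation of the C-loss for large errors is what makes it robust to outliers, while the smoothed surrogate trades a small bias at the minimum for a better-behaved optimization landscape.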
Acknowledgements
This work is partially supported by DTRA and NSF grants.
Copyright information
© 2012 Springer Science+Business Media New York
Cite this paper
Syed, M.N., Principe, J.C., Pardalos, P.M. (2012). Correntropy in Data Classification. In: Sorokin, A., Murphey, R., Thai, M., Pardalos, P. (eds) Dynamics of Information Systems: Mathematical Foundations. Springer Proceedings in Mathematics & Statistics, vol 20. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-3906-6_5
Print ISBN: 978-1-4614-3905-9
Online ISBN: 978-1-4614-3906-6
eBook Packages: Mathematics and Statistics (R0)