Abstract
In this paper, we propose a Robust Discriminant Analysis based on the maximum entropy (MaxEnt) criterion (MaxEnt-RDA), derived from a nonparametric estimate of Renyi's quadratic entropy. MaxEnt-RDA uses entropy both as the objective and as constraints; thus the structural information of the classes is preserved while information loss is minimized. It is a natural extension of LDA from the Gaussian assumption to arbitrary distributions. As in LDA, the optimal solution of MaxEnt-RDA can be obtained by eigen-decomposition, where feature extraction is achieved by designing two Parzen probability matrices that characterize the within-class variation and the between-class variation, respectively. Furthermore, MaxEnt-RDA uses higher-order statistics (entropy) to estimate the probability matrices, which makes it robust to outliers. Experiments on a toy problem, UCI datasets, and face datasets demonstrate the effectiveness of the proposed method in comparison with other state-of-the-art methods.
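The abstract only sketches the method, so the snippet below is a minimal illustrative sketch, not the authors' exact MaxEnt-RDA formulation. It shows (i) a Parzen-window estimate of Renyi's quadratic entropy, H2(X) = -log V(X) with V(X) the information potential, and (ii) feature extraction by a generalized eigen-decomposition of kernel-weighted within-class and between-class scatter matrices in the graph-embedding style. The kernel width `sigma`, the Laplacian-based scatter construction, the regularizer, and all function names are assumptions made for illustration.

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.linalg import eigh

def renyi_quadratic_entropy(X, sigma=1.0):
    """Parzen-window estimate of Renyi's quadratic entropy H2(X) = -log V(X)."""
    n, d = X.shape
    s2 = 2.0 * sigma ** 2                                  # variance of the convolved Gaussian kernel
    K = np.exp(-cdist(X, X, "sqeuclidean") / (2.0 * s2))   # pairwise Gaussian kernel values
    K /= (2.0 * np.pi * s2) ** (d / 2.0)                   # Gaussian density normalization
    V = K.mean()                                           # information potential V(X)
    return -np.log(V)

def maxent_like_projection(X, y, n_components=2, sigma=1.0):
    """Illustrative sketch only: generalized eigen-decomposition of Parzen-kernel-weighted
    within-class vs. between-class scatter matrices (a stand-in, not the paper's objective)."""
    X, y = np.asarray(X, float), np.asarray(y)
    n, d = X.shape
    W = np.exp(-cdist(X, X, "sqeuclidean") / (4.0 * sigma ** 2))  # pairwise kernel weights
    same = (y[:, None] == y[None, :]).astype(float)
    Ww, Wb = W * same, W * (1.0 - same)                    # within-class / between-class weights

    def scatter(A):
        L = np.diag(A.sum(axis=1)) - A                     # graph Laplacian of the weight matrix
        return X.T @ L @ X / n

    Sw, Sb = scatter(Ww), scatter(Wb)
    # Solve Sb v = lambda (Sw + eps I) v and keep the leading eigenvectors.
    evals, evecs = eigh(Sb, Sw + 1e-6 * np.eye(d))
    return evecs[:, np.argsort(evals)[::-1][:n_components]]
```

Per the abstract, the actual criterion would maximize an entropy-based between-class term subject to entropy constraints on the within-class term; the generalized eigenproblem above only mimics the resulting eigen-decomposition step under these assumptions.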
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
He, R., Hu, B.-G., Yuan, X.-T. (2009). Robust Discriminant Analysis Based on Nonparametric Maximum Entropy. In: Zhou, Z.-H., Washio, T. (eds) Advances in Machine Learning. ACML 2009. Lecture Notes in Computer Science, vol. 5828. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-05224-8_11
DOI: https://doi.org/10.1007/978-3-642-05224-8_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-05223-1
Online ISBN: 978-3-642-05224-8
eBook Packages: Computer Science