Abstract
Novel criteria for supervised dimensionality reduction are proposed that reformulate the Quadratic Mutual Information (QMI) in the spirit of Fisher's Discriminant Analysis. The proposed method relies on a quadratic divergence measure and requires no prior assumptions about the class densities. The criteria are optimized by gradient ascent, initialized with either random or LDA-based projections. Experiments on several datasets show that the proposed approach outperforms the standard QMI criterion.
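To make the pipeline the abstract describes concrete, the following is a minimal sketch, not the authors' exact criterion, of QMI-maximizing linear projection in the spirit of Torkkola's non-parametric mutual information maximization: project the data with a matrix W, estimate the quadratic mutual information between the projections and the class labels with Gaussian Parzen windows, and maximize it by gradient ascent. The kernel width `sigma`, the step size `lr`, and the finite-difference gradient are illustrative assumptions, not values or derivations from the paper.

```python
# Hedged sketch of QMI-based supervised dimensionality reduction.
# All hyperparameters below are illustrative, not taken from the paper.
import numpy as np

def gaussian_affinity(Y, sigma):
    """Pairwise Gaussian kernel G(y_i - y_j; 2*sigma^2 I)."""
    d2 = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    dim = Y.shape[1]
    norm = (4 * np.pi * sigma ** 2) ** (dim / 2)
    return np.exp(-d2 / (4 * sigma ** 2)) / norm

def qmi(W, X, labels, sigma=1.0):
    """Quadratic mutual information I_T = V_in + V_all - 2*V_btw
    between the projected samples X @ W and the class labels."""
    Y = X @ W
    G = gaussian_affinity(Y, sigma)
    N = len(X)
    total = G.sum()
    v_in = v_all = v_btw = 0.0
    for c in np.unique(labels):
        mask = labels == c
        p_c = mask.sum() / N            # class prior estimate
        v_in += G[np.ix_(mask, mask)].sum()
        v_all += p_c ** 2 * total
        v_btw += p_c * G[mask, :].sum()
    return (v_in + v_all - 2 * v_btw) / N ** 2

def fit(X, labels, d, iters=200, lr=0.5, sigma=1.0, eps=1e-5, seed=0):
    """Gradient ascent on QMI from a random initial projection.
    (The paper also initializes from LDA; a finite-difference gradient
    stands in here for an analytic one to keep the sketch short.)"""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], d))
    W /= np.linalg.norm(W, axis=0)      # unit-norm columns
    for _ in range(iters):
        base = qmi(W, X, labels, sigma)
        grad = np.zeros_like(W)
        for i in range(W.shape[0]):
            for j in range(W.shape[1]):
                Wp = W.copy()
                Wp[i, j] += eps
                grad[i, j] = (qmi(Wp, X, labels, sigma) - base) / eps
        W += lr * grad
        W /= np.linalg.norm(W, axis=0)  # re-normalize after each step
    return W
```

A typical call would be `Y = X @ fit(X, labels, d=2)` to obtain a two-dimensional embedding. In practice one would use the analytic QMI gradient rather than finite differences, and, as the abstract notes, compare random and LDA-based initializations of W.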
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Gavriilidis, V., Tefas, A. (2012). Exploiting Quadratic Mutual Information for Discriminant Analysis. In: Maglogiannis, I., Plagianakos, V., Vlahavas, I. (eds.) Artificial Intelligence: Theories and Applications. SETN 2012. Lecture Notes in Computer Science, vol. 7297. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30448-4_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30447-7
Online ISBN: 978-3-642-30448-4