Abstract
We consider the problem of selecting the subset of features that is most significant for distinguishing two given data sets. The selection criterion to be maximized is the symmetric information distance between the distributions of the feature subset in the two classes. These distributions are estimated in a Bayesian way with uniform priors, and the symmetric information distance is assessed through a lower estimate of the corresponding average risk functional, derived using the Rademacher penalty and inequalities from empirical process theory. The approach was applied to a real example of selecting manufacturing process parameters to predict which of two states the process is in. Only 2 of the 10 parameters proved sufficient to recognize the true state of the process with an error level of 8%. The parameter set was found from 550 independent observations in the training sample; performance of the approach was evaluated on 270 independent observations in the test sample.
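To make the criterion concrete, the following minimal Python sketch (an illustration, not the authors' implementation) scores a feature subset by the symmetric Kullback-Leibler distance between its class-conditional distributions, each estimated from histogram counts with a uniform (Laplace) prior as a simple stand-in for the Bayesian estimate above. The Rademacher-penalty lower estimate used in the paper is omitted here, and all function names are hypothetical.

import numpy as np
from itertools import combinations

def subset_distribution(X, features, n_bins, alpha=1.0):
    # Joint distribution of the chosen features, estimated from
    # histogram counts with a uniform (Laplace) prior of strength alpha.
    cells = np.ravel_multi_index(tuple(X[:, f] for f in features),
                                 dims=(n_bins,) * len(features))
    counts = np.bincount(cells, minlength=n_bins ** len(features)).astype(float)
    return (counts + alpha) / (counts.sum() + alpha * counts.size)

def symmetric_kl(p, q):
    # Symmetric information distance J(p, q) = KL(p||q) + KL(q||p);
    # the uniform prior keeps both estimated distributions strictly positive.
    return float(np.sum((p - q) * np.log(p / q)))

def best_subset(X0, X1, n_bins, size):
    # Exhaustive search for the subset of 'size' features that
    # maximizes the distance between the two class distributions.
    best, best_score = None, -np.inf
    for subset in combinations(range(X0.shape[1]), size):
        score = symmetric_kl(subset_distribution(X0, subset, n_bins),
                             subset_distribution(X1, subset, n_bins))
        if score > best_score:
            best, best_score = subset, score
    return best, best_score

# Toy usage with randomly binned data: 10 features, 4 discrete levels.
rng = np.random.default_rng(0)
X0 = rng.integers(0, 4, size=(550, 10))
X1 = rng.integers(0, 4, size=(270, 10))
print(best_subset(X0, X1, n_bins=4, size=2))

With features discretized into a few bins, the exhaustive search over two-feature subsets mirrors the scale of the application above; for larger subset sizes a greedy or penalized search would replace the combinatorial loop.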
Copyright information
© 2014 Springer International Publishing Switzerland
Cite this paper
Tsurko, V.V., Michalski, A.I. (2014). Feature Selection by Distributions Contrasting. In: Agre, G., Hitzler, P., Krisnadhi, A.A., Kuznetsov, S.O. (eds) Artificial Intelligence: Methodology, Systems, and Applications. AIMSA 2014. Lecture Notes in Computer Science, vol. 8722. Springer, Cham. https://doi.org/10.1007/978-3-319-10554-3_13
DOI: https://doi.org/10.1007/978-3-319-10554-3_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10553-6
Online ISBN: 978-3-319-10554-3