Abstract
The problem of selecting the most informative features is reduced to an optimization problem for the average risk functional, whose maximization is equivalent to maximizing the informational distance between the feature distributions in two classes. We consider a procedure for maximizing the average risk functional via the empirical risk, bounding the divergence between the two by the Rademacher complexity. The proposed method has been applied efficiently to selecting the parameters that are important for separating the states of technological processes. An experimental comparison of the developed approach with other widely known feature selection techniques is presented.
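The paper's selection criterion ranks features by the informational distance between their class-conditional distributions, estimated from empirical data. The following is a minimal sketch of that idea, not the authors' exact functional: it scores each feature by a plug-in symmetrized Kullback–Leibler (J-)divergence between smoothed class histograms and ranks features by that score. The function names, bin count, and smoothing choice are illustrative assumptions.

```python
import numpy as np

def empirical_divergence(x0, x1, bins=20):
    """Symmetrized empirical Kullback-Leibler (J-)divergence between the
    histograms of one feature in two classes (a crude plug-in estimate)."""
    lo = min(x0.min(), x1.min())
    hi = max(x0.max(), x1.max())
    p, _ = np.histogram(x0, bins=bins, range=(lo, hi))
    q, _ = np.histogram(x1, bins=bins, range=(lo, hi))
    # Laplace smoothing keeps the estimate finite on empty bins.
    p = (p + 1) / (p.sum() + bins)
    q = (q + 1) / (q.sum() + bins)
    return float(np.sum((p - q) * np.log(p / q)))

def rank_features(X0, X1):
    """Rank feature indices by decreasing empirical divergence between the
    two classes; X0 and X1 are (samples x features) arrays, one per class."""
    scores = [empirical_divergence(X0[:, j], X1[:, j])
              for j in range(X0.shape[1])]
    order = sorted(range(X0.shape[1]), key=lambda j: -scores[j])
    return order, scores
```

On synthetic data where only the first feature's mean differs between the classes, that feature receives the largest divergence score and is ranked first. The plug-in histogram estimate is biased on small samples, which is precisely why the paper controls the gap between empirical and average risk via Rademacher complexity.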
Additional information
Original Russian Text © V.V. Tsurko, A.I. Michalski, 2016, published in Avtomatika i Telemekhanika, 2016, No. 12, pp. 136–154.
This paper was recommended for publication by L.A. Mironovskii, a member of the Editorial Board.
About this article
Cite this article
Tsurko, V.V., Michalski, A.I. The contrast features selection with empirical data. Autom Remote Control 77, 2212–2226 (2016). https://doi.org/10.1134/S0005117916120109