Abstract
In the digital era, collecting data about technological processes has become increasingly cheap and easy. The sheer volume of the resulting data, however, makes supervised classification one of the most challenging tasks in artificial intelligence. Feature selection mitigates this problem by removing irrelevant and redundant features from the data. In this paper we propose a new feature selection algorithm, Swcfs, which performs well on high-dimensional and noisy data. Swcfs detects noisy features by applying the sliding window method to consecutive features ranked according to their non-linear correlation with the class feature. The metric Swcfs uses to evaluate feature sets, with respect to their relevance to the class label, is the Bayesian risk, which represents the theoretical upper error bound of deterministic classification. Experiments show that Swcfs is more accurate than most state-of-the-art feature selection algorithms.
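The abstract describes the overall scheme but not the exact window mechanics, so the following is only a minimal sketch of the idea: features are assumed to be pre-ranked by their correlation with the class, a fixed-size window slides over that ranking, and a feature is kept only if it lowers an empirical estimate of the Bayesian risk (the error of the best deterministic rule on the observed data). The function names `empirical_bayes_risk` and `sliding_window_select` and the greedy acceptance rule are hypothetical, not the authors' actual procedure.

```python
from collections import Counter

def empirical_bayes_risk(cols, y):
    # For each distinct feature-value pattern, the best deterministic
    # rule predicts the majority class; the risk is the fraction of
    # samples that rule still misclassifies.
    groups = {}
    for pattern, label in zip(zip(*cols), y):
        groups.setdefault(pattern, []).append(label)
    errors = sum(len(labels) - Counter(labels).most_common(1)[0][1]
                 for labels in groups.values())
    return errors / len(y)

def sliding_window_select(cols, y, ranked, w=3):
    # Scan the ranked feature list one window at a time; keep a
    # feature only if adding it strictly lowers the empirical risk,
    # so noisy features that do not help are discarded.
    selected = []
    risk = 1 - Counter(y).most_common(1)[0][1] / len(y)  # majority-class error
    for start in range(0, len(ranked), w):
        for f in ranked[start:start + w]:
            candidate = [cols[j] for j in selected + [f]]
            r = empirical_bayes_risk(candidate, y)
            if r < risk:
                selected.append(f)
                risk = r
    return selected

# Toy example: feature 0 determines the class exactly; feature 1 is noise.
y = [0, 1, 0, 1, 1, 0]
cols = [[0, 1, 0, 1, 1, 0],   # perfectly predictive
        [1, 1, 0, 0, 1, 0]]   # noisy
print(sliding_window_select(cols, y, ranked=[0, 1]))  # -> [0]
```

In this sketch the noisy feature is rejected because the risk is already zero once the predictive feature is selected, which illustrates the filtering role the window-based scan plays in the paper.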
Copyright information
© 2017 Springer International Publishing AG
Cite this paper
Pino Angulo, A., Shin, K. (2017). Improving Classification Accuracy by Means of the Sliding Window Method in Consistency-Based Feature Selection. In: Yamamoto, A., Kida, T., Uno, T., Kuboyama, T. (eds) Discovery Science. DS 2017. Lecture Notes in Computer Science(), vol 10558. Springer, Cham. https://doi.org/10.1007/978-3-319-67786-6_12
DOI: https://doi.org/10.1007/978-3-319-67786-6_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67785-9
Online ISBN: 978-3-319-67786-6
eBook Packages: Computer Science (R0)