Abstract
In this paper, we introduce a new hybrid filter-wrapper method for supervised feature selection that combines Laplacian Score ranking with a wrapper strategy. We first rank the features with the Laplacian Score to reduce the search space, and then use this ranking to search for the best feature subset. We compare our method against other ranking-based feature selection methods, namely Information Gain Attribute Ranking, Relief, and Correlation-based Feature Selection, and we additionally include a Wrapper Subset Evaluation method in the comparison. Empirical results over ten real-world datasets from the UCI repository show that our hybrid method is competitive and, in most cases, outperforms the other feature selection methods used in our experiments.
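The two-stage procedure summarized above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a heat-kernel kNN affinity graph for the Laplacian Score (as in He et al., 2006) and a 3-NN classifier with 5-fold cross-validation as the wrapper; the paper's exact graph construction, classifier, and search strategy may differ.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier, kneighbors_graph

def laplacian_score(X, k=5, t=1.0):
    """Laplacian Score per feature (He et al., 2006); lower is better."""
    # kNN graph with heat-kernel weights on the pairwise distances
    W = kneighbors_graph(X, n_neighbors=k, mode='distance').toarray()
    W = np.where(W > 0, np.exp(-W ** 2 / t), 0.0)
    W = np.maximum(W, W.T)                      # symmetrize the affinity matrix
    d = W.sum(axis=1)                           # degrees
    D = np.diag(d)
    L = D - W                                   # graph Laplacian
    scores = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        f = X[:, j]
        f_tilde = f - (f @ d) / d.sum()         # remove the degree-weighted mean
        denom = f_tilde @ D @ f_tilde
        scores[j] = (f_tilde @ L @ f_tilde) / denom if denom > 0 else np.inf
    return scores

def hybrid_select(X, y, k=5):
    """Filter stage: rank by Laplacian Score. Wrapper stage: evaluate
    prefixes of the ranking with a cross-validated classifier."""
    order = np.argsort(laplacian_score(X, k))   # best-ranked features first
    best_acc, best_subset = -np.inf, order[:1]
    for m in range(1, len(order) + 1):
        subset = order[:m]
        acc = cross_val_score(KNeighborsClassifier(3),
                              X[:, subset], y, cv=5).mean()
        if acc > best_acc:
            best_acc, best_subset = acc, subset
    return best_subset, best_acc

X, y = load_iris(return_X_y=True)
subset, acc = hybrid_select(X, y)
```

Because the wrapper only evaluates prefixes of the filter ranking, it runs at most n subset evaluations for n features, rather than searching the full 2^n subset lattice.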
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
Cite this paper
Solorio-Fernández, S., Carrasco-Ochoa, J.A., Martínez-Trinidad, J.F. (2010). Hybrid Feature Selection Method for Supervised Classification Based on Laplacian Score Ranking. In: Martínez-Trinidad, J.F., Carrasco-Ochoa, J.A., Kittler, J. (eds) Advances in Pattern Recognition. MCPR 2010. Lecture Notes in Computer Science, vol 6256. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15992-3_28
Print ISBN: 978-3-642-15991-6
Online ISBN: 978-3-642-15992-3