Feature Selection in Hierarchical Feature Spaces

Ristoski, Petar; Paulheim, Heiko

doi:10.1007/978-3-319-11812-3_25

Petar Ristoski²¹ &
Heiko Paulheim²¹

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 8777))

Included in the following conference series:

International Conference on Discovery Science

2038 Accesses
30 Citations

Abstract

Feature selection is an important preprocessing step in data mining, which has an impact on both the runtime and the result quality of the subsequent processing steps. While there are many cases where hierarchic relations between features exist, most existing feature selection approaches are not capable of exploiting those relations. In this paper, we introduce a method for feature selection in hierarchical feature spaces. The method first eliminates redundant features along paths in the hierarchy, and further prunes the resulting feature set based on the features’ relevance. We show that our method yields a good trade-off between feature space compression and classification accuracy, and outperforms both standard approaches as well as other approaches which also exploit hierarchies.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

References

Blum, A.L., Langley, P.: Selection of relevant features and examples in machine learning. Artificial Intelligence 97, 245–271 (1997)
Article MathSciNet MATH Google Scholar
Dash, M., Liu, H.: Feature selection for classification. Intelligent Data Analysis 1(3), 131–156 (1997)
Article Google Scholar
Jeong, Y., Myaeng, S.-H.: Feature selection using a semantic hierarchy for event recognition and type classification. In: International Joint Conference on Natural Language Processing (2013)
Google Scholar
John, G.H., Kohavi, R., Pfleger, K.: Irrelevant features and the subset selection problem. In: ICML 1994, pp. 121–129 (1994)
Google Scholar
Lehmann, J., Isele, R., Jakob, M., Jentzsch, A., Kontokostas, D., Mendes, P.N., Hellmann, S., Morsey, M., van Kleef, P., Auer, S., Bizer, C.: DBpedia – A Large-scale, Multilingual Knowledge Base Extracted from Wikipedia. Semantic Web Journal (2013)
Google Scholar
Lu, S., Ye, Y., Tsui, R., Su, H., Rexit, R., Wesaratchakit, S., Liu, X., Hwa, R.: Domain ontology-based feature reduction for high dimensional drug data and its application to 30-day heart failure readmission prediction. In: International Conference on Collaborative Computing (Collaboratecom), pp. 478–484 (2013)
Google Scholar
Mendes, P.N., Jakob, M., García-Silva, A., Bizer, C.: Dbpedia spotlight: Shedding light on the web of documents. In: Proceedings of the 7th International Conference on Semantic Systems, I-Semantics (2011)
Google Scholar
Molina, L.C., Belanche, L., Nebot, À.: Feature selection algorithms: A survey and experimental evaluation. In: International Conference on Data Mining (ICDM), pp. 306–313. IEEE (2002)
Google Scholar
Paulheim, H.: Generating possible interpretations for statistics from linked open data. In: Simperl, E., Cimiano, P., Polleres, A., Corcho, O., Presutti, V. (eds.) ESWC 2012. LNCS, vol. 7295, pp. 560–574. Springer, Heidelberg (2012)
Google Scholar
Paulheim, H., Fürnkranz, J.: Unsupervised Generation of Data Mining Features from Linked Open Data. In: International Conference on Web Intelligence, Mining, and Semantics, WIMS 2012 (2012)
Google Scholar
Paulheim, H., Ristoski, P., Mitichkin, E., Bizer, C.: Data mining with background knowledge from the web. In: RapidMiner World (to appear, 2014)
Google Scholar
Platt, J.C.: Sequential minimal optimization: A fast algorithm for training support vector machines. Technical report, Advances in Kernel Methods - Support Vector Learning (1998)
Google Scholar
Wang, B.B., Bob Mckay, R.I., Abbass, H.A., Barlow, M.: A comparative study for domain ontology guided feature extraction. In: Australasian Computer Science Conference (2003)
Google Scholar

Download references

Author information

Authors and Affiliations

Research Group Data and Web Science, University of Mannheim, Germany
Petar Ristoski & Heiko Paulheim

Authors

Petar Ristoski
View author publications
You can also search for this author in PubMed Google Scholar
Heiko Paulheim
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

Department of Knowledge Technologies, Jožef Stefan Institute, Jamova cesta 39, 1000, Ljubljana, Slovenia
Sašo Džeroski , Panče Panov & Dragi Kocev , &
Faculty of Administration, University of Ljubljana, Gosarjeva 5, 1000, Ljubljana, Slovenia
Ljupčo Todorovski

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Ristoski, P., Paulheim, H. (2014). Feature Selection in Hierarchical Feature Spaces. In: Džeroski, S., Panov, P., Kocev, D., Todorovski, L. (eds) Discovery Science. DS 2014. Lecture Notes in Computer Science(), vol 8777. Springer, Cham. https://doi.org/10.1007/978-3-319-11812-3_25

Download citation

DOI: https://doi.org/10.1007/978-3-319-11812-3_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11811-6
Online ISBN: 978-3-319-11812-3
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics