Applying genetic algorithms to the feature selection problem in information retrieval

  • Conference paper
Flexible Query Answering Systems (FQAS 1998)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1495)

Abstract

The demand for accuracy and speed in Information Retrieval processes has revealed the need for a good classification of the large collections of documents held in databases and on Web servers. Representing documents in the vector space model, with terms as features, makes it possible to apply Machine Learning techniques. This paper presents a filter method that selects the most relevant features before the classification process. A Genetic Algorithm (GA) is used as a powerful tool to search for solutions in the space of relevant features. The method has been implemented and some preliminary experiments have been carried out. The application of this technique to the vector space model in Information Retrieval is outlined as future work.
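As a rough illustration of the idea in the abstract, the sketch below runs a small GA over binary feature masks and scores each mask with a filter-style criterion, i.e. one that does not involve training a classifier. It is not the authors' method: the abstract does not state the encoding, operators, or fitness function, so the per-feature `relevance` scores, the size `penalty`, tournament selection, one-point crossover, and all parameter values are assumptions chosen only to make the example self-contained.

```python
import random


def filter_fitness(mask, relevance, penalty=0.1):
    """Hypothetical filter criterion: summed relevance of the selected
    features minus a penalty per selected feature (no classifier involved)."""
    selected = sum(r for bit, r in zip(mask, relevance) if bit)
    return selected - penalty * sum(mask)


def select_features(relevance, pop_size=20, generations=40, p_mut=0.05, seed=0):
    """Search for a good feature mask with a simple elitist GA.

    Returns the best mask found and the best fitness seen at each generation.
    Assumes at least two features so that one-point crossover is defined.
    """
    rng = random.Random(seed)
    n = len(relevance)

    def score(mask):
        return filter_fitness(mask, relevance)

    # Each individual is a list of 0/1 flags, one per feature (term).
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        history.append(score(pop[0]))
        next_pop = [pop[0][:]]  # elitism: carry the best mask over unchanged
        while len(next_pop) < pop_size:
            # Binary tournament selection of two parents.
            a = max(rng.sample(pop, 2), key=score)
            b = max(rng.sample(pop, 2), key=score)
            cut = rng.randint(1, n - 1)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with probability p_mut per gene.
            child = [1 - bit if rng.random() < p_mut else bit for bit in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=score)
    history.append(score(best))
    return best, history


# Made-up relevance scores for six terms, for illustration only.
relevance = [0.9, 0.02, 0.7, 0.01, 0.5, 0.03]
best_mask, history = select_features(relevance)
print(best_mask, history[-1])
```

Because the best individual is copied into each new generation, the best fitness never decreases over the run. A wrapper method would replace `filter_fitness` with the accuracy of a classifier trained on the selected features, at a much higher cost per evaluation.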

Editor information

Troels Andreasen, Henning Christiansen, Henrik Legind Larsen

Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Martín-Bautista, M.J., Vila, MA. (1998). Applying genetic algorithms to the feature selection problem in information retrieval. In: Andreasen, T., Christiansen, H., Larsen, H.L. (eds) Flexible Query Answering Systems. FQAS 1998. Lecture Notes in Computer Science, vol 1495. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0056008

  • DOI: https://doi.org/10.1007/BFb0056008

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-65082-9

  • Online ISBN: 978-3-540-49655-7

  • eBook Packages: Springer Book Archive
