GaXsearch: An XML Information Retrieval Mechanism Using Genetic Algorithms

  • K. G. Srinivasa
  • S. Sharath
  • K. R. Venugopal
  • Lalit M. Patnaik
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3809)


XML technology, with its self-describing and extensible tags, contributes significantly to the next-generation semantic web. The search techniques currently used for HTML and text documents do not retrieve relevant XML documents efficiently. In this paper, Genetic Algorithms are used to learn which tags are useful for indexing. The resulting indices and a relationship strength metric are used to extract semantically related elements from XML documents quickly and accurately. Experiments are conducted on the DataBase systems and Logic Programming (DBLP) XML corpus and evaluated for precision and recall. The proposed GaXsearch outperforms XSEarch [1] and XRank [2] in both accuracy and query execution time.
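
The abstract reports precision and recall without restating how they are computed. As a minimal sketch of the standard definitions (not the authors' evaluation code), the following assumes that, for each query, the set of retrieved elements and the set of relevant elements are known; the function name `precision_recall` and the element identifiers are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    precision = |retrieved ∩ relevant| / |retrieved|
    recall    = |retrieved ∩ relevant| / |relevant|
    An empty denominator yields 0.0 (a convention chosen for this sketch).
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical element identifiers, for illustration only.
retrieved = ["e1", "e2", "e3", "e4"]
relevant = ["e2", "e4", "e7"]
precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case two of the four retrieved elements are relevant, and two of the three relevant elements are retrieved.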


User Query; Keyword Query; Relationship Strength; Query Semantic; Document Tree
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




  1. Cohen, S., Mamou, J., Kanza, Y., Sagiv, Y.: XSEarch: A Semantic Search Engine for XML. In: VLDB 2003, pp. 45–56 (2003)
  2. Guo, L., et al.: XRANK: Ranked Keyword Search over XML Documents. In: SIGMOD 2003 (2003)
  3. Luk, R., et al.: A Survey of Search Engines for XML Documents. In: SIGIR Workshop on XML and IR (2000)
  4. World Wide Web Consortium: XQuery: A Query Language for XML. W3C Working Draft,
  5. Yang, J., Korfhage, R.R.: Effects of Query Term Weights Modification in Annual Document Retrieval: A Study Based on a Genetic Algorithm. In: Proceedings of the Second Symposium on Document Analysis and Information Retrieval, pp. 271–285 (1993)
  6. Yang, J., Korfhage, R.R., Rasmussen, E.: Query Improvement in Information Retrieval Using Genetic Algorithms: A Report on the Experiments of the TREC Project. In: Proceedings of the First Text Retrieval Conference (TREC-1), pp. 31–58 (1993)
  7. Kim, S., Zhang, B.T.: Genetic Mining of HTML Structures for Effective Web Document Retrieval. Applied Intelligence 18, 243–256 (2003)
  8. DBLP XML Records (February 2001),

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • K. G. Srinivasa (1)
  • S. Sharath (2)
  • K. R. Venugopal (1)
  • Lalit M. Patnaik (3)

  1. Department of Computer Science and Engineering, University Visvesvaraya College of Engineering, Bangalore, India
  2. Infosys Technologies, Bangalore, India
  3. Microprocessor Applications Laboratory, Indian Institute of Science, India
