Web Retrieval with Query Expansion: A Parallel Retrieving Technique

  • Conference paper
Transactions on Engineering Technologies (WCECS 2015)

  • 626 Accesses

Abstract

Most people consider the World Wide Web (WWW) a mine of information. The explosive growth of the WWW, in both the amount of information and the content of Web pages, makes traditional search engines an inadequate approach to retrieving, in a short time, the documents or web pages most relevant to user needs (degree of relevance). To improve the information retrieval process in terms of both time and degree of relevance to user needs, parallel genetic algorithms could be utilized. In this paper, an island genetic algorithm (IGA) is utilized to achieve parallelism and speed up the web information retrieval process. Four islands with different selection methods and fitness functions are suggested to improve the degree of relevance. To achieve parallel behavior, the four islands are executed independently on different servers. A query expansion technique is used to add useful words to the user query and increase the number of retrieved documents. Finally, the results obtained by the four islands are combined and passed to a decision-making phase that selects the documents most pertinent to user needs. The cosine similarity measure is used to evaluate the performance of the proposed technique.
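The pipeline outlined in the abstract (expand the query, collect results from several islands, rank by cosine similarity) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the fitness functions, the source of expansion terms, or the decision-making rule, so the function names `expand_query` and `merge_island_results`, the `related_terms` dictionary, and the use of raw term-frequency vectors are all assumptions made for illustration. The genetic search inside each island is omitted; each island is represented only by the list of documents it returned.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine of the angle between raw term-frequency vectors of two texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    # Only terms present in both texts contribute to the dot product.
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def expand_query(query: str, related_terms: dict) -> str:
    """Append related terms to the query (related_terms is a hypothetical lookup)."""
    words = query.lower().split()
    for word in list(words):
        for term in related_terms.get(word, []):
            if term not in words:
                words.append(term)
    return " ".join(words)


def merge_island_results(island_results, query: str, top_k: int = 3):
    """Pool the documents returned by all islands and rank them against the query.

    Assumption: the decision phase is modelled as a plain cosine-similarity
    ranking; ties are broken alphabetically to keep the output deterministic.
    """
    pool = {doc for docs in island_results for doc in docs}
    ranked = sorted(pool, key=lambda d: (-cosine_similarity(query, d), d))
    return ranked[:top_k]


if __name__ == "__main__":
    query = expand_query("web search", {"search": ["retrieval"]})
    islands = [
        ["web retrieval methods", "cooking recipes"],
        ["web retrieval methods", "genetic algorithm web"],
    ]
    for doc in merge_island_results(islands, query, top_k=2):
        print(round(cosine_similarity(query, doc), 3), doc)
```

In the paper's setting each island would run on a separate server; here the per-island lists are simply passed in, so the sketch shows only the combination and ranking step.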

Author information

Corresponding author

Correspondence to Noha Mezyan.

Copyright information

© 2017 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Mezyan, N., Samawi, V.W. (2017). Web Retrieval with Query Expansion: A Parallel Retrieving Technique. In: Ao, SI., Kim, H., Amouzegar, M. (eds) Transactions on Engineering Technologies. WCECS 2015. Springer, Singapore. https://doi.org/10.1007/978-981-10-2717-8_28

  • DOI: https://doi.org/10.1007/978-981-10-2717-8_28

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-2716-1

  • Online ISBN: 978-981-10-2717-8

  • eBook Packages: Engineering, Engineering (R0)
