Web Retrieval with Query Expansion: A Parallel Retrieving Technique

Mezyan, Noha; Samawi, Venus W.

doi:10.1007/978-981-10-2717-8_28


Included in the following conference series:

The World Congress on Engineering and Computer Science


Abstract

The World Wide Web (WWW) is widely regarded as a mine of information. Its explosive growth, in both the amount of information and the content of web pages, makes traditional search engines inadequate for quickly retrieving the documents or web pages most relevant to user needs (degree of relevance). Parallel genetic algorithms can improve the information retrieval process in terms of both time and degree of relevance. In this paper, an island genetic algorithm (IGA) is used to achieve parallelism and speed up the web information retrieval process. Four islands, each with a different selection method and fitness function, are proposed to improve the degree of relevance. To achieve parallel behavior, the four islands are executed independently on different servers. A query expansion technique adds useful words to the user query and increases the number of retrieved documents. Finally, the results obtained by the four islands are combined and passed to a decision-making phase that selects the documents most pertinent to user needs. The cosine similarity measure is used to evaluate the performance of the proposed technique.
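The abstract does not give implementation details, so the following is only a minimal sketch of three of the ideas it names: query expansion, combining the results of several islands, and ranking by cosine similarity. The function names, the `related` lookup table, and the use of raw term-frequency vectors are all assumptions made for illustration; the genetic algorithm itself (selection methods, fitness functions) is not shown, and this is not the authors' implementation.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between raw term-frequency vectors of two texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    # An empty text has a zero-length vector; treat its similarity as 0.
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def expand_query(query, related):
    """Append related words to the query.

    `related` is a hypothetical term -> list-of-words table; the paper's
    actual expansion source is not described in the abstract.
    """
    terms = query.lower().split()
    for term in list(terms):
        for word in related.get(term, []):
            if word not in terms:
                terms.append(word)
    return " ".join(terms)


def merge_and_rank(query, island_results, top_k=3):
    """Union the documents returned by all islands and rank them by
    cosine similarity to the query (ties broken alphabetically)."""
    candidates = {doc for docs in island_results for doc in docs}
    ranked = sorted(candidates, key=lambda d: (-cosine_similarity(query, d), d))
    return ranked[:top_k]
```

For example, `expand_query("web search", {"search": ["retrieval"]})` yields `"web search retrieval"`, and the expanded query can then be passed to `merge_and_rank` together with one result list per island. Ranking the merged set by cosine similarity is only one possible reading of the decision-making phase.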



Author information

Authors and Affiliations

Department of Computer Science, Al-Albayt University, P.O. Box 130040, Mafraq, 25113, Jordan
Noha Mezyan
Department of Computer Information System, Amman Arab University, P.O. Box 2234, Amman, 11953, Jordan
Venus W. Samawi


Corresponding author

Correspondence to Noha Mezyan.

Editor information

Editors and Affiliations

IAENG Secretariat, International Association of Engineers, Hong Kong, Hong Kong
Sio-Iong Ao
Department of Computer and Communication, Engineering College, Catholic University of Daegu, Daegu, Korea (Republic of)
Haeng Kon Kim
Provost and Senior Vice-President, University of New Orleans, New Orleans, Louisiana, USA
Mahyar A. Amouzegar


About this paper

Cite this paper

Mezyan, N., Samawi, V.W. (2017). Web Retrieval with Query Expansion: A Parallel Retrieving Technique. In: Ao, SI., Kim, H., Amouzegar, M. (eds) Transactions on Engineering Technologies. WCECS 2015. Springer, Singapore. https://doi.org/10.1007/978-981-10-2717-8_28


DOI: https://doi.org/10.1007/978-981-10-2717-8_28
Published: 07 February 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-2716-1
Online ISBN: 978-981-10-2717-8
eBook Packages: Engineering, Engineering (R0)
