Abstract
Search engines are now the dominant tool for finding information on the web. However, web search engines usually return web page references in a single global ranking, which makes it difficult for users to browse the different topics captured in the result set. Recently, meta-search engine systems have appeared that discover knowledge in these web search results, giving the user the possibility to browse the different topics contained in the result set. In this paper, we focus on the problem of determining the different thematic groups in the results that existing web search engines provide. We propose a novel system that exploits semantic entities of Wikipedia to group the result set into different topic groups, according to the various meanings of the provided query. The proposed method uses a number of semantic annotation techniques based on knowledge bases, such as WordNet and Wikipedia, in order to identify the different senses of each query term. Finally, the method annotates the extracted topics using information derived from the clusters, which are subsequently presented to the end user.
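As a rough illustration of the grouping idea described above, the following minimal sketch assigns search result snippets to the query sense whose related entities they mention most often. It is not the system proposed in the paper: the phrase-to-entity table, the sense inventory, and the function names `annotate` and `group_by_sense` are all hypothetical stand-ins for a real semantic annotator and knowledge base.

```python
from collections import defaultdict

# Hypothetical stand-in for a semantic annotator: maps surface phrases
# to Wikipedia-style entity titles. A real system would query an annotator.
PHRASE_TO_ENTITY = {
    "big cat": "Felidae",
    "rainforest": "Rainforest",
    "luxury car": "Luxury vehicle",
    "sedan": "Sedan (automobile)",
}

# Hypothetical senses of the query term "jaguar", each with related entities.
SENSES = {
    "Jaguar (animal)": {"Felidae", "Rainforest"},
    "Jaguar Cars": {"Luxury vehicle", "Sedan (automobile)"},
}


def annotate(snippet):
    """Return the set of entities whose trigger phrase occurs in the snippet."""
    text = snippet.lower()
    return {entity for phrase, entity in PHRASE_TO_ENTITY.items() if phrase in text}


def group_by_sense(snippets):
    """Assign each snippet to the sense sharing the most entities with it.

    Snippets that share no entity with any sense go to the "Other" group.
    """
    groups = defaultdict(list)
    for snippet in snippets:
        entities = annotate(snippet)
        best_sense, best_overlap = "Other", 0
        for sense, related in SENSES.items():
            overlap = len(entities & related)
            if overlap > best_overlap:
                best_sense, best_overlap = sense, overlap
        groups[best_sense].append(snippet)
    return dict(groups)


if __name__ == "__main__":
    results = [
        "The jaguar is a big cat found in the rainforest.",
        "Jaguar unveils a new luxury car, a four-door sedan.",
        "Jaguar news and updates.",
    ]
    for sense, members in group_by_sense(results).items():
        print(sense, "->", members)
```

In this toy setting the sense name doubles as the group label; a fuller treatment would derive labels from the entities found in each cluster.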
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kanavos, A., Makris, C., Plegas, Y., Theodoridis, E. (2013). Extracting Knowledge from Web Search Engine Using Wikipedia. In: Iliadis, L., Papadopoulos, H., Jayne, C. (eds) Engineering Applications of Neural Networks. EANN 2013. Communications in Computer and Information Science, vol 384. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41016-1_11
DOI: https://doi.org/10.1007/978-3-642-41016-1_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41015-4
Online ISBN: 978-3-642-41016-1
eBook Packages: Computer Science, Computer Science (R0)