Abstract
A number of strategies for parallel query processing in large-scale Web search engines have been proposed in recent years. Their designs assume that computers never fail. However, in the actual data centers that support Web search engines, individual cluster processors can enter or leave service dynamically because of transient or permanent faults. This paper studies the suitability of efficient query processing strategies under a standard setting in which processor replication is used to improve query throughput and support fault tolerance.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gomez-Pantoja, C., Marin, M., Gil-Costa, V., Bonacic, C. (2011). An Evaluation of Fault-Tolerant Query Processing for Web Search Engines. In: Jeannot, E., Namyst, R., Roman, J. (eds) Euro-Par 2011 Parallel Processing. Euro-Par 2011. Lecture Notes in Computer Science, vol 6852. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23400-2_36
DOI: https://doi.org/10.1007/978-3-642-23400-2_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23399-9
Online ISBN: 978-3-642-23400-2
eBook Packages: Computer Science, Computer Science (R0)