A Search Engine Accepting On-Line Updates

  • Mauricio Marin
  • Carolina Bonacic
  • Veronica Gil Costa
  • Carlos Gomez
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4641)

Abstract

We describe and evaluate the performance of a parallel search engine that copes efficiently with concurrent read/write operations. Read operations take the usual form of queries submitted to the search engine, and write operations take the form of new documents added to the text collection on-line. That is, insertions are interleaved with the main stream of user queries in an unpredictable arrival order, while query results respect causality. The search engine is built upon distributed inverted files, for which we propose generic strategies for load balancing and concurrency control.
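
To make the setting concrete, the following is a minimal sketch of interleaved inserts and queries over a partitioned inverted file. It is not the authors' system: the names `Shard`, `ToyEngine` and `process`, the document-partitioned layout, the conjunctive (AND) matching, and the sequential processing of a single operation stream are all assumptions made for illustration. Causality is approximated here by applying operations strictly in arrival order, so each query sees exactly the documents inserted before it.

```python
from collections import defaultdict


class Shard:
    """One partition of the inverted file: term -> set of document ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    def insert(self, doc_id, terms):
        # Add the document id to the posting list of every term it contains.
        for term in terms:
            self.postings[term].add(doc_id)

    def lookup(self, terms):
        # Conjunctive match: documents in this shard containing all terms.
        # .get() avoids creating empty posting lists for unseen terms.
        sets = [self.postings.get(t, set()) for t in terms]
        return set.intersection(*sets) if sets else set()


class ToyEngine:
    """Hypothetical document-partitioned engine fed by one operation stream."""

    def __init__(self, num_shards):
        self.shards = [Shard() for _ in range(num_shards)]

    def process(self, stream):
        """Apply operations in arrival order and return one result per query."""
        results = []
        for op, payload in stream:
            if op == "insert":
                doc_id, text = payload
                # Assumed placement rule: document id modulo shard count.
                shard = self.shards[doc_id % len(self.shards)]
                shard.insert(doc_id, text.lower().split())
            elif op == "query":
                terms = payload.lower().split()
                hits = set()
                # Broadcast the query to every shard and merge partial answers.
                for shard in self.shards:
                    hits |= shard.lookup(terms)
                results.append(sorted(hits))
            else:
                raise ValueError(f"unknown operation: {op!r}")
        return results


stream = [
    ("insert", (1, "parallel search engine")),
    ("query", "search engine"),
    ("insert", (2, "search engine updates")),
    ("query", "search engine"),
]
results = ToyEngine(num_shards=2).process(stream)
print(results)
```

In this sketch the first query returns only document 1 and the second returns documents 1 and 2, because the second insertion arrives between them. A real parallel engine would need explicit concurrency control to preserve the same ordering guarantee across processors, which is the problem the abstract describes.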

Keywords

Search Engine, Load Balance, Query Processing, Query Term, User Query
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Mauricio Marin (1)
  • Carolina Bonacic (2)
  • Veronica Gil Costa (3)
  • Carlos Gomez (1)
  1. Yahoo! Research, Santiago, University of Chile
  2. ARTECS, Complutense University of Madrid, Spain
  3. DCC, University of San Luis, Argentina
