Faster BlockMax WAND with Longer Skipping

  • Antonio Mallia
  • Elia Porciani
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11437)

Abstract

One of the major problems for modern search engines is to keep up with the tremendous growth in the size of the web and the number of queries submitted by users. The amount of data being generated today can only be processed and managed with specialized technologies.

BlockMax WAND and the more recent Variable BlockMax WAND represent the most advanced query processing algorithms that make use of dynamic pruning techniques, which allow them to retrieve the top-k most relevant documents for a given query without any degradation in ranking effectiveness. In this paper, we describe a new technique for the BlockMax WAND family of query processing algorithms, which improves block skipping in order to increase their efficiency. We show that our optimization is able to improve query processing speed on short queries by up to 37% with negligible additional space overhead.
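
To make the idea of safe dynamic pruning with per-block score upper bounds concrete, the following is a minimal sketch in Python. It is not BlockMax WAND itself and not the technique proposed in this paper: it assumes a toy in-memory index, fixed-size docID ranges as blocks, strictly positive precomputed term scores, and hypothetical names (`BLOCK_SIZE`, `build_block_max`, `top_k_block_pruned`). It only illustrates why a block whose summed maxima cannot beat the current top-k threshold can be skipped without changing the result.

```python
import heapq
from collections import defaultdict

BLOCK_SIZE = 4  # hypothetical block width, measured in docIDs


def build_block_max(postings):
    """postings: {term: {doc_id: score}} -> {term: {block_id: max score}}."""
    block_max = {}
    for term, plist in postings.items():
        maxima = defaultdict(float)
        for doc_id, score in plist.items():
            block = doc_id // BLOCK_SIZE
            maxima[block] = max(maxima[block], score)
        block_max[term] = dict(maxima)
    return block_max


def top_k_exhaustive(postings, query, k):
    """Reference result: score every posting, then sort."""
    totals = defaultdict(float)
    for term in query:
        for doc_id, score in postings.get(term, {}).items():
            totals[doc_id] += score
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))[:k]


def top_k_block_pruned(postings, block_max, query, k):
    """Same top-k, but skip blocks whose upper bound cannot enter the heap."""
    heap = []  # min-heap of (score, -doc_id); the root is the current k-th best
    skipped = 0
    blocks = sorted({b for t in query for b in block_max.get(t, {})})
    for block in blocks:
        # Upper bound on the score of any document inside this block.
        upper = sum(block_max.get(t, {}).get(block, 0.0) for t in query)
        threshold = heap[0][0] if len(heap) == k else float("-inf")
        if upper <= threshold:
            skipped += 1  # no document here can displace the current top-k
            continue
        for doc_id in range(block * BLOCK_SIZE, (block + 1) * BLOCK_SIZE):
            score = sum(postings.get(t, {}).get(doc_id, 0.0) for t in query)
            if score == 0.0:
                continue  # document matches no query term (scores assumed > 0)
            if len(heap) < k:
                heapq.heappush(heap, (score, -doc_id))
            elif score > heap[0][0]:
                heapq.heapreplace(heap, (score, -doc_id))
    results = sorted(((-neg_id, s) for s, neg_id in heap),
                     key=lambda item: (-item[1], item[0]))
    return results, skipped
```

Calling `top_k_block_pruned` and `top_k_exhaustive` on the same postings and query should return the same ranked list, while the `skipped` counter reports how many blocks were never scored. Real implementations operate on compressed posting lists with per-list blocks and pivot selection, which this sketch deliberately leaves out.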

Keywords

Top-k query processing; Inverted index; Early termination

Notes

Acknowledgments

Antonio Mallia’s research was partially supported by NSF Grant IIS-1718680 “Index Sharding and Query Routing in Distributed Search Engines”.


Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Computer Science and Engineering, New York University, New York, USA
  2. Sease Ltd., London, UK
