Scalable Top-K Query Processing Using Graphics Processing Unit

  • Conference paper
Languages and Compilers for Parallel Computing (LCPC 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11403)

Abstract

Top-K query processing is one of the most fundamental and performance-critical components of Web search engines. A number of techniques, such as dynamic pruning, have been proposed to reduce query processing time on the CPU. However, it has become increasingly difficult to further improve the efficiency of top-K query processing without hurting its effectiveness. Meanwhile, the Graphics Processing Unit (GPU), a powerful computing accelerator found in almost every computer today, is barely tapped in Web search engines. The biggest challenge in accelerating top-K query processing on the GPU is that the parallel nature of the GPU execution model prevents many CPU top-K query processing optimizations from being ported directly. A GPU with hundreds of cores is ideal for applications with massive parallelism, which is not readily available in existing CPU-oriented top-K query implementations.

This paper exploits the computational power of the GPU for top-K query processing. In particular, we propose a new domain-specific parallelization framework that uses the GPU to parallelize top-K query processing. The proposed framework is general enough for both disjunctive and conjunctive query processing modes. Experiments on TREC collections show that our GPU top-K query processing framework improves query processing time by a factor of 7 over state-of-the-art dynamic pruning methods in the disjunctive mode and by a factor of 6 in the conjunctive mode. Our results also show that our framework is faster than the previously known GPU baseline method. In particular, it is more scalable and efficient than the CPU and GPU baselines when K is large.
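
To make the two query processing modes concrete, the following is a minimal sequential sketch in Python. It is not the framework proposed in the paper and involves no GPU code or dynamic pruning; it only illustrates what a top-K result means in each mode. The index layout (a dict from term to per-document scores), the function name `top_k`, and the toy scores are all assumptions made for illustration.

```python
import heapq
from collections import defaultdict


def top_k(index, query_terms, k, mode="disjunctive"):
    """Return the k highest-scoring (doc_id, score) pairs for a query.

    index: hypothetical layout, dict of term -> {doc_id: per-term score}.
    mode:  "disjunctive" scores documents matching ANY query term;
           "conjunctive" keeps only documents matching ALL query terms.
    """
    terms = set(query_terms)
    scores = defaultdict(float)   # accumulated score per document
    matched = defaultdict(int)    # number of distinct query terms matched

    # Exhaustive term-at-a-time accumulation over the posting lists.
    for term in terms:
        for doc_id, score in index.get(term, {}).items():
            scores[doc_id] += score
            matched[doc_id] += 1

    if mode == "conjunctive":
        candidates = [(d, s) for d, s in scores.items() if matched[d] == len(terms)]
    else:
        candidates = list(scores.items())

    # Highest score first; ties are broken in favor of the smaller doc_id.
    return heapq.nlargest(k, candidates, key=lambda pair: (pair[1], -pair[0]))


# Toy index with made-up scores.
index = {
    "gpu": {1: 2.0, 2: 1.0, 3: 0.5},
    "search": {2: 2.5, 3: 1.0, 4: 3.0},
}

disjunctive = top_k(index, ["gpu", "search"], k=2, mode="disjunctive")
conjunctive = top_k(index, ["gpu", "search"], k=2, mode="conjunctive")
print(disjunctive)  # [(2, 3.5), (4, 3.0)]
print(conjunctive)  # [(2, 3.5), (3, 1.5)]
```

In this toy example, document 4 ranks second in the disjunctive mode although it matches only one term, while the conjunctive mode excludes it and returns document 3 instead.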

Author information

Corresponding author

Correspondence to Yulin Zhang.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhang, Y., Fang, H., Li, X. (2019). Scalable Top-K Query Processing Using Graphics Processing Unit. In: Rauchwerger, L. (eds) Languages and Compilers for Parallel Computing. LCPC 2017. Lecture Notes in Computer Science, vol 11403. Springer, Cham. https://doi.org/10.1007/978-3-030-35225-7_16

  • DOI: https://doi.org/10.1007/978-3-030-35225-7_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-35224-0

  • Online ISBN: 978-3-030-35225-7

  • eBook Packages: Computer Science, Computer Science (R0)
