
Join Query Processing in MapReduce Environment

  • Conference paper
Advances in Communication, Network, and Computing (CNC 2012)

Abstract

MapReduce is a framework for processing large data sets, in which straightforward computations are performed by hundreds of machines on large input data. Data can be stored and retrieved using structured queries. Join queries are among the most frequently used and most important of these, so it is crucial to find efficient join processing techniques. This paper provides an overview of join query processing techniques and proposes a strategy for selecting the most suitable join processing algorithm.
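To make the idea of a join in MapReduce concrete, the following is a minimal single-process sketch of a repartition (reduce-side) join, one commonly described technique. It is an illustration only and is not taken from the paper: the function names (`map_phase`, `shuffle`, `reduce_phase`, `repartition_join`), the `"L"`/`"R"` tags, and the sample `users` and `clicks` data are all assumptions, and the shuffle step is simulated in memory rather than run on a cluster.

```python
from collections import defaultdict
from itertools import product


def map_phase(records, tag, key_index):
    """Emit (join_key, (tag, record)) so the reducer can tell the two inputs apart."""
    for record in records:
        yield record[key_index], (tag, record)


def shuffle(pairs):
    """Group mapper output by key, standing in for the framework's shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Pair every left record with every right record that shares the key."""
    left = [rec for tag, rec in values if tag == "L"]
    right = [rec for tag, rec in values if tag == "R"]
    for l_rec, r_rec in product(left, right):
        yield key, l_rec, r_rec


def repartition_join(left, right, left_key=0, right_key=0):
    """Inner equi-join of two lists of tuples (hypothetical helper)."""
    pairs = list(map_phase(left, "L", left_key))
    pairs += list(map_phase(right, "R", right_key))
    joined = []
    for key, values in sorted(shuffle(pairs).items()):
        joined.extend(reduce_phase(key, values))
    return joined


# Illustrative data, not from the paper.
users = [(1, "alice"), (2, "bob")]
clicks = [(1, "/home"), (1, "/cart"), (3, "/faq")]
result = repartition_join(users, clicks)
```

In this sketch every record of both inputs passes through the shuffle, which is why the relative sizes of the inputs matter when choosing among join algorithms.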


Copyright information

© 2012 ICST Institute for Computer Science, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Shaikh, A., Jindal, R. (2012). Join Query Processing in MapReduce Environment. In: Das, V.V., Stephen, J. (eds) Advances in Communication, Network, and Computing. CNC 2012. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 108. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35615-5_42

  • DOI: https://doi.org/10.1007/978-3-642-35615-5_42

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35614-8

  • Online ISBN: 978-3-642-35615-5

  • eBook Packages: Computer Science, Computer Science (R0)
