Abstract
MapReduce is a framework for processing large data sets, in which straightforward computations are performed by hundreds of machines on large input data. Data can be stored and retrieved using structured queries. Join queries are among the most frequently used and important of these, so it is crucial to find efficient join processing techniques. This paper provides an overview of join query processing techniques and proposes a strategy for selecting the most suitable join processing algorithm.
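To make the idea of a join in MapReduce concrete, the sketch below simulates a reduce-side (repartition) join of two small relations on a single machine. It is an illustration only, not the strategy proposed in the paper: the relation names (`users`, `orders`), the source tags (`"U"`, `"O"`), and every function name are assumptions introduced here, and the in-memory `shuffle` stands in for the grouping a real MapReduce runtime performs across machines.

```python
from collections import defaultdict


def map_phase(users, orders):
    """Emit (join_key, tagged_record) pairs; the tag records the source relation."""
    for user_id, name in users:
        yield user_id, ("U", name)
    for order_id, user_id, amount in orders:
        yield user_id, ("O", (order_id, amount))


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """For each key, pair every user record with every order record."""
    for key, values in groups.items():
        names = [v for tag, v in values if tag == "U"]
        order_rows = [v for tag, v in values if tag == "O"]
        for name in names:
            for order_id, amount in order_rows:
                yield key, name, order_id, amount


def repartition_join(users, orders):
    """Inner equi-join of users and orders on user_id, sorted for stable output."""
    return sorted(reduce_phase(shuffle(map_phase(users, orders))))


# Hypothetical sample data: (user_id, name) and (order_id, user_id, amount).
users = [(1, "ana"), (2, "bo"), (3, "cy")]
orders = [(100, 1, 25.0), (101, 1, 7.5), (102, 3, 12.0), (103, 9, 1.0)]
result = repartition_join(users, orders)
```

In this sketch every record of both relations passes through the shuffle, which is the main cost of a reduce-side join. Keys that appear in only one relation (user 2 and order 103 here) produce no output, as expected for an inner join.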
Copyright information
© 2012 ICST Institute for Computer Science, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Shaikh, A., Jindal, R. (2012). Join Query Processing in MapReduce Environment. In: Das, V.V., Stephen, J. (eds) Advances in Communication, Network, and Computing. CNC 2012. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 108. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35615-5_42
DOI: https://doi.org/10.1007/978-3-642-35615-5_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35614-8
Online ISBN: 978-3-642-35615-5
eBook Packages: Computer Science, Computer Science (R0)