Abstract
Scalding is a Scala-based library built on top of Cascading, a Java library that provides an abstraction over the low-level Hadoop API. It is comparable to Pig, but brings the advantages of Scala to building MapReduce jobs [1].
References
Scala, "The Scala Programming Language," 2002. [Online]. Available: http://www.scalalang.org/.
Twitter, Scalding, 2011. [Online]. Available: https://github.com/twitter/scalding.
Wensel, C. K. "Cascading: Defining and executing complex and fault tolerant data processing workflows on a Hadoop cluster" (2008).
Cascading, "Cascading: Application Platform for Enterprise Big Data." [Online]. Available: http://www.cascading.org/
Zaharia, Matei, et al. "Spark: cluster computing with working sets." Proceedings of the 2nd USENIX conference on Hot topics in cloud computing. 2010.
B. Hindman, A. Konwinski, M. Zaharia, and I. Stoica. A common substrate for cluster computing. In Workshop on Hot Topics in Cloud Computing (HotCloud) 2009, 2009.
Spark, Apache. [Online] Available: http://spark.incubator.apache.org/docs/latest/
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Srinivasa, K., Muppalla, A.K. (2015). Programming Internals of Scalding and Spark. In: Guide to High Performance Distributed Computing. Computer Communications and Networks. Springer, Cham. https://doi.org/10.1007/978-3-319-13497-0_4
DOI: https://doi.org/10.1007/978-3-319-13497-0_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13496-3
Online ISBN: 978-3-319-13497-0
eBook Packages: Computer Science, Computer Science (R0)