Skip to main content

ETL with Hadoop

  • Chapter
  • First Online:
Book cover Big Data Made Easy
  • 5134 Accesses

Abstract

Given that Hadoop-based Map Reduce programming is a relatively new skill, there is likely to be a shortage of highly skilled staff for some time, and those skills will come at a premium price. ETL (extract, transform, and load ) tools, like Pentaho and Talend, offer a visual, component-based method to create Map Reduce jobs, allowing ETL chains to be created and manipulated as visual objects. Such tools are a simpler and quicker way for staff to approach Map Reduce programming. I’m not suggesting that they are a replacement for Java or Pig-based code, but as an entry point they offer a great deal of pre-defined functionality that can be merged so that complex ETL chains can be created and scheduled. This chapter will examine these two tools from installation to use, and along the way, I will offer some resolutions for common problems and errors you might encounter.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 34.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 44.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Author information

Authors and Affiliations

Authors

Rights and permissions

Reprints and permissions

Copyright information

© 2015 Michael Frampton

About this chapter

Cite this chapter

Frampton, M. (2015). ETL with Hadoop. In: Big Data Made Easy. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4842-0094-0_10

Download citation

Publish with us

Policies and ethics