In the previous chapter, we focused on long-running processing jobs that run in a Hadoop cluster and leverage YARN or Hive. In this chapter, I would like to introduce you to what I call the 2014 way of processing data: streaming. Indeed, more and more data processing infrastructures rely on streaming or logging architectures that ingest the data, apply some transformations, and then transport the data to a persistence layer.


Keywords: Configuration File, Processing Pipeline, Site Traffic, Hadoop Cluster, Clickstream Data

Copyright information

© Bahaaldine Azarmi 2016

Authors and Affiliations

  • Bahaaldine Azarmi, Saint Cloud, France
