Exploring Data with Spark
Apache Spark changed the landscape of big data and analytics when it came out, and developers welcomed it like nothing else. It quickly went from an ascendant technology to a superstar. It is one of the most active open source projects in the big data ecosystem; at the time of writing, the project has more than 1000 contributors. Many big data companies have started moving from MapReduce to Spark, and there is no single reason for them to do so: Spark improves how data is handled, and it is very easy to work with. Before Spark, if you wanted to do batch processing, interactive queries, machine learning, and stream analytics, you had to use multiple tools such as MapReduce, Hive, Storm, and so forth, and maintaining a system built from such a wide range of technologies is not easy. Apache Spark can handle all of these scenarios in a single framework, which makes developers’ lives easier. This is one of the many reasons that Spark is so popular in the big data community.