Practical Hive

A Guide to Hadoop's Data Warehouse System

  • Scott Shaw
  • Andreas François Vermeulen
  • Ankur Gupta
  • David Kjerrumgaard

About this book

Introduction

Dive into the world of SQL on Hadoop and get the most out of your Hive data warehouses. This book is your go-to resource for using Hive: authors Scott Shaw, Ankur Gupta, David Kjerrumgaard, and Andreas François Vermeulen take you through learning HiveQL, the SQL-like language specific to Hive, to analyze, export, and massage the data stored across your Hadoop environment. From deploying Hive on your hardware or virtual machine and setting up its initial configuration to learning how Hive interacts with Hadoop, MapReduce, Tez, and other big data technologies, Practical Hive gives you a detailed treatment of the software.

In addition, this book discusses the value of open source software, Hive performance tuning, and how to leverage semi-structured and unstructured data. 

What You Will Learn

  • Install and configure Hive for new and existing datasets
  • Perform DDL operations
  • Execute efficient DML operations
  • Use tables, partitions, buckets, and user-defined functions
  • Discover performance tuning tips and Hive best practices

Who This Book Is For

Developers, companies, and professionals who work with large amounts of data and need software that can efficiently manage large volumes of input. Readers are assumed to be able to work with SQL.


Keywords

ORC, AVRO, DDL, Sqoop, RDBMS, Hive streaming, Flume, sentiment analysis, semi-structured data, YARN, Ranger integration, Atlas integration, Hcatalog, HiveQL, Hadoop, MapReduce

Authors and affiliations

  • Scott Shaw (1)
  • Andreas François Vermeulen (2)
  • Ankur Gupta (3)
  • David Kjerrumgaard (4)

  1. Saint Louis, USA
  2. West Kilbride, North Ayrshire, United Kingdom
  3. Uxbridge, United Kingdom
  4. Henderson, USA