
Intellipaat Software Solutions Pvt. Ltd.

Hive

Series of Video Tutorials
From Basics to Advanced

Data files will be supplied along with these videos wherever deemed necessary.
Links to external websites shall be provided when suitable.

Pre-Requisites
Basic understanding of HDFS
MapReduce basics (helpful, although not necessary)
Cloudera Quick Start VM used for demonstration purposes

Hive

1. Introduction to Hive
What Is Hive?
Hive Schema and Data Storage
Comparing Hive to Traditional Databases
Hive vs. Pig
Hive Use Cases

2. Relational Data Analysis with Hive
Hive Databases and Tables
Basic HiveQL Syntax
Data Types
Joining Data Sets
Common Built-in Functions
Hands-On Exercise: Running Hive Queries on the Shell, Scripts, and Hue

3. Hive Data Management
Hive Data Formats
Creating Databases and Hive-Managed Tables
Loading Data into Hive
Altering Databases and Tables
Self-Managed Tables
Simplifying Queries with Views
Storing Query Results
Controlling Access to Data
Hands-On Exercise: Data Management with Hive


4. Hive Optimization
Understanding Query Performance
Partitioning
Bucketing
Indexing Data

5. Extending Hive
User-Defined Functions

6. User-Defined Functions, Optimizing Queries, Tips and Tricks for Performance Tuning

Recap: HDFS and MR

HDFS: storing data, fault tolerant and reliable
MR leverages HDFS to provide parallel processing on the stored data

Hive - Introduction

Typical Use Case

You have a large volume of unstructured data.
Place this data into some kind of queryable system.
Business analysts (BAs) can now run ad-hoc queries, summarize data, and aggregate data.
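
To make the idea of an ad-hoc aggregate query concrete, here is a minimal HiveQL sketch. It assumes the data has already been placed in a Hive table; the table name web_logs and the column status_code are hypothetical and do not come from the slides.

```sql
-- Hypothetical table and column names, used for illustration only.
-- Summarize the data: count how many rows fall under each status code.
SELECT status_code,
       COUNT(*) AS requests
FROM web_logs
GROUP BY status_code
ORDER BY requests DESC;
```

An analyst can write a query like this without writing any MapReduce code by hand.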

Hive - Introduction

RDBMS vs Hive

Pig vs Hive

Hive - Introduction

Hive is very similar to an RDBMS
SQL-like queries
HiveQL is based on the SQL-92 framework
It is safe to say it is SQL for the Hadoop DFS

Hive - Introduction

Cloudera Quick Start VM
Hive-User
HDFS copy command
Hue browser: to view Hive meta information
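
As a rough sketch of these steps, the commands below could be typed into the Hive command-line shell on the VM. This assumes the Hive CLI, which passes dfs commands through to HDFS. The local path /home/cloudera/sample.txt, the HDFS directory /user/cloudera/input, and the table name sample_table are hypothetical placeholders.

```sql
-- Copy a local file into HDFS from inside the Hive shell.
-- Both paths are hypothetical placeholders.
dfs -mkdir -p /user/cloudera/input;
dfs -put /home/cloudera/sample.txt /user/cloudera/input/;
dfs -ls /user/cloudera/input;

-- Inspect Hive meta information from the shell.
SHOW DATABASES;
SHOW TABLES;
-- Assumes a table named sample_table already exists (hypothetical name).
DESCRIBE FORMATTED sample_table;
```

The SHOW and DESCRIBE statements print the metadata as text in the shell.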

Hive - Introduction

Word Count Program

Load a large file and count the words:
Create a table in Hive
Ingest data from the local filesystem into HDFS
Run the Hive query
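
The three steps above can be sketched in HiveQL as follows. This is one common way to write a word count in Hive, not necessarily the version used in the videos. The table name docs and the local path /home/cloudera/big_file.txt are assumptions made for illustration.

```sql
-- Step 1: create a table with one STRING column; each row holds one line of text.
CREATE TABLE IF NOT EXISTS docs (line STRING);

-- Step 2: copy the file from the local filesystem into the table's HDFS directory.
-- The path is a hypothetical placeholder.
LOAD DATA LOCAL INPATH '/home/cloudera/big_file.txt'
OVERWRITE INTO TABLE docs;

-- Step 3: split each line on whitespace, emit one row per word, then count.
SELECT word,
       COUNT(*) AS occurrences
FROM (
  SELECT explode(split(line, '\\s+')) AS word
  FROM docs
) words
WHERE word != ''
GROUP BY word
ORDER BY occurrences DESC
LIMIT 20;
```

Here split turns a line into an array of words and explode turns that array into one row per word, so the outer query is an ordinary GROUP BY count.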

Hive - Introduction

So what's a Hive table?

Data: files in HDFS
Schema: metadata stored in an RDBMS (the Hive metastore), which points to the location of the data inside HDFS
Schema and data are decoupled.
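
One way to see the decoupling is with an external table, sketched below. The table name page_views, its columns, and the HDFS directory /user/cloudera/page_views are hypothetical; the example assumes tab-delimited text files already sit in that directory.

```sql
-- The schema is recorded in the metastore; the data stays where it already is in HDFS.
-- Table name, columns, and location are hypothetical placeholders.
CREATE EXTERNAL TABLE IF NOT EXISTS page_views (
  user_id   STRING,
  url       STRING,
  view_time STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/user/cloudera/page_views';

-- The output includes a Location field that points at the HDFS directory.
DESCRIBE FORMATTED page_views;

-- Dropping an external table removes only the schema; the files in HDFS remain.
DROP TABLE page_views;
```

Because the table is external, the final DROP TABLE deletes the metastore entry but leaves the files under the LOCATION directory untouched.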