Numeric Types
String Types
Date/Time Types
Complex Types
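Numeric Types:
Type        Memory allocation
TINYINT     1-byte signed integer (-128 to 127)
SMALLINT    2-byte signed integer (-32,768 to 32,767)
INT         4-byte signed integer
BIGINT      8-byte signed integer
FLOAT       4-byte single-precision floating point
DOUBLE      8-byte double-precision floating point
DECIMAL     Arbitrary precision and scale (up to 38 digits)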
String Types:
Type        Length
CHAR        255
VARCHAR     1 to 65535
STRING      No length specifier
Date/Time Types:
Type        Usage
Complex Types:
Type        Usage
Arrays      ARRAY<data_type>
            Negative values and non-constant expressions not allowed
Maps        MAP<primitive_type, data_type>
            Negative values and non-constant expressions not allowed
Unions      UNIONTYPE<data_type, data_type, ...>
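The types listed above can be combined in a single table definition. The following is a minimal sketch, not taken from the original tutorial: the table name employee_profile, its columns, and the delimiters are all assumptions chosen for illustration.

```sql
-- Hypothetical table mixing primitive and complex types (all names are assumptions).
CREATE TABLE employee_profile (
  id        INT,
  name      STRING,
  grade     CHAR(2),
  city      VARCHAR(50),
  salary    DECIMAL(10,2),
  joined_on DATE,
  skills    ARRAY<STRING>,
  phones    MAP<STRING, STRING>
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
COLLECTION ITEMS TERMINATED BY ','
MAP KEYS TERMINATED BY ':';

-- Array elements are read by position, map values by key.
SELECT name, skills[0], phones['home']
FROM employee_profile;
```

CHAR, VARCHAR, DATE, and DECIMAL with explicit precision are only available in newer Hive releases, so an older installation may reject parts of this definition.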
Sample code snippet for an internal table
1. Create the internal table
hive> CREATE TABLE guruhive_internaltable (id INT, Name STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t';
2. Load the data into the internal table
hive> LOAD DATA INPATH '/user/guru99hive/data.txt' INTO TABLE guruhive_internaltable;
Differences between internal and external tables:
Schema: Data on Schema (internal), Schema on Data (external)
Storage location: /user/hive/warehouse (internal), HDFS location (external)
Data availability: Within HDFS
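For contrast with the internal table created earlier, an external table can be declared over an existing HDFS directory. This is a minimal sketch under stated assumptions: the table name guruhive_externaltable and the LOCATION path are hypothetical and do not come from the original tutorial.

```sql
-- Hypothetical external table; the table name and HDFS path are assumptions.
CREATE EXTERNAL TABLE guruhive_externaltable (id INT, Name STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LOCATION '/user/guru99hive/external_data';

-- The "Table Type" field reports EXTERNAL_TABLE here and MANAGED_TABLE for an internal table.
DESCRIBE FORMATTED guruhive_externaltable;

-- Dropping an external table removes only the metadata; the files under LOCATION stay in HDFS.
DROP TABLE guruhive_externaltable;
```

Dropping an internal table, by comparison, deletes both the metadata and the data files in the warehouse directory.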
Tables, Partitions, and Buckets are the parts of Hive data modeling.
Partitions
Hive organizes tables into partitions, which divide a table into parts based on the values of one or more partition keys. The partition keys determine how the data is stored in the table.
For example, suppose a client has e-commerce data for its India operations, with the operations of all 38 states recorded in a single table. If we take the state column as the partition key and partition that data, we get 38 partitions, one per state, so that each state's data can be viewed separately in the partitioned table.
Sample code snippet for partitions
1. Create the table allstates
create table allstates(state string, District string, Enrolments string)
row format delimited
fields terminated by ',';
2. Load data into the table allstates
Load data local inpath '/home/hduser/Desktop/AllStates.csv' into table allstates;
3. Create the partitioned table
create table state_part(District string, Enrolments string)
PARTITIONED BY(state string);
4. Set the dynamic partition mode; this property must be set before a dynamic partition insert
set hive.exec.dynamic.partition.mode=nonstrict;
5. Load data into the partitioned table
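The loading step can be sketched as a dynamic partition insert. This is an illustrative sketch rather than the tutorial's exact statement; it assumes the allstates and state_part tables from the steps above.

```sql
-- Enable dynamic partitioning so Hive derives the partitions from the data.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- The partition column (state) must come last in the SELECT list.
INSERT OVERWRITE TABLE state_part PARTITION (state)
SELECT district, enrolments, state
FROM allstates;

-- List the partitions that were created, one per distinct state value.
SHOW PARTITIONS state_part;
```

Each distinct value of state becomes its own subdirectory under the table's storage location, which is what lets a query that filters on state read only the matching partition.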
Buckets
Buckets are used for efficient querying. Buckets in Hive provide an effective way of segregating table data into multiple files or directories.
The data present in partitions can be divided further into buckets.
The division is performed based on a hash of the columns selected in the table.
Hive applies a hash function to each record's bucketing column to decide which bucket the record is placed in.
In Hive, bucketing has to be enabled with:
set hive.enforce.bucketing=true;
Step 1) Create the bucketed table.
The data from the employees table is transferred into the 4 buckets created in step 1.
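The bucketing steps can be sketched in HiveQL as follows. This is a minimal sketch under stated assumptions: the table name sample_bucket and the columns name, salary, and country are hypothetical, and the employees table is assumed to contain those columns.

```sql
-- Needed on older Hive releases; from Hive 2.0 bucketing is always enforced.
SET hive.enforce.bucketing=true;

-- Hypothetical bucketed table: rows are assigned to 4 buckets by the hash of country.
CREATE TABLE sample_bucket (name STRING, salary INT, country STRING)
CLUSTERED BY (country) INTO 4 BUCKETS
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';

-- Populate the buckets from the source table (column names are assumptions).
INSERT OVERWRITE TABLE sample_bucket
SELECT name, salary, country
FROM employees;

-- Read a single bucket instead of scanning the whole table.
SELECT * FROM sample_bucket TABLESAMPLE (BUCKET 1 OUT OF 4 ON country);
```

After the insert, the table's storage directory holds one file per bucket, which is why sampling and some joins can skip most of the data.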
Views and Indexes
Views:
Views are similar to tables and are generated based on requirements.
Any result set can be saved as a view in Hive.
Usage is similar to views in SQL.
Views are read-only: they can be queried like tables but cannot be the target of LOAD, INSERT, or ALTER statements.
Creation of View:
Syntax:
CREATE VIEW <VIEWNAME> AS SELECT
Example:
hive> CREATE VIEW Sample_View AS SELECT * FROM employees WHERE salary>25000;
In this example, we create the view Sample_View, which displays all rows whose salary field is greater than 25000.
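Once created, a view is queried like a table. The following is a short sketch that assumes Sample_View already exists as defined in the example; the filter value in the query is arbitrary.

```sql
-- Query the view like a table; the stored SELECT runs against employees each time.
SELECT * FROM Sample_View WHERE salary > 50000;

-- Remove the view; the underlying employees table is not affected.
DROP VIEW IF EXISTS Sample_View;
```

A Hive view stores only the query definition, not a copy of the data, so it always reflects the current contents of the underlying table.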
Index:
An index is a pointer to a particular column of a table.
The user has to define the index manually.
Creating an index means creating a pointer to a particular column of the table.
Changes made to the column in the table are stored using the index value created on that column.
Syntax:
CREATE INDEX <INDEX_NAME> ON TABLE <TABLE_NAME> (<column names>) AS '<index type>' WITH DEFERRED REBUILD
Example:
CREATE INDEX sample_Index ON TABLE guruhive_internaltable (id) AS 'COMPACT' WITH DEFERRED REBUILD;
Here we create an index on the id column of the table guruhive_internaltable.
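The lifecycle of an index can be sketched as follows. This is an illustrative sketch for Hive releases before 3.0, where indexing was removed; the choice of the 'COMPACT' index type is an assumption, since the original text does not name one.

```sql
-- Create the index definition; WITH DEFERRED REBUILD leaves it empty until rebuilt.
CREATE INDEX sample_index ON TABLE guruhive_internaltable (id)
AS 'COMPACT' WITH DEFERRED REBUILD;

-- Build (or refresh) the index contents from the current table data.
ALTER INDEX sample_index ON guruhive_internaltable REBUILD;

-- List the indexes defined on the table.
SHOW FORMATTED INDEX ON guruhive_internaltable;

-- Remove the index when it is no longer needed.
DROP INDEX IF EXISTS sample_index ON guruhive_internaltable;
```

The index is not maintained automatically when the table data changes, so the REBUILD statement has to be rerun after new data is loaded.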