Learn Hive in 24 Hours
By Alex Nordeen
About this ebook
Apache Hive is a data warehouse system that works within the Hadoop ecosystem. It provides features such as data summarization, ad-hoc querying, and analysis of large datasets. If you are not an experienced programmer, this edition will teach you how to use Hive queries without writing complex code.
Many users struggle to find a course dedicated to Hive. The goal of this e-book is to cover Hive, and only Hive, with a minimum of jargon. The notes, lessons and hands-on examples in this small e-book are simplified and presented to answer your questions about Hive. Instead of writing long MapReduce programs in Java, the e-book shows how to write the same program with a minimal code snippet.
Beginners and experienced users alike will enjoy this book. They will discover and learn more Hive patterns for data processing and data integration. Unlike other e-books, which skip basic details on the assumption that readers already know the subject, this edition pays attention to every small aspect of Hive, such as "how to set up and configure Hive in your environment".
This e-book is also helpful for those who just want to explore Hive and don't want to spend big bucks on short courses. You will quickly learn, apply and share your Hive knowledge with this e-book.
Table of contents
Chapter 1: Introduction
What is Hive?
Hive Architecture
Different modes of Hive
What is Hive Server2 (HS2)?
Hive vs Map Reduce
Chapter 2: Installation and Configuration
Installation of Hive
Hive shell commands
Install and configure MYSQL database
Chapter 3: Data operations
Data types in Hive
Creation and dropping of Database in Hive
Create, Drop and altering of tables in Hive
Table types and its Usage
Partitions
Buckets
Chapter 4: Queries and Implementation
Order by query
Group by query
Sort by
Cluster By
Distribute By
Join queries
Different type of joins
Sub queries
Embedding custom scripts
UDFs (User-Defined Functions)
Chapter 5: Query Language, Built-in Operators and Functions
Hive Query Language (HQL)
Built-in operators
Built-in functions
Chapter 6: Data Extraction
Working with Structured Data using Hive
Working with Semi structured data using Hive (XML, JSON)
Hive in Real time projects – When and Where to Use
Copyright 2021 - All Rights Reserved – Alex Nordeen
ALL RIGHTS RESERVED. No part of this publication may be reproduced or transmitted in any form whatsoever, electronic, or mechanical, including photocopying, recording, or by any informational storage or retrieval system without express written, dated and signed permission from the author.
Chapter 1: Introduction
Hive is developed on top of Hadoop. It is a data warehouse framework for querying and analyzing data stored in HDFS. Hive is open-source software that lets programmers analyze large data sets on Hadoop.
What is Hive?
Hive is an ETL and data warehousing tool developed on top of the Hadoop Distributed File System (HDFS). Hive simplifies operations such as:
Data encapsulation
Ad-hoc queries
Analysis of huge datasets
Important characteristics of Hive
In Hive, tables and databases are created first and then data is loaded into these tables.
As a data warehouse, Hive is designed for managing and querying only structured data that is stored in tables.
When dealing with structured data, MapReduce lacks optimization and usability features such as UDFs, but the Hive framework provides them. Query optimization refers to executing a query in an effective way in terms of performance.
Hive's SQL-inspired language shields the user from the complexity of MapReduce programming. For ease of learning, it reuses familiar concepts from the relational database world, such as tables, rows, columns and schemas.
Hadoop's programming works on flat files, so Hive can use directory structures to partition data and improve performance on certain queries.
An important component of Hive is the Metastore, which stores schema information. The Metastore typically resides in a relational database. We can interact with Hive using methods such as:
Web GUI
Java Database Connectivity (JDBC) interface
Most interactions tend to take place over a command line interface (CLI). Hive provides a CLI for writing Hive queries in Hive Query Language (HQL).
Generally, HQL syntax is similar to the SQL syntax that most data analysts are familiar with. The sample query below displays all the records present in the named table.
Sample query : Select * from
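As a minimal sketch of how such a query might look in full, assume a hypothetical table named employees with columns name and salary (these names are illustrative and do not come from the text). The first statement returns every record; the second narrows the result with a filter and a row limit.

```sql
-- `employees`, `name`, and `salary` are hypothetical names used for illustration.

-- Display all records in the table.
SELECT * FROM employees;

-- Display only selected columns and rows; LIMIT caps the number of rows returned.
SELECT name, salary
FROM employees
WHERE salary > 50000
LIMIT 10;
```

Either statement can be typed at the Hive CLI prompt; each statement ends with a semicolon.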
Hive supports several file formats, including TEXTFILE, SEQUENCEFILE, ORC and RCFILE (Record Columnar File).
For single-user metadata storage, Hive uses the embedded Derby database; for multi-user or shared metadata, Hive uses an external database such as MySQL.
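The file format is chosen per table with the STORED AS clause. The sketch below assumes two hypothetical tables, logs_text and logs_orc, whose names and columns are invented for illustration. The second table also shows the directory-based partitioning mentioned among the characteristics above: each value of the partition column gets its own directory under the table's location.

```sql
-- Hypothetical tables; all names and columns are assumptions for illustration.

-- Plain-text table with comma-separated fields.
CREATE TABLE logs_text (
  user_id BIGINT,
  url     STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Columnar ORC table, partitioned by date.
-- Each partition lives in its own directory, e.g. .../logs_orc/view_date=2021-01-15/
CREATE TABLE logs_orc (
  user_id BIGINT,
  url     STRING
)
PARTITIONED BY (view_date STRING)
STORED AS ORC;

-- Filtering on the partition column lets Hive read only the matching directory.
SELECT url, COUNT(*) AS hits
FROM logs_orc
WHERE view_date = '2021-01-15'
GROUP BY url;
```

Replacing ORC with SEQUENCEFILE or RCFILE in the STORED AS clause selects those formats instead.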