
About Big Data Hadoop Training

The Big Data Hadoop Training course helps you learn the core techniques and concepts
of Big Data and the Hadoop ecosystem. Learn the architecture and concepts of HDFS,
Hadoop cluster setup, MapReduce, Hadoop 2.x, Flume, Sqoop, Pig, Hive, HBase,
Oozie, etc., to become a leader in this field.

Course Objectives

Lead you on the path to becoming a certified Hadoop professional

Prepare to lead the Hadoop movement and become employable in the Big Data industry

Learn how to develop game-changing Hadoop applications

Understand the Hadoop architecture

Master the concepts of the Hadoop Distributed File System (HDFS)

Learn advanced features in Hadoop 2.x: HDFS Federation and NameNode High Availability

Set up Hadoop on a single-node cluster

Master the concepts of the MapReduce framework

Program in MapReduce

Learn advanced MapReduce concepts and internal dataflow

Gain hands-on program development experience in MapReduce using Java

Master the concepts of YARN

Perform data analytics using Pig and Hive

Understand data loading techniques using Sqoop and Flume

Implement HBase, MapReduce integration, advanced usage, and advanced indexing

Gain a good understanding of the ZooKeeper service

Implement best practices for Hadoop development and debugging

Design and develop applications involving large data sets using the Hadoop ecosystem
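To give a flavor of the MapReduce programming covered in the course, here is a minimal word-count sketch in plain Java. It is not Hadoop code (no Hadoop APIs, no cluster); the class and method names are illustrative, and it only simulates the map and shuffle/reduce phases on in-memory data.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WordCountSketch {
    // "Map" phase: emit a (word, 1) pair for every word in a line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) out.add(new SimpleEntry<>(word, 1));
        }
        return out;
    }

    // "Shuffle + reduce" phase: group pairs by key and sum the counts.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new HashMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            counts.merge(p.getKey(), p.getValue(), Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Each "line" stands in for an input split processed by a mapper.
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : new String[] {"big data big hadoop", "hadoop big"}) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> counts = reduce(pairs);
        System.out.println(counts.get("big"));    // 3
        System.out.println(counts.get("hadoop")); // 2
    }
}
```

In real Hadoop MapReduce the same logic is expressed as a `Mapper` and a `Reducer` class, and the framework handles the grouping, sorting, and distribution across the cluster.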

Recommended Audience

Experienced professionals aspiring to build a career in Big Data analytics using the Hadoop framework

Developers, programmers, ETL developers, data warehouse developers, and testing professionals

Team leads, project managers, general managers, architects, delivery heads, and CXOs

Other professionals looking to make a shift to Big Data technologies and future-proof their careers

Pre-requisites

Basic programming skills in Core Java. We will provide you with a complimentary self-paced Java course, which will help you learn the Java concepts required to write MapReduce jobs.

Basic knowledge of Linux commands. We will provide you with Linux material, which will help you with the setup and configuration of Hadoop.

Good analytical skills to grasp and implement the concepts of Big Data and Hadoop.

Why learn Hadoop?


We create 2.5 quintillion bytes of data every day, so much that 90% of the data in
the world today has been created in the last two years alone (source: IBM). These
extremely large datasets are hard to handle with legacy systems such as RDBMSs,
as the data exceeds the storage and processing capacity of the database. These
legacy systems are becoming obsolete.
According to Gartner, Big Data is the new oil. Big Data is all about finding the needle
of value in a haystack of structured, semi-structured, and unstructured data.
Hadoop, widely seen as the answer to Big Data problems, has become the most
important component in the data stack, enabling rapid processing of data at petabyte
scale. Hadoop is expected to be at the core of more than half of all analytics
software within the next two years.

Grab the Opportunity


The demand for Big Data developers is outpacing the global supply as Hadoop
becomes the predominant platform for data management and analytics.
According to McKinsey, the United States alone faces a shortage of 140,000 to
190,000 people with analytical expertise and 1.5 million managers and analysts with
the skills to understand and make decisions based on the analysis of Big Data.
Gartner says Big Data will create over 4 million jobs by 2015.
GE says Big Data is the future of industry.
IDC says the Big Data market will grow six times faster than the overall IT market.
Experfy says Hadoop will grow at a CAGR of 58% and reach $50 billion by 2020.

Course Description
The Hadoop Big Data Training course helps you learn the core techniques and concepts of Big Data
and the Hadoop ecosystem. It equips you with in-depth knowledge of writing code using the
MapReduce framework and managing large data sets with HBase. The topics covered in this
course mainly include Hive, Pig, and the setup of a Hadoop cluster.

Course Outcomes

Understand Big Data and Hadoop ecosystem

Work with Hadoop Distributed File System (HDFS)

Write MapReduce programs and implement HBase

Write Hive and Pig scripts

Pre-requisites

Knowledge of programming in C++, Java, or another object-oriented programming language is preferred.
Hardware/Software Requirements:

64-bit or 64-bit-ready PC/laptop (Intel Core 2 Duo or above)

8 GB RAM

80 GB HDD

Grab the opportunity


Learning Hadoop gives you the opportunity to build your career in the field of Big Data, either as
a Hadoop Administrator or a Hadoop Developer.

Course Outline

Module-I

Hadoop on Ubuntu: Installation

Hadoop: Why Hadoop, scaling, the distributed framework, Hadoop vs. RDBMS, a brief
history of Hadoop, problems with traditional large-scale systems, requirements for
a new approach, anatomy of a Hadoop cluster, other Hadoop ecosystem components

Setup Hadoop: Pseudo mode, installation of Java and Hadoop, Hadoop configuration,
Hadoop processes (NN, SNN, JT, DN, TT), temporary directory, UI, common errors
when running a Hadoop cluster and their solutions

Module-II

HDFS (Hadoop Distributed File System): HDFS design and architecture, HDFS concepts,
interacting with HDFS using the command line, dataflow, blocks, replicas

Hadoop Processes: NameNode, Secondary NameNode, JobTracker, TaskTracker, DataNode
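The HDFS block concept listed above has simple arithmetic behind it: a file is split into fixed-size blocks, and each block is replicated across DataNodes. The sketch below is illustrative (not Hadoop API code) and assumes the Hadoop 2.x defaults of a 128 MB block size and a replication factor of 3.

```java
public class HdfsBlockMath {
    // Number of HDFS blocks needed for a file: ceil(fileSize / blockSize).
    static long blockCount(long fileSizeBytes, long blockSizeBytes) {
        return (fileSizeBytes + blockSizeBytes - 1) / blockSizeBytes;
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024; // Hadoop 2.x default: 128 MB
        long fileSize  = 300L * 1024 * 1024; // a 300 MB file

        // 300 MB splits into 3 blocks (128 + 128 + 44 MB); the last block
        // only occupies the space it needs, it is not padded to 128 MB.
        System.out.println(blockCount(fileSize, blockSize));     // prints 3

        // With the default replication factor of 3, the cluster stores
        // 3 replicas of each block, i.e. 9 block replicas in total.
        System.out.println(blockCount(fileSize, blockSize) * 3); // prints 9
    }
}
```

The NameNode tracks this block-to-DataNode mapping in memory, which is why HDFS favors a modest number of large files over millions of tiny ones.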

Module-III

MapReduce: Developing a MapReduce application, phases in the MapReduce framework,
MapReduce input and output formats, advanced concepts, sample applications, Combiner

Writing a MapReduce Program: The MapReduce flow, examining a sample MapReduce
program, basic MapReduce API concepts, driver code, Mapper, Reducer, Hadoop's
streaming API, using Eclipse for rapid development, hands-on exercise

Common MapReduce Algorithms: Sorting and searching, indexing, word co-occurrence,
hands-on exercise

Writing Advanced MapReduce Programs: Building multi-value Writable data, accessing
and using counters, Partitioner and HashPartitioner, hands-on exercises
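The HashPartitioner covered in this module decides which reducer receives each key. Its core arithmetic can be sketched in a few lines of plain Java; the class below is a standalone illustration (not Hadoop's actual `Partitioner` class), though the formula mirrors the default hash-based routing: mask off the sign bit of the key's hash, then take it modulo the number of reduce tasks.

```java
public class PartitionerSketch {
    // Hash-based routing: key.hashCode() masked to non-negative,
    // then modulo the number of reducers.
    static int getPartition(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int reducers = 4;
        for (String key : new String[] {"hadoop", "hive", "pig", "hbase"}) {
            System.out.println(key + " -> reducer " + getPartition(key, reducers));
        }
        // A given key always hashes to the same reducer, so every value
        // emitted for that key meets in a single reduce() call.
    }
}
```

This determinism is what makes the shuffle phase work: grouping by key is guaranteed because no key is ever split across two reducers.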

Module-IV

Hadoop Programming Languages

HIVE: Introduction, installation, configuration, interacting with HDFS using Hive,
MapReduce programs through Hive, Hive commands, loading, filtering, grouping, data
types, operators, joins, groups, sample programs in Hive

PIG: Basics, configuration, commands, loading, filtering, grouping, data types,
operators, joins, groups, sample programs in Pig

HBase: What is HBase, HBase architecture, the HBase API, managing large data sets
with HBase, using HBase in Hadoop applications

Module-V

Integrating Hadoop into the Enterprise Workflow: Loading data from an RDBMS into
HDFS using Sqoop, managing real-time data using Flume
