Course Contents of Hadoop and Big Data

1. Introduction to Hadoop & Big Data

Introduction to Hadoop

Introduction to Big Data

Hadoop Ecosystem - Concepts

Hadoop Map Reduce Concepts and Features

Developing the Map Reduce Applications

PIG Concepts

HIVE Concepts

Flume concepts

HUE Concepts

Hbase concepts

Real Life Use Cases

How Hadoop can solve problems associated with traditional large-scale systems

Other Open Source Software related to Hadoop

In-depth knowledge of how Big Data solutions work in the cloud

How to create your own Hadoop Cluster

2. Hadoop Architecture

Understand the main Hadoop Components

Set up Hadoop

Pseudo Mode

Cluster Mode

IPv6

Installation of Java, Hadoop

Configuration of Hadoop

Hadoop Processes

Name Node, Secondary Name Node

Job Tracker, Task Tracker

Data Node

HDFS - Hadoop Distributed File System

Learn How HDFS Works

HDFS Design and Architecture

HDFS Concepts

Interacting with HDFS using the command line

Interacting with HDFS using Java APIs

Dataflow

Blocks

Replica

List data access patterns for which HDFS is designed

Learn how data is stored in an HDFS cluster

Learn HDFS commands

Linux Basics, Installation and Commands

3. Querying Data

An overview of Pig, Hive and JAQL

Working with Pig

Working with Hive

Working with JAQL

Working with Pig, Hive and JAQL Transcript

Querying Data with Pig, Hive and JAQL

4. Introduction to MapReduce

Understand the concepts of map and reduce operations

Developing MapReduce Applications

Describe how Hadoop executes a MapReduce job

Phases in the MapReduce Framework

MapReduce Input and Output Formats

Advanced Concepts

Sample Application

Combiner

Joining Datasets in MapReduce Jobs

Map-Side Join

Reduce-Side Join

MapReduce customization

Custom Input Format class

Hash Partitioner

Custom Partitioner

Sorting Technique

Custom Output Format class

Writing a MapReduce program

The MapReduce Flow

Examining a sample MapReduce program

Basic MapReduce API concepts

The Mapper

The Reducer

Hadoop Streaming API

Using Eclipse for rapid development

Hands-on Exercise

Common MapReduce Algorithms

Sorting and Searching

Indexing

Machine Learning with Mahout

Term Frequency

List MapReduce Fundamental Data Types

Explain a MapReduce Data flow

List MapReduce fault tolerance and scheduling features

5. HIVE

Introduction to Hive

Installation and Configuration

Interacting with HDFS using Hive

Map Reduce Programs through Hive

Hive Commands

Loading, Filtering, Grouping

Data Types, Operators

Joins, Groups

Sample Program in Hive

Hive Query Language

Alter and Delete in Hive

Partition in Hive

Indexing

Joins in Hive, Unions in Hive

Authentication and Authorization

Statistics with Hive

Archiving in Hive

Hands on exercise

6. PIG

Introduction to PIG

Installation and Configuration

Commands

Data Loading in PIG

Data Extraction in PIG

Data Transformation in PIG

Hands on Exercise on PIG

7. Shifting Data into Hadoop

Understand how to transfer data into Hadoop using Flume

Introduction to Flume

Introduction to Flume Transcript

Working with Flume

Flume mode of operation and configuration

8. Working with Sqoop

Introduction to Sqoop

Import Data

Export Data

Sqoop Syntax

Database connections

Hands on Exercise

9. Working with Flume

Introduction to Flume

Configuration and Setup

Flume sink with example

Channel

Flume source with example

Complex Flume Architecture

IMPALA Concepts

HUE Concepts

OOZIE Concepts

10. Graphs Techniques used in Hadoop

Course Content:
Course Objective Summary
During this course, you will learn:

Introduction to Big Data and Analytics


Introduction to Hadoop
Hadoop ecosystem - Concepts
Hadoop Map-reduce concepts and features
Developing the map-reduce Applications
Pig concepts
Hive concepts
Sqoop concepts
Flume Concepts
Oozie workflow concepts
Impala Concepts
Hue Concepts
HBASE Concepts
ZooKeeper Concepts
Real Life Use Cases

Reporting Tool

Tableau

1. VirtualBox/VMware

Basics
Installations
Backups
Snapshots

2. Linux
Basics
Installations
Commands

3. Hadoop

Why Hadoop?
Scaling
Distributed Framework
Hadoop vs. RDBMS
Brief history of Hadoop

4. Set up Hadoop

Pseudo mode
Cluster mode
IPv6
SSH
Installation of Java, Hadoop
Configuration of Hadoop
Hadoop Processes (NN, SNN, JT, DN, TT)
Temporary directory
UI
Common errors when running a Hadoop cluster, and their solutions

5. HDFS - Hadoop Distributed File System

HDFS Design and Architecture


HDFS Concepts
Interacting with HDFS using the command line
Interacting with HDFS using Java APIs
Dataflow
Blocks
Replica

6. Hadoop Processes

Name node
Secondary name node
Job tracker
Task tracker
Data node

7. Map Reduce

Developing Map Reduce Application


Phases in Map Reduce Framework
Map Reduce Input and Output Formats
Advanced Concepts
Sample Applications
Combiner

8. Joining datasets in MapReduce jobs


Map-side join
Reduce-side join

9. MapReduce customization

Custom Input format class


Hash Partitioner
Custom Partitioner
Sorting techniques
Custom Output format class

10. Hadoop Programming Languages: I. HIVE

Introduction
Installation and Configuration
Interacting with HDFS using HIVE
MapReduce Programs through HIVE
HIVE Commands
Loading, Filtering, Grouping
Data types, Operators
Joins, Groups
Sample programs in HIVE

II. PIG
Basics
Installation and Configurations
Commands

OVERVIEW HADOOP DEVELOPER


11. Introduction
12. The Motivation for Hadoop
Problems with traditional large-scale systems
Requirements for a new approach

13. Hadoop: Basic Concepts

An Overview of Hadoop
The Hadoop Distributed File System
Hands-On Exercise
How MapReduce Works
Hands-On Exercise
Anatomy of a Hadoop Cluster
Other Hadoop Ecosystem Components

14. Writing a MapReduce Program

The MapReduce Flow


Examining a Sample MapReduce Program
Basic MapReduce API Concepts
The Driver Code
The Mapper
The Reducer
Hadoop's Streaming API
Using Eclipse for Rapid Development
Hands-on exercise
The New MapReduce API

15. Common MapReduce Algorithms

Sorting and Searching


Indexing
Machine Learning With Mahout
Term Frequency-Inverse Document Frequency (TF-IDF)
Word Co-Occurrence
Hands-On Exercise

16. PIG Concepts

Data loading in PIG.


Data Extraction in PIG.
Data Transformation in PIG.
Hands on exercise on PIG.

17. Hive Concepts.

Hive Query Language.


Alter and Delete in Hive.
Partition in Hive.
Indexing.
Joins in Hive. Unions in Hive.
Industry-specific configuration of Hive parameters.
Authentication & Authorization.
Statistics with Hive.
Archiving in Hive.
Hands-on exercise

18. Working with Sqoop


Introduction.
Import Data.

Export Data.
Sqoop Syntax.
Database connections.
Hands-on exercise

19. Working with Flume

Introduction.
Configuration and Setup.
Flume Sink with example.
Channel.
Flume Source with example.
Complex Flume architecture.

20. OOZIE Concepts
21. IMPALA Concepts
22. HUE Concepts
23. HBASE Concepts
24. ZooKeeper Concepts

Reporting Tool
Tableau
This course is designed for the beginner to intermediate-level Tableau user. It is for anyone who works
with data regardless of technical or analytical background. This course is designed to help you
understand the important concepts and techniques used in Tableau to move from simple to complex
visualizations and learn how to combine them in interactive dashboards.

Course Topics
Overview
What is visual analysis?
Strengths/weaknesses of the visual system.

Laying the Groundwork for Visual Analysis


Analytical Process
Preparing for analysis

Getting, Cleaning and Classifying Your Data


Cleaning, formatting and reshaping.
Using additional data to support your analysis.
Data classification

Visual Mapping Techniques


Visual Variables: Basic Units of Data Visualization
Working with Color
Marks in action: Common chart types

Solving Real-World Problems with Visual Analysis

Getting a Feel for the Data: Exploratory Analysis.


Making comparisons
Looking at (co-)Relationships.
Checking progress.
Spatial Relationships.
Try, try again.

Communicating Your Findings


Fine-tuning for more effective visualization
Storytelling and guided analytics
Dashboards
