https://developers.google.com/hadoop/images/hadoop-elephant.png
"The name my kid gave a stuffed yellow elephant. Short, relatively easy to
spell and pronounce, meaningless and not used elsewhere: those are my
naming criteria. Kids are good at generating such. Googol is a kid's term."
Outline
First Day:
Session I:
Session II:
HDFS
Session III:
YARN
Session IV:
MapReduce
Outline #2
Second Day:
Session I:
Session II:
Session III:
Session IV:
Outline #3
Third Day:
Session I:
Session II:
Hadoop cluster installation and configuration
Session III:
Session IV:
Case Studies
Introduction
http://news.nost.org.cn/wp-content/uploads/2013/06/BigDataBigBuildings.jpg
http://www.atkearney.com.tr/documents/10192/698536/FG-Big-Data-and-the-Creative-Destruction-of-Todays-Business-Models-1.png
Foundations of Big Data
http://blog.softwareinsider.org/wp-content/uploads/2012/02/Screen-shot-2012-02-27-at-11.18.56-AM.png
What is Hadoop?
Apache project for storing and processing large data sets
Components:
core:
noncore:
Virtualization vs clustering
Hadoop distributions
Apache Hadoop, http://hadoop.apache.org/
Amazon Elastic MapReduce, https://aws.amazon.com/elasticmapreduce/
HDInsight, https://azure.microsoft.com/en-us/services/hdinsight/
http://cdn2.hadoopwizard.com/wp-content/uploads/2013/01/hadoopwizard_companies_by_node_size600.png
Big Data future:
search engines
scientific analysis
trends prediction
business intelligence
artificial intelligence
References
Books:
Publications:
http://research.google.com/pubs/papers.html
http://ieeexplore.ieee.org/Xplore/home.jsp
Websites:
Cloudera
website: http://www.cloudera.com/content/cloudera/en/home.html
Internet
Hadoop cluster
VirtualBox
VirtualBox 64-bit version
GCE
GUI: https://console.cloud.google.com/compute/
Project ID: check after logging in, in the "My First Project" tab
HDFS
http://munnamark.files.wordpress.com/2012/03/petabyte1.jpg
Fault tolerance:
data replication
Scalability:
commodity hardware
horizontal scalability
Resilience:
auto-healing feature
Design
FUSE (Filesystem in USErspace)
Non POSIX-compliant filesystem (http://standards.ieee.org/findstds/standard/1003.1-2008.html)
Block abstraction:
block size: 64 MB
Benefits:
data durability
Daemons
Datanode:
Namenode:
Secondary namenode:
Data
Stored in a form of blocks on datanodes
Metadata
Stored in a form of files on the namenode:
Metadata checkpointing
When does it occur?
every hour
Read path
1. A client opens a file by calling the "open()" method on the "FileSystem" object
6. The client repeats steps 3-5 for the next block, or steps 2-5 for the next batch of blocks, or closes the file when copied
Read path #2
2. The client calls the namenode to create the file with no blocks in the filesystem namespace
8. The client repeats steps 4-5 for the next blocks, or steps 3-5 for the next batch of blocks, or closes the file when written
Write path #2
Fencing methods
Journal Node
Relatively lightweight
Namenode Federation
Namenode memory limitations
Horizontal scalability
Filesystem partitioning
ViewFS
Federation of HA pairs
Namenode Federation #2
Interfaces
Underlying interface:
Additional interfaces:
CLI
C interface
HTTP interface
FUSE
URIs
[path] - local HDFS filesystem
http://[namenode]/[path] - HTTP
viewfs://[path] - ViewFS
Java interface
Reminder: Hadoop is written in Java
Reading data:
Writing data:
by appending to files
Command-line interface
Shell-like commands
Base commands
hdfs dfs -mkdir [URI] - creates a directory
hdfs dfs -cp [source URI] [destination URI] - copies the file or the directory
hdfs dfs -mv [source URI] [destination URI] - moves the file or the directory
hdfs dfs -du [URI] - displays the size of the file or the directory
Base commands #2
hdfs dfs -chmod [mode] [URI] - changes the permissions of the file or the directory
hdfs dfs -chown [owner] [URI] - changes the owner of the file or the directory
hdfs dfs -chgrp [group] [URI] - changes the owner group of the file or the directory
hdfs dfs -put [source path] [destination URI] - uploads the data into HDFS
HTTP interface
HDFS HTTP ports:
50070 - namenode
50075 - datanode
Access methods:
direct
via proxy
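For direct access, the namenode's HTTP port also exposes the WebHDFS REST API; a sketch (the host name and HDFS path are placeholders, and WebHDFS must be enabled on the cluster):

```shell
# List a directory through WebHDFS on the namenode HTTP port (50070)
curl -i "http://namenode.hadooplab:50070/webhdfs/v1/user/hadoop?op=LISTSTATUS"
```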
FUSE
Reminder:
Fuse-DFS module
Quotas
Count quotas vs space quotas:
Setting quotas:
Removing quotas:
Viewing quotas:
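The quota operations above map onto the dfsadmin and dfs tools; a sketch against a hypothetical /user/hadoop directory:

```shell
hdfs dfsadmin -setQuota 1000 /user/hadoop       # set a count quota (files + directories)
hdfs dfsadmin -setSpaceQuota 10g /user/hadoop   # set a space quota (raw bytes)
hdfs dfs -count -q /user/hadoop                 # view current quotas and usage
hdfs dfsadmin -clrQuota /user/hadoop            # remove the count quota
hdfs dfsadmin -clrSpaceQuota /user/hadoop       # remove the space quota
```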
Limitations
Low-latency data access:
tens of milliseconds
single writer
append operations only
YARN
http://th00.deviantart.net/fs71/PRE/f/2013/185/c/9/spinning_yarn_by_chaosfissure-d6byoqu.jpg
Legacy architecture
Historically, there was only one data processing paradigm for Hadoop - MapReduce
Scalability:
Flexibility:
Daemons
ResourceManager:
NodeManager:
JobHistoryServer:
Design
http://hadoop.apache.org/docs/r2.3.0/hadoop-yarn/hadoop-yarn-site/YARN.html
2. The client calculates input splits and writes the job resources (e.g. the jar file) into HDFS
5. The application master process initializes the job and retrieves job resources from HDFS
11. The client periodically polls the application master process for progress and status updates
http://pravinchavan.files.wordpress.com/2013/04/yarn-2.png
STONITH
Fencing methods
Resources configuration
Allocated resources:
physical memory
virtual cores
yarn.nodemanager.resource.memory-mb - available memory in megabytes
yarn.scheduler.minimum-allocation-vcores - minimum number of allocated virtual cores
yarn.scheduler.increment-allocation-mb - memory increment in megabytes to which the request is rounded up
yarn.scheduler.increment-allocation-vcores - virtual core increment to which the request is rounded up
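Under these settings a request is first raised to the scheduler minimum and then rounded up to the nearest increment. A quick sketch of the arithmetic (the request size and limits are assumed example values):

```shell
# Assumed example: a 3000 MB request, minimum 2048 MB, increment 512 MB
req_mb=3000
min_mb=2048
inc_mb=512
if [ "$req_mb" -lt "$min_mb" ]; then req_mb=$min_mb; fi
alloc_mb=$(( (req_mb + inc_mb - 1) / inc_mb * inc_mb ))
echo "$alloc_mb"   # prints 3072
```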
MapReduce
http://cacm.acm.org/system/assets/0000/2020/121609_CACMpg73_MapReduce_A_Flexible.large.jpg
Data locality:
Fault tolerance:
jobs high availability
tasks distribution
Scalability:
horizontal scalability
Design
Fully automated parallel data processing paradigm
Job components:
map tasks
reduce tasks
Benefits:
simplicity of development
MapReduce by analogy
Lab Exercise 1.4.1
Sample data
http://academic.udayton.edu/kissock/http/Weather/gsod95-current/allsites.zip
1 1 1995 82.4
1 2 1995 75.1
1 3 1995 73.7
1 4 1995 77.1
1 5 1995 79.5
1 6 1995 71.3
1 7 1995 71.4
1 8 1995 75.2
1 9 1995 66.3
1 10 1995 61.8
#!/usr/bin/perl
# Map function: extract a (year, temperature) pair from each input record
use strict;
my %results;
while (<>)
{
next unless $_ =~ m/.*(\d{4}).*\s(\d+\.\d+)/;
push (@{$results{$1}}, $2);
}
foreach my $year (sort keys %results)
{
print "$year: ";
foreach my $temperature (@{$results{$year}})
{
print "$temperature ";
}
print "\n";
}
Function output:
#!/usr/bin/perl
# Reduce function: pick the maximum temperature for each year
use List::Util qw(max);
use strict;
my %results;
while (<>)
{
next unless $_ =~ m/(\d{4})\:\s(.*)/;
my @temperatures = split(' ', $2);
@{$results{$1}} = @temperatures;
}
foreach my $year (sort keys %results)
{
my $temperature = max(@{$results{$year}});
print "$year: $temperature\n";
}
Function output:
year: temperature
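The whole flow can be tried locally by chaining the two phases through a sort. Here awk stands in for the Perl map and reduce functions and sort plays the shuffle (a few sample records are inlined for illustration):

```shell
# map: emit (year, temperature); shuffle: sort by key; reduce: max per year
printf '1 1 1995 82.4\n1 2 1995 75.1\n1 9 1995 66.3\n' |
awk '{ print $3, $4 }' |
sort |
awk '!($1 in max) || $2+0 > max[$1]+0 { max[$1] = $2 }
     END { for (y in max) print y ": " max[y] }'
# prints "1995: 82.4"
```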
http://blog.cloudera.com/wp-content/uploads/2014/03/ssd1.png
Map task in details
Map function:
Combine function
Executed by the map task
Java MapReduce
MapReduce jobs are written in Java
org.apache.hadoop.mapreduce namespace:
hadoop [MapReduce program path] [input data path] [output data path]
Streaming
Uses the hadoop-streaming.jar library
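A streaming job that plugs in the Perl map and reduce functions could be submitted along these lines (the jar location and the map.pl/reduce.pl script names are assumptions; adjust to your distribution):

```shell
hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar \
-input [input data path] \
-output [output data path] \
-mapper map.pl \
-reducer reduce.pl \
-file map.pl \
-file reduce.pl
```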
Pipes
Supports all languages capable of reading from and writing to network sockets
hadoop pipes \
-D hadoop.pipes.java.recordreader=true \
-D hadoop.pipes.java.recordwriter=true \
-input [input data path] \
-output [output data path] \
-program [MapReduce program path] \
-file [attached files path]
Limitations
If one woman can have a baby in nine months, nine women should be able to have a baby in one month.
Simplicity:
Too low-level:
http://www.cpusage.com/blog/wp-content/uploads/2013/11/distributed-computing-1.jpg
http://news.nost.org.cn/wp-content/uploads/2013/06/BigDataBigBuildings.jpg
Hadoop ecosystem
http://www.imaso.co.kr/data/article/1(2542).jpg
Simplicity of development:
Simplicity of execution:
MapReduce programs require compiling, packaging and submitting
Pig - Design
Pig components:
and more
$n - n-th column
x + y / x - y - addition / subtraction
x * y / x / y - multiplication / division
x is null - is null
x or y - logical or
Pig - types
int - 32-bit signed integer
Pig - Exercises
Lab Exercise 2.3.1
Data structurization:
SQL interface:
Hive - Design
Metastore:
RDBMS database:
embedded: Derby
local: MySQL, PostgreSQL, etc.
HiveQL:
Interfaces:
CLI
Web GUI
ODBC
JDBC
Updates: INSERT OVERWRITE TABLE (Hive) vs UPDATE, INSERT, DELETE (RDBMS)
Create table as select: partially supported (Hive) vs supported (RDBMS)
Hive - Exercises
Lab Exercise 2.3.2
Sqoop - Design
JDBC - used to connect to the RDBMS
Sqoop - Exercises
Lab Exercise 2.3.3
Developed by Apache
Interactive queries:
Data structurization:
HBase - Design
Runs on top of HDFS
Columnar storage
HBase - Architecture
Tables are distributed across the cluster
Table rows are sorted by the row key which is the table's primary key
Table columns can be added on the fly if the column family exists
HBase - Exercises
Lab Exercise 2.3.4
Data queueing:
When writing directly to HDFS, data can be lost during spike periods
Flume - Design
What is Flume? Flume is a frontend to HDFS
Channel - a transient store for the events, which are delivered by the sources and removed by the sinks
http://archive.cloudera.com/cdh/3/flume-ng-1.1.0-cdh3u4/images/UserGuide_image00.png
https://flume.apache.org/FlumeUserGuide.html#flume-sources
Sink types:
Flow pipeline:
Oozie - Design
What is Oozie? Oozie is a frontend to Hadoop jobs
Oozie workflow
action nodes:
http://www.datacenterjournal.com/wp-content/uploads/2012/11/big-data1-612x300.jpg
Hue - Goals and motivation
Hadoop cluster management:
Open Source:
Hue - Design
What is Hue? Hue is a frontend to Hadoop components
Hue components:
Supported browsers:
Google Chrome
Internet Explorer 9+
Safari 5+
Hue - Design #2
http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/4.3.0/Hue-2-User-Guide/images/huearch.jpg
Developed by Cloudera
Interactive queries:
Data structurization:
SQL interface:
Impala - Design
Runs on top of HDFS or HBase
Columnar storage
Interfaces:
CLI
web GUI
ODBC
JDBC
Impala - Daemons
Impala Daemon:
Impala Statestore:
Job types:
Tez - Design
http://images.cnitblog.com/blog/312753/201310/19114512-89b2001bf0444863b5c11da4d5100401.png
Spark
Fast and general engine for large-scale data processing.
NoSQL
"Not only SQL" database"
examples:
MongoDB, https://www.mongodb.org/
Redis, http://redis.io/
Couchbase, http://www.couchbase.com/
http://news.fincit.com/wp-content/uploads/2012/07/tangle.jpg
Picking up a distribution
Considerations:
license cost
Hadoop version
available features
supported OS types
ease of deployment
integrity
support
Hadoop releases
http://drcos.boudnik.org/
Why CDH?
Free
Open source
Easy to deploy
Hardware selection
Considerations:
CPU
RAM
storage IO
storage capacity
storage resilience
network bandwidth
network resilience
hardware resilience
Hardware selection #2
Master nodes:
namenodes
resource managers
secondary namenode
Worker nodes
Network devices
Supporting nodes
Monitoring system
Security system
Logging system
Network topologies
Hadoop East/West traffic flow design
Network topologies:
Leaf
Spine
Leaf topology
E. Sammer. "Hadoop Operations"
Spine topology
SANs considerations:
Virtualization considerations:
Hadoop directories
Directory                     Linux path                    Grows?
Namenode metadata directory   defined by an administrator   yes
Datanode data directory       defined by an administrator   yes
MapReduce local directory     defined by an administrator   no
Filesystems
LVM (Logical Volume Manager) - HDFS killer
Considerations:
extent-based design
burn-in time
Winners:
EXT4
XFS
http://www.michael-noll.com/blog/uploads/Yahoo-hadoop-cluster_OSCON_2007.jpeg
Underpinning software
Prerequisites:
Additional software:
Cron daemon
NTP client
SSH server
RSYNC tool
NMAP tool
Hadoop installation
Should be performed as the root user
Installation types:
manual installation
automated installation
Installation sources:
packages
tarballs
source code
/usr/bin - executables
/usr/lib - C libraries
CDH-specific additions
Binaries for Hadoop Ecosystem
/etc/alternatives/hadoop-conf
/etc/security/limits.d
Hadoop configuration
hadoop-env.sh - environment variables
Hadoop configuration
Parameter: fs.defaultFS
<property>
<name>fs.defaultFS</name>
<value>hdfs://namenode.hadooplab:8020</value>
</property>
Hadoop configuration
Parameter: io.file.buffer.size
Guideline: 64 KB
Format: integer
Example:
<property>
<name>io.file.buffer.size</name>
<value>65536</value>
</property>
Hadoop configuration
Parameter: fs.trash.interval
Purpose: specifies the amount of time (in minutes) the file is retained in the .Trash directory
Format: integer
Example:
<property>
<name>fs.trash.interval</name>
<value>60</value>
</property>
Hadoop configuration
Parameter: ha.zookeeper.quorum
Used by: JT
Example:
<property>
<name>ha.zookeeper.quorum</name>
<value>jobtracker1.hadooplab:2181,jobtracker2.hadooplab:2181</value>
</property>
Hadoop configuration
Parameter: dfs.name.dir
Used by: NN
Example:
<property>
<name>dfs.name.dir</name>
<value>file:///namenodedir,file:///mnt/nfs/namenodedir</value>
</property>
Hadoop configuration
Parameter: dfs.data.dir
Used by: DN
Example:
<property>
<name>dfs.data.dir</name>
<value>file:///datanodedir1,file:///datanodedir2</value>
</property>
Hadoop configuration
Parameter: fs.checkpoint.dir
Example:
<property>
<name>fs.checkpoint.dir</name>
<value>file:///secondarynamenodedir</value>
</property>
Hadoop configuration
Parameter: dfs.permissions.supergroup
Format: string
Example:
<property>
<name>dfs.permissions.supergroup</name>
<value>supergroup</value>
</property>
Hadoop configuration
Parameter: dfs.balance.bandwidthPerSec
Format: integer
Used by: DN
Example:
<property>
<name>dfs.balance.bandwidthPerSec</name>
<value>100000000</value>
</property>
Hadoop configuration
Parameter: dfs.blocksize
Guideline: 64 MB
Format: integer
Example:
<property>
<name>dfs.blocksize</name>
<value>67108864</value>
</property>
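The example value is simply 64 MB expressed in bytes:

```shell
echo $((64 * 1024 * 1024))   # prints 67108864
```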
Hadoop configuration
Parameter: dfs.datanode.du.reserved
Format: integer
Used by: DN
Example:
<property>
<name>dfs.datanode.du.reserved</name>
<value>10737418240</value>
</property>
Hadoop configuration
Parameter: dfs.namenode.handler.count
Used by: NN
Example:
<property>
<name>dfs.namenode.handler.count</name>
<value>105</value>
</property>
Hadoop configuration
Parameter: dfs.datanode.failed.volumes.tolerated
Format: integer
Used by: DN
Example:
<property>
<name>dfs.datanode.failed.volumes.tolerated</name>
<value>0</value>
</property>
Hadoop configuration
Parameter: dfs.hosts
Used by: NN
Example:
<property>
<name>dfs.hosts</name>
<value>/etc/hadoop/conf/dfs.hosts</value>
</property>
Hadoop configuration
Parameter: dfs.hosts.exclude
Used by: NN
Example:
<property>
<name>dfs.hosts.exclude</name>
<value>/etc/hadoop/conf/dfs.hosts.exclude</value>
</property>
Hadoop configuration
Parameter: dfs.nameservices
Example:
<property>
<name>dfs.nameservices</name>
<value>HAcluster1,HAcluster2</value>
</property>
Hadoop configuration
Parameter: dfs.ha.namenodes.[nameservice]
Example:
<property>
<name>dfs.ha.namenodes.HAcluster1</name>
<value>namenode1,namenode2</value>
</property>
Hadoop configuration
Parameter: dfs.namenode.rpc-address.[nameservice].[namenode]
Example:
<property>
<name>dfs.namenode.rpc-address.HAcluster1.namenode1</name>
<value>namenode1.hadooplab:8020</value>
</property>
Hadoop configuration
Parameter: dfs.namenode.http-address.[nameservice].[namenode]
Configuration file: hdfs-site.xml
Example:
<property>
<name>dfs.namenode.http-address.HAcluster1.namenode1</name>
<value>namenode1.hadooplab:50070</value>
</property>
Hadoop configuration
Parameter: topology.script.file.name
Used by: NN
Example:
<property>
<name>topology.script.file.name</name>
<value>/etc/hadoop/conf/topology.sh</value>
</property>
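The script referenced above can be as simple as a case statement mapping node addresses to rack names; a minimal sketch, assuming (hypothetically) that the 10.0.1.x and 10.0.2.x subnets correspond to two racks:

```shell
#!/bin/sh
# Hypothetical topology.sh: the namenode passes one or more node addresses
# as arguments and expects one rack path per line on stdout.
resolve_rack() {
  case "$1" in
    10.0.1.*) echo "/rack1" ;;
    10.0.2.*) echo "/rack2" ;;
    *)        echo "/default-rack" ;;
  esac
}
for node in "$@"; do
  resolve_rack "$node"
done
```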
Hadoop configuration
Parameter: dfs.namenode.shared.edits.dir
Example:
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://namenode1.hadooplab.com:8485/hadoop/hdfs/HAcluster</value>
</property>
Hadoop configuration
Parameter: dfs.client.failover.proxy.provider.[nameservice]
Purpose: specifies the class used to locate the active namenode for namenode HA purposes
Example:
<property>
<name>dfs.client.failover.proxy.provider.HAcluster1</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
Hadoop configuration
Parameter: dfs.ha.fencing.methods
Example:
<property>
<name>dfs.ha.fencing.methods</name>
<value>shell(/usr/local/bin/STONITH.sh)</value>
</property>
Hadoop configuration
Parameter: dfs.ha.automatic-failover.enabled
Example:
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
Hadoop configuration
Parameter: ha.zookeeper.quorum
Used by: NN
Example:
<property>
<name>ha.zookeeper.quorum</name>
<value>namenode1.hadooplab:2181,namenode2.hadooplab:2181</value>
</property>
Hadoop configuration
Parameter: fs.viewfs.mounttable.default.link.[path]
Used by: NN
Example:
<property>
<name>fs.viewfs.mounttable.default.link./federation1</name>
<value>hdfs://namenode.hadooplab:8020</value>
</property>
Hadoop configuration
Parameter: yarn.resourcemanager.hostname
Format: hostname:port
Example:
<property>
<name>yarn.resourcemanager.hostname</name>
<value>resourcemanager.hadooplab:8021</value>
</property>
Hadoop configuration
Parameter: yarn.nodemanager.local-dirs
Example:
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>file:///applocaldir</value>
</property>
Hadoop configuration
Parameter: yarn.nodemanager.resource.memory-mb
Format: integer
Used by: NM
Example:
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>40960</value>
</property>
Hadoop configuration
Parameter: yarn.nodemanager.resource.cpu-vcores
Configuration file: yarn-site.xml
Format: integer
Used by: NM
Example:
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>8</value>
</property>
Hadoop configuration
Parameter: yarn.scheduler.minimum-allocation-mb
Format: integer
Used by: NM
Example:
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>2048</value>
</property>
Hadoop configuration
Parameter: yarn.scheduler.minimum-allocation-vcores
Used by: NM
Example:
<property>
<name>yarn.scheduler.minimum-allocation-vcores</name>
<value>2</value>
</property>
Hadoop configuration
Parameter: yarn.scheduler.increment-allocation-mb
Format: integer
Used by: NM
Example:
<property>
<name>yarn.scheduler.increment-allocation-mb</name>
<value>512</value>
</property>
Hadoop configuration
Parameter: yarn.scheduler.increment-allocation-vcores
Format: integer
Used by: NM
Example:
<property>
<name>yarn.scheduler.increment-allocation-vcores</name>
<value>1</value>
</property>
Hadoop configuration
Parameter: mapreduce.task.io.sort.mb
Used by: NM
Example:
<property>
<name>mapreduce.task.io.sort.mb</name>
<value>128</value>
</property>
Hadoop configuration
Parameter: mapreduce.task.io.sort.factor
Format: integer
Used by: NM
Example:
<property>
<name>mapreduce.task.io.sort.factor</name>
<value>64</value>
</property>
Hadoop configuration
Parameter: mapreduce.map.output.compress
Used by: NM
Example:
<property>
<name>mapreduce.map.output.compress</name>
<value>true</value>
</property>
Hadoop configuration
Parameter: mapreduce.map.output.compress.codec
Purpose: specifies the codec used for map tasks output compression
Used by: NM
Example:
<property>
<name>mapreduce.map.output.compress.codec</name>
<value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>
Hadoop configuration
Parameter: mapreduce.output.fileoutputformat.compress.type
Used by: NM
Example:
<property>
<name>mapreduce.output.fileoutputformat.compress.type</name>
<value>BLOCK</value>
</property>
Hadoop configuration
Parameter: mapreduce.reduce.shuffle.parallelcopies
Format: integer
Used by: NM
Example:
<property>
<name>mapreduce.reduce.shuffle.parallelcopies</name>
<value>5</value>
</property>
Hadoop configuration
Parameter: mapreduce.job.reduces
Format: integer
Used by: RM
Example:
<property>
<name>mapreduce.job.reduces</name>
<value>1</value>
</property>
Hadoop configuration
Parameter: mapreduce.shuffle.max.threads
Format: integer
Used by: RM
Example:
<property>
<name>mapreduce.shuffle.max.threads</name>
<value>1</value>
</property>
Hadoop configuration
Parameter: mapreduce.job.reduce.slowstart.completedmaps
Format: float
Used by: RM
Example:
<property>
<name>mapreduce.job.reduce.slowstart.completedmaps</name>
<value>0.5</value>
</property>
Hadoop configuration
Parameter: yarn.resourcemanager.ha.rm-ids
Example:
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>resourcemanager1,resourcemanager2</value>
</property>
Hadoop configuration
Parameter: yarn.resourcemanager.ha.id
Format: string
Used by: RM
Example:
<property>
<name>yarn.resourcemanager.ha.id</name>
<value>resourcemanager1</value>
</property>
Hadoop configuration
Parameter: yarn.resourcemanager.address.[resourcemanager]
Example:
<property>
<name>yarn.resourcemanager.address.resourcemanager1</name>
<value>resourcemanager1.hadooplab:8032</value>
</property>
Hadoop configuration
Parameter: yarn.resourcemanager.scheduler.address.[resourcemanager]
Example:
<property>
<name>yarn.resourcemanager.scheduler.address.resourcemanager1</name>
<value>resourcemanager1.hadooplab:8030</value>
</property>
Hadoop configuration
Parameter: yarn.resourcemanager.admin.address.[resourcemanager]
Example:
<property>
<name>yarn.resourcemanager.admin.address.resourcemanager1</name>
<value>resourcemanager1.hadooplab:8033</value>
</property>
Hadoop configuration
Parameter: yarn.resourcemanager.resource-tracker.address.[resourcemanager]
Example:
<property>
<name>yarn.resourcemanager.resource-tracker.address.resourcemanager1</name>
<value>resourcemanager1.hadooplab:8031</value>
</property>
Hadoop configuration
Parameter: yarn.resourcemanager.webapp.address.[resourcemanager]
Purpose: specifies resource manager address for web UI and REST API services
Example:
<property>
<name>yarn.resourcemanager.webapp.address.resourcemanager1</name>
<value>resourcemanager1.hadooplab:8088</value>
</property>
Hadoop configuration
Parameter: yarn.resourcemanager.ha.fencer
Used by: RM
Example:
<property>
<name>yarn.resourcemanager.ha.fencer</name>
<value>/usr/local/bin/STONITH.sh</value>
</property>
Hadoop configuration
Parameter: yarn.resourcemanager.ha.auto-failover.enabled
Format: boolean
Used by: RM
Example:
<property>
<name>yarn.resourcemanager.ha.auto-failover.enabled</name>
<value>true</value>
</property>
Hadoop configuration
Parameter: yarn.resourcemanager.recovery.enabled
Format: boolean
Used by: RM
Example:
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
Hadoop configuration
Parameter: yarn.resourcemanager.ha.auto-failover.port
Format: integer
Used by: RM
Example:
<property>
<name>yarn.resourcemanager.ha.auto-failover.port</name>
<value>8018</value>
</property>
Hadoop configuration
Parameter: yarn.resourcemanager.ha.enabled
Used by: RM
Example:
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
Hadoop configuration
Parameter: yarn.resourcemanager.store.class
Format: string
Used by: RM
Example:
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
Hadoop configuration
Parameter: yarn.resourcemanager.cluster-id
Format: string
Example:
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>HAcluster</value>
</property>
http://zone16.pcansw.org.au/site/ponyclub/image/fullsize/60786.jpg
http://www.cloudera.com/content/cloudera/en/training/certification/ccah.html
Surveys: http://www.nobleprog.pl/te