1.1
(
)
1-1
Hadoop
MPI (Message Passing Interface)
MPICH, Open MPI
HDFS
MapReduce Hadoop
Hadoop Google MapReduce
1.2
1.2.1
2 1-1
1-1
1-2
2
Hadoop
(1)
Google
MapReduce Hadoop
MapReduce Map
Reduce 2 Map
Reduce Map
MapReduce
MapReduce MapReduce Map
Reduce
MAP
SHUFFLE
REDUCE
1-2 MapReduce
(2)
Hadoop HDFS (Hadoop Distributed File System) 64MB
1
1
HDFS NameNode
DataNode 2
Client
NameNode
SW
SW
SW
DataNodes
Rack
1-3 HDFS
1.2.2
1.2.2.1
Hadoop
100 Hadoop
1.2.2.2
Hadoop
1.2.2.3
1 1
Hadoop
Hadoop
Hadoop
100 Hadoop
1.3
2 1 7
8 13
1-4 26
8 MapReduce
3 Hadoop
9
2
4 Hadoop
Hadoop
10
5 Hadoop
3 11 Hadoop
12 Hadoop 13
Hadoop
2 5
2 MapReduce
8 MapReduce
9 Hadoop
10 Hadoop
11 Hadoop
12 Hadoop
6
13 Hadoop
7
BA
1-4
2 MapReduce
MapReduce
MapReduce
MapReduce
MapReduce
8
2.1
GPS
Hadoop
2.2
Hadoop MapReduce
2.2.1
HDFS
2.2.2
2.2.3
5
2.2.4
2.2.4.1
10
2-1
2-1
No.
110
ID
2-2
2-2
No.
12
ID
2-3
2-3
No.
105
388
2.2.4.2
5
2-4
2-1
2-4
No.
ID
ID
2-1
2.3 MapReduce
MapReduce
MapReduce
Map Reduce
MapReduce Map Reduce
MapReduce MapReduce
Map
Reduce
MapReduce
2.3.1
2-2
2-2
2.3.1.1
2-2
2-3
ID
(1)
(2)
(3)
(4)
(5)
2-3
(1)
(2)
ID
2-4
Step1
ID:1
ID:2
Step2
500m
230m
ID:1
Step3
230m
500m
2-4
(3)
(2)
(4)
2-5
10:00:00
10:00:00
10:05:00
10:05:00
10:10:00
10:10:00
10:15:00
10:03:02
10:00:00
2-5
(5)
ID ID
(2)
(
)(2)
ID ID
1
2-6
Step1
1
2
ID:1
:
ID:1
:
ID:1
:
Step2
)
::40km::20km::10km
1
40km
2
40km
20km
Step3
1
40km
2
(40km+20km)/2=30km
Step4
ID
ID
09/11/01
12:00:00
40km, 30km
09/11/01
12:00:00
10km, 2km, 1km
2-6
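The steps in this figure group records by ID and then average the per-record values, as in (40km+20km)/2=30km. A minimal sketch of that map, shuffle, reduce flow follows, in plain Java with no Hadoop dependency. The class name, the `Pair` record layout, and the method names are illustrative assumptions, not the report's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory sketch of a map -> shuffle -> reduce pass.
class SpeedAverageSketch {
    // "Map" output: one (id, speedKm) pair per input record.
    static final class Pair {
        final int id;
        final double speedKm;

        Pair(int id, double speedKm) {
            this.id = id;
            this.speedKm = speedKm;
        }
    }

    // "Shuffle": collect all values that share a key; keys stay sorted.
    static Map<Integer, List<Double>> shuffle(List<Pair> mapped) {
        Map<Integer, List<Double>> grouped = new TreeMap<>();
        for (Pair p : mapped) {
            grouped.computeIfAbsent(p.id, k -> new ArrayList<>()).add(p.speedKm);
        }
        return grouped;
    }

    // "Reduce": average the values collected for one key.
    static double reduce(List<Double> values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.size();
    }

    public static void main(String[] args) {
        List<Pair> mapped = new ArrayList<>();
        mapped.add(new Pair(1, 40.0));
        mapped.add(new Pair(2, 40.0));
        mapped.add(new Pair(2, 20.0));
        for (Map.Entry<Integer, List<Double>> e : shuffle(mapped).entrySet()) {
            System.out.println(e.getKey() + " -> " + reduce(e.getValue()));
        }
    }
}
```

With the sample values from the figure, ID 1 keeps 40 and ID 2 averages to 30.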
2.3.1.2
2-2
2-7
ID
(1)
(2)
(3)
(4)
(5)
(6)
2-7
(1)
(2)
2-8
10:00:00
10:00:00
10:05:00
10:05:00
10:10:00
10:10:00
10:15:00
10:03:02
10:00:00
2-8
(3)
ID
ID
(4)
ID
2-9
Step1
ID:1
ID:2
Step2
500m
230m
ID:1
Step3
230m
500m
2-9
(5)
ID
2-10
()
12:00:08
12:00:04
12:00:00
ID
2-10
(6)
2-11
Step1
Step2
1
2
3
12:00:45
12:00:30
12:00:25
1
36km
12:00:10
12:00:00
2
26km
Step3
ID
ID
09/11/01
12:00:00
0001
36km, 26km
09/11/01
12:00:00
0002
10km, 2km, 1km
2-11
2.3.1.3
2-2
2-12
(1)
(2)
2-12
(1)
ID
ID
ID
(2)
(1)
(1)
2-13
ID
2009/11/30 10:00:00
1:,2:
2009/11/30 10:00:00
1:,2:
ID
12:00
1:,2:
10:00
1:,2:
ID
2009/5/3
10:00
1:,2:
2009/5/3
11:00
1:,2:
2-13
2.3.2 MapReduce
2.3.1 MapReduce
Map Reduce
MapReduce Map
Reduce
(Gane-Sarson ) 2-14
MapReduce
2.3.1.1
(1)
(2)
(3) Map
(4) Reduce
(5)
(6)
(7)
2-14 MapReduce
Gane-Sarson
MapReduce
2-15
2-15
(1)
2-16 2.3.1.1
(1)
(2)
(3)
(4)
(5)
2-16
2.3.1.1
2.3.1.1
( ID)
ID
2-17
Figure 2-17
(2)
2-17
ID ID
2-17
2-18
Figure 2-18
(3) Map
2-18 Map Map
2-18
2-18
Map
Map 2-19
Map
Figure 2-19 Map
(4) Reduce
2-18 Reduce
Reduce
2-18
2-18
Reduce
Reduce
2-20
Figure 2-20 Reduce
(5)
2-18 KeyValue
KeyValue
Map Reduce
Map Reduce
Key Value Key Reduce
Key
Reduce
Reduce
2.3.1.1 (5)
ID
ID
Key ID ID
Map KeyValue 2-21
Key
Figure 2-21
(6)
2-18 KeyValue
KeyValue
Map Map
Key Value
Key
Key Value
2-22
Figure 2-22
(7)
2-18 KeyValue
KeyValue
Reduce Reduce
Key Value
Key
Reduce Key
Value
2-23
Figure 2-23
MapReduce
2-24 MapReduce Map Reduce
ID
Map
Reduce
ID
ID
2-24 MapReduce
MapReduce
MapReduce 2-25 2-26
ID
Map
ID
Reduce
ID
2-25 MapReduce
Map
Reduce
ID
IDID
ID
2-26 MapReduce
2.4 MapReduce
MapReduce
MapReduce MapReduce
MapReduce 2.3.2 MapReduce
MapReduce
2-27
Figure 2-27
Input (TextInputFormat): Key = LongWritable, Value = Text
Map: TaxiProbeAnalysisMapper
Map output: Key = TaxiProbeAnalysisKeyWritableComparable, Value = TaxiProbeAnalysisValueWritable
Reduce: TaxiProbeAnalysisReducer
Output (TextOutputFormat): Key = NullWritable, Value = ProbeAnalysisInfo
2.4.1 Map
Map Map
Map
Map 2-28 2-29
setup()Map
Map
Configuration MapReduce
map()2.3.2 Map
map() Key 2.3.2
Value TextInputFormat setPaths()
A) map() value
B) Key(TaxiProbeAnalysisKeyWritableComparable)
Value(TaxiProbeAnalysisValueWritable)
C) Key Value Context write()
Text
Key
2.3.2 2-30 2-31
Key WritableComparable
WritableComparable
2.3.2 Value ( ID
ID)
get()set()
write()
2-5
Table 2-5 write()
Field   Java type   Serialization call
ID      int         IntWritable write()
        boolean     BooleanWritable write()
        String      Text write()
ID      String      Text write()
read()read()
write()
2-6
Table 2-6 read()
Field   Java type   Deserialization call
ID      int         IntWritable readFields()
        boolean     BooleanWritable readFields()
        String      Text readFields()
ID      String      Text readFields()
compareTo() Key ID
ID Hadoop
Key Comparator
Comparator
Comparator Key
Comparator
Comparator
Key Comparator
Key TaxiProbeAnalysisKeyWritableComparable
Comparator 2-32 2-33
Key TaxiProbeAnalysisComparator
Comparator TaxiProbeAnalysisComparator
Comparator WritableComparator
TaxiProbeAnalysisKeyWritableComparable
compare() Key
compare() b1 b2
Key s1
s2 Key
b1 b2 s1s2
Key write()
2-7
2-7 compare()
()
ID
int
s1,s2
Integer.SIZE/8
boolean
ID + ID
ID
string
string
WritableUtils.decodeVIntSize()+WritableUtils.readVInt()
ID +
ID
WritableUtils.decodeVIntSize()+WritableUtils.readVInt()
Key(TaxiProbeAnalysisKeyWritableComparable) write()
ID ID
( ID
)
int
WritableUtils.decodeVIntSize()+WritableUtils.readVInt()
Text write()
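The compare() step above works on the serialized Key bytes: b1 and b2 are the buffers, s1 and s2 the start offsets, and an int field occupies Integer.SIZE/8 bytes. As a rough sketch, the comparison of a leading int field can be written in plain Java with `java.nio.ByteBuffer`. This stands in for Hadoop's `WritableComparator` and assumes the key begins with a 4-byte big-endian int (the layout `DataOutput.writeInt` produces). The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of comparing two serialized keys without deserializing them.
class RawIntKeyCompareSketch {
    // Width of an int field in the serialized form.
    static final int INT_BYTES = Integer.SIZE / 8;

    // Compares the int stored at b1[s1..] with the int stored at b2[s2..].
    static int compareLeadingInt(byte[] b1, int s1, byte[] b2, int s2) {
        int left = ByteBuffer.wrap(b1, s1, INT_BYTES).getInt();
        int right = ByteBuffer.wrap(b2, s2, INT_BYTES).getInt();
        return Integer.compare(left, right);
    }

    // Helper for the demo: big-endian encoding of one int.
    static byte[] encode(int value) {
        return ByteBuffer.allocate(INT_BYTES).putInt(value).array();
    }

    public static void main(String[] args) {
        int result = compareLeadingInt(encode(12), 0, encode(110), 0);
        System.out.println(result < 0 ? "12 sorts before 110" : "unexpected order");
    }
}
```

A variable-length field such as a Text value would need its length prefix decoded first, which is where the decodeVIntSize() and readVInt() entries of Table 2-7 come in.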
Partition
Key Partition
Hadoop Partition Key hashCode()
Key hashCode()
Key
Key
Reduce Key
TaxiProbeAnalysisKeyWritableComparable Partition
2-34
2-34 Partition
:
getPartition()TaxiProbeAnalysisKeyWritableComparable
TaxiProbeAnalysisKeyWritableComparable
hashCode()
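The partitioning described here derives the Reduce target from the Key's hashCode(). Hadoop's default `HashPartitioner` computes `(key.hashCode() & Integer.MAX_VALUE) % numReduceTasks`. The sketch below reproduces only that arithmetic in plain Java. The class name is hypothetical, and the report's own TaxiProbeAnalysisPartitioner may differ.

```java
// Hypothetical sketch of hash-based partition selection.
class HashPartitionSketch {
    // Masks off the sign bit so the result is always in [0, numReduceTasks).
    static int getPartition(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        // Equal keys have equal hash codes, so they always reach the same Reduce task.
        System.out.println(getPartition("taxi-0001", 4) == getPartition("taxi-0001", 4));
        // A negative hash code still maps to a valid partition index.
        System.out.println(getPartition(Integer.valueOf(-7), 4));
    }
}
```

Because only hashCode() decides the target, a Key class that should group records by one field must compute hashCode() from that field alone.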
Value
2.3.2 Value 2-35
2-35 Value
:
Value Writable
2.3.2 Value (
)
get()set()
:
write()
2-8
Table 2-8 write()
Java type   Serialization call
double      DataOutput writeDouble()
double      DataOutput writeDouble()
int         DataOutput writeInt()
double      DataOutput writeDouble()
read()
write()
2-9
Table 2-9 read()
Java type   Deserialization call
double      DataInput readDouble()
double      DataInput readDouble()
int         DataInput readInt()
double      DataInput readDouble()
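Tables 2-8 and 2-9 pair each write call with a read call in the same order (double, double, int, double). A minimal sketch of that pattern follows, using only `java.io.DataOutput` and `java.io.DataInput`. The class name and the four field names are placeholders, since the original field names did not survive. In a real job the class would implement Hadoop's `Writable` interface.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical value type; fieldA..fieldD are placeholder names.
class ProbeValueSketch {
    double fieldA;
    double fieldB;
    int fieldC;
    double fieldD;

    // Serialize the fields in a fixed order.
    void write(DataOutput out) throws IOException {
        out.writeDouble(fieldA);
        out.writeDouble(fieldB);
        out.writeInt(fieldC);
        out.writeDouble(fieldD);
    }

    // Deserialize in exactly the order used by write().
    void readFields(DataInput in) throws IOException {
        fieldA = in.readDouble();
        fieldB = in.readDouble();
        fieldC = in.readInt();
        fieldD = in.readDouble();
    }

    static byte[] toBytes(ProbeValueSketch value) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            value.write(new DataOutputStream(buffer));
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static ProbeValueSketch fromBytes(byte[] bytes) {
        try {
            ProbeValueSketch value = new ProbeValueSketch();
            value.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
            return value;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ProbeValueSketch v = new ProbeValueSketch();
        v.fieldA = 1.5;
        v.fieldB = 2.5;
        v.fieldC = 3;
        v.fieldD = 4.5;
        ProbeValueSketch copy = fromBytes(toBytes(v));
        System.out.println(copy.fieldC + " " + copy.fieldD);
    }
}
```

If the read order ever drifts from the write order, the fields are silently filled with the wrong bytes, which is why the two tables mirror each other.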
2.4.2 Reduce
Reduce Reduce
Reduce
2.3.2 Reduce 2-36
2-36 Reduce
setup()Reduce Reduce
2.3.1.1
reduce()2.3.2 Reduce
reduce() KeyValue Map map()
Context.write() Key
A) reduce() KeyValue
2.3.2 Reduce
B) Value(ProbeAnalysisInfo)
C) KeyValue Context write()
Key NullWritable
TextOutputFormat Key
Value Key NullWritable
KeyValue
2-37 Value
2.3.2 Value ( ID
ID)
get()set()
toString()
TextOutputFormat Value
toString()
2.4.3 MapReduce
2.3.2 MapReduce 2-38
2-38 Reduce
:
set()
Hadoop
Configuration
set() MapReduce
MapReduce
Configuration
MapReduce
set()
:
run()MapReduce
2-10
Table 2-10 MapReduce
No.  Setting     Class             Method                  Value
1    Job         Job
2    InputSplit  TextInputFormat   setMaxInputSplitSize
3                TextInputFormat   setInputPaths
4    Map         Job               setMapperClass          TaxiProbeAnalysisMapper
5    Key         Job               setMapOutputKeyClass    TaxiProbeAnalysisKeyWritableComparable
6    Value       Job               setMapOutputValueClass  TaxiProbeAnalysisValueWritable
7    Shuffle     Job               setPartitionerClass     TaxiProbeAnalysisPartitioner
8    Reduce      Job               setReducerClass         TaxiProbeAnalysisReducer
9    Key         Job               setOutputKeyClass       NullWritable
10   Value       Job               setOutputValueClass     ProbeAnalysisInfo
11               TextOutputFormat  setOutputPath
12   Reduce      Job               setNumReduceTasks       Reduce
true MapReduce
2.5
Hadoop
MapReduce
Hadoop Map Reduce
MapReduce
Map Reduce
Hadoop MapReduce
Hadoop
Hadoop
MapReduce Hadoop
MapReduce
2 Hadoop
Hadoop
Hadoop
MapReduce
MapReduce
MapReduce
3.1 Hadoop
2 Hadoop
Hadoop
3.1.1
MapReduce 3 3-1 3
3-2
3-1 3-2 24 48
Hadoop
3-1 Hadoop
No.
24
24
48
48
93
3-1
3-2
No.
CPU
S1
2GB
SATA
48
24 , 48 ,
2.53GHz 2
2
S2
Xeon E5504
250GB2
6GB
SAS
2GHz 4
3
S3
S4
Xeon 5148
2GB
SAS
S5
Xeon X5460
6GB
SAS
16
Xeon E5345
146GB2
8GB
SAS
2.33GHz 4 2
3.1.2
72GB2
3.16GHz 4
5
17
300GB2
2.33GHz 2
4
146GB2
MapReduce Hadoop
MapReduce Hadoop
Hadoop MapReduce 3-1 2
Map
Reduce
Map
Map
Reduce
3-2
MapReduce
Map Reduce Hadoop
9
2
Hadoop MapReduce JavaVM
MapReduce
MapReduce
RAID-0 1
3.1.3
Hadoop
Hadoop
Map Reduce
Map Reduce ()
CPU
9 Map Reduce
Map 11.5
Reduce 1
Map Reduce
Hadoop HDFS, MapReduce
9
HDFS
HDFS
MapReduce
HDFS MapReduce
MapReduce
Hadoop
3.1.2 Hadoop
Hadoop
Map Reduce JavaVM
2
JavaVM
Map Reduce
JavaVM
Map Reduce TaskTracker
MapReduce
Map Reduce
JavaVM
9 3-2
200MB to 450MB JavaVM
Hadoop 3-2 2
OS 2 1
1
Hadoop OS
60GB 1 3-2
S3 GB Hadoop
MapReduce Map Reduce
S3 RAID-0
2 1 Hadoop
Hadoop 3-3
3-3 Hadoop
No.
Map
Reduce
S1
S2
S3
RAID-0 2
1
S4
S5
JavaVM 200MB to 450MB
3.2 MapReduce
MapReduce Map Reduce
3.2.1
2
MapReduce
3-4
3-4 MapReduce
No.
MapReduce
Map
Reduce
No.1,No.2
MapReduce
100,000
MapReduce Map Reduce 1
3-5
3-5
No.
1
Map
Reduce
Map
Reduce
Map
Reduce
670
25
141
19
MapReduce Map
MapReduce Reduce
Map Reduce
MapReduce Map
Map
MapReduce Reduce
Reduce
3.2.2 Map
Map
Map Map
Map 9 Map
CPU
CPU Map 9 Map
30 Map
3-5 HDFS Map
Map Map
3.2.3 Reduce
Reduce MapReduce
MapReduce Reduce
9 Reduce
Reduce JavaVM
Reduce
MapReduce MapReduce
Reduce JavaVM
Reduce
3.3 MapReduce
MapReduce
GB
3.3.1
( 10MB GB )MapReduce
Map : Map
Reduce : Reduce
Map :
Reduce : Reduce
Map : Map
Reduce : Reduce
Map : Map
Reduce : Reduce
3.3.2 MapReduce
MapReduce
MapReduce
Map
Map : Map
Reduce : Reduce
Map : Map
Reduce : Reduce
Map
Map Map
[]1Map = []Map ([]Map [
]Map )
[] Map = []1Map []Map
[] Map
[][]1Map = ([]Map [
]Map ) ([]Map []Map )
Map [] Map 1Map
Reduce
Reduce
[]Reduce = []([][
]Reduce )
Reduce
[]1Reduce = []Reduce ([]Reduce
[]Reduce )
[]1Reduce = []Reduce []Reduce
[]1Reduce = []Reduce [
]Reduce
[]1Reduce = []1Reduce [
]1Reduce []1Reduce
Reduce = []1Reduce []Reduce
[]Reduce
MapReduce
Hadoop MapReduce Reduce
Map
Reduce Map ( 5%)MapReduce
+ Reduce
3.3.3 MapReduce
MapReduce
MapReduce
1 MapReduce
1 MapReduce
1 ( 5GB)
24
Map : 74
Reduce : 896
Map 1 : $5.82 \times 10^{9}$ Byte
Reduce : $7.27 \times 10^{9}$ Byte
Map : 87
Reduce : 260
Map (24 ) : 48
Reduce (24 ) : 48
Map
Map
[]1Map = []Map ([]Map [
]Map )
74 (87 48)
40.828 ()
= 436.86 1.000026
= 436.84 ()
Reduce
Reduce
[]Reduce = [] ([]
[]Reduce )
$1.86 \times 10^{11} \times \dfrac{7.27 \times 10^{9}}{5.82 \times 10^{9}} \approx 2.32 \times 10^{11}$ (Byte)
Reduce Reduce
165.41 ()
$7.27 \times 10^{9} \div 260 \approx 2.79 \times 10^{7}$ (Byte)
[]1Reduce = []Reduce [
]Reduce
$= 2.32 \times 10^{11} \div 1300 \approx 1.79 \times 10^{8}$ (Byte)
[]1Reduce = []1Reduce [
]1Reduce []1Reduce
=
1057.87()
5289.35 ()
MapReduce
Reduce Map 5%
MapReduce Map Reduce Map
+ Reduce
$436.84 \times 0.05 + 5289.35 \approx 5311.19$
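The last two steps of this worked estimate can be checked from the surviving numbers alone: 5289.35 is five times 1057.87, and five is 1300 divided by 260; the total then adds 5% of the Map figure 436.84. The sketch below restates that arithmetic. Reading 1300 as the number of Reduce tasks, 260 as the number of concurrently running Reduce tasks, and 5% as the share of the Map phase that is not overlapped are assumptions drawn from the neighbouring lines, and all names are hypothetical.

```java
// Hypothetical sketch of the job time estimate arithmetic.
class JobTimeEstimateSketch {
    // Number of rounds needed to run all tasks with a fixed number of slots.
    static int waves(int tasks, int slots) {
        return (tasks + slots - 1) / slots;
    }

    // Reduce phase estimate: time of one round multiplied by the number of rounds.
    static double reducePhase(double perWave, int tasks, int slots) {
        return perWave * waves(tasks, slots);
    }

    // Total estimate: the given fraction of the Map phase plus the Reduce phase.
    static double total(double mapPhase, double mapFraction, double reducePhase) {
        return mapPhase * mapFraction + reducePhase;
    }

    public static void main(String[] args) {
        double reduce = reducePhase(1057.87, 1300, 260);
        System.out.printf("%.2f%n", reduce);
        System.out.printf("%.2f%n", total(436.84, 0.05, reduce));
    }
}
```

The integer ceiling in `waves` matters when the task count is not a multiple of the slot count; a final, partly filled round still costs a full round in this model.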
3.4
2 3-1
105
388
3.4.1
Hadoop 3-6 3
3-6 Hadoop
No.
24
24
48
48
93
1
9
3-2
[Figure 3-2: plot with data labels 7537, 3856, 2285, 1744, 1230, 634; x-axis 0 to 100, y-axis 0 to 8000]
3.4.2
9 30 (1 )90 (3 )365
(1 )
3-3
[Figure 3-3: plot with data labels 74604, 16564, 5841, 1744; x-axis 0 to 400, y-axis 0 to 80000]
3.4.3
1
9
2
93 Map Reduce 260
1Map 640KB
Reduce 1300
JavaVM 400MB
[Figure 3-4: chart with data labels 1744, 902, 634, 281; y-axis 0 to 2000]
105
388
3.5
2 MapReduce
MapReduce
MapReduce
MapReduce
1.9
24
10%
Hadoop
Hadoop 10
4.1
Hadoop
4.1.1
2 5
5
1
5
Hadoop
4.1.2 Hadoop
1.2.2.2
Hadoop
Hadoop 2
HDFS
MapReduce
4.1.2.1 HDFS
HDFS 4-1 HDFS
DataNode NameNode DataNode
4-1
DataNode DataNode
RackAwareness
NameNode HDFS
Client
NameNode
Rack
SW
SW
SW
DataNodes
4-1 HDFS
HDFS 2
SecondaryNameNode
4-2 SecondaryNameNode
SecondaryNameNode NameNode
4-2
NameNode
SecondaryNameNode
NameNode
editFSImage
edits
fsimage
fsimage
edits
edits.new
edit
fsimage.chkpt
FSImage
fsimage.chkpt
edit
FSImage
fsimage
edits
4-2 SecondaryNameNode
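The file names in Figure 4-2 (fsimage, edits, edits.new, fsimage.chkpt) suggest the usual checkpoint cycle: while new changes go to edits.new, the SecondaryNameNode replays edits on top of fsimage and produces fsimage.chkpt, which then replaces fsimage. The sketch below models only the merge step, under heavy simplification. The namespace is a set of paths and an edit is a create or delete of one path. All names are hypothetical and none of this is Hadoop's actual on-disk format.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical, heavily simplified model of a checkpoint merge.
class CheckpointSketch {
    // One logged namespace change: create or delete a path.
    static final class Edit {
        final boolean delete;
        final String path;

        Edit(boolean delete, String path) {
            this.delete = delete;
            this.path = path;
        }
    }

    // Replays the edit log on a copy of the image; the input image is left untouched.
    static Set<String> merge(Set<String> fsimage, List<Edit> edits) {
        Set<String> checkpoint = new TreeSet<>(fsimage);
        for (Edit e : edits) {
            if (e.delete) {
                checkpoint.remove(e.path);
            } else {
                checkpoint.add(e.path);
            }
        }
        return checkpoint;
    }

    public static void main(String[] args) {
        Set<String> fsimage = new TreeSet<>();
        fsimage.add("/a");
        List<Edit> edits = new ArrayList<>();
        edits.add(new Edit(false, "/b"));
        edits.add(new Edit(true, "/a"));
        System.out.println(merge(fsimage, edits));
    }
}
```

Working on a copy mirrors the point of the figure: the running NameNode keeps serving from its current state while the merged image is built elsewhere.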
Hadoop
4.1.2.2 Hadoop
MapReduce 4-3
MapReduce TaskTracker
JobTracker
TaskTracker TaskTracker
JobTracker TaskTracker
MapReduce Hadoop
4-3
MAP
SHUFFLE
REDUCE
4-3 MapReduce
4.1.2.3 Hadoop
Hadoop
(1) HDFS
HDFS 4-1
4-1 HDFS
No.
DataNode
10
()
Hadoop
2
3
DataNode
DataNode
Hadoop
Hadoop Terasort
4-2 MapReduce
No.
Map
TaskTracker
TaskTracker
Map Map
TaskTracker
Reduce
TaskTracker
TaskTracker
Reduce Reduce
TaskTracker
Hadoop Hadoop
Reduce
10
MapReduce
Map Reduce Hadoop
Shuffle Reduce
4-5
shuffle
4.1.3 Hadoop
Hadoop
4-5 4-3
Job
L3
Hadoop (DataNode/TaskTracker)
L2
L2
L2
L2
L2
NameNode
Hadoop 100
JobTracker
Core2 Duo
40
Hadoop
Core2 Duo
10
4-5
4-3
No.
L3
L2
Hadoop
Hadoop
HDFS
Hadoop
MapReduce
JobTracker
Hadoop
HDFS
NameNode
4
5
MapReduce
DHCP/DNS
L3 Hadoop
L2 Hadoop
MapReduce
4.1.4 L3
L3
4-4
4-4 L3
No.
Hadoop - Hadoop
LAN
4.1.5 Hadoop
Hadoop HDFS DataNode
Hadoop
Hadoop MapReduce
2
4-5
4-5 Hadoop
No.
CPU
MapReduce
MapReduce
No.
3
4
HDD
MapReduce
RAID1
Hadoop
bonding
NIC
MapReduce
OS
7
8
MapReduce
OS
NameNode
HDFS
JobTracker
MapReduce
MapReduce
4.1.6 Hadoop
Hadoop Hadoop
Hadoop
HDFS 3
4-6 Hadoop
No.
HDFS
Hadoop
Hadoop
MapReduce
Hadoop MapReduce
4.1.7 L2
L2 Hadoop
Hadoop L2
Hadoop
MapReduce
2 Hadoop
6 L2
1/6
4.2
Hadoop
4.2.1
Hadoop
4.1.4 L3 4-6 L3
L2 L3
L3
L2-L3
L3
L2
4-6
L3
L3 4-7 VRRP
L3
4-8
VRRP(Active)
VRRP(Standby)
L3
L2
4-7 VRRP
4-8 L3
No.
Active/Standby
VRRP
10
OS
L2
L3 L2 L2
(STP)
4-7
L3
L2
4-7
4.2.2
4-9
4-9
No.
L3 ()
No.
L3 (
)
3
L2
L3
L2
1 Hadoop
4.2.3 Hadoop
4.1.3 Hadoop
HA
HA Heartbeat
DRBD HA
4-8
LAN
NameNode
NameNode
heartbeat
DRBD
heartbeat
DRBD
NameNode()
NameNode()
edits
Heartbeat/ LAN
4-8 HA
edits
NameNode 4-10
SecondaryNameNode NameNode
SecondaryNameNode JobTracker
4-10
No.
NFS
NFS
SecondaryNameNode
JobTracker
4.2.4 Hadoop
Hadoop
4-11
4-11 Hadoop
No.
()
bonding
()
()
STONITH
()
NameNode
NameNode
No.
10
JobTracker
JobTracker
NameNode
Heartbeat 2 NameNode
Safemode HDFS
3
JobTracker
Heartbeat
Heartbeat
Hadoop 3
JobTracker MapReduce
HA
4.3 Hadoop
4.2.4 Hadoop HA
NameNode Safemode
JobClient
FT
4.3.1 FT
FT
4.3.1.1 FT
FT 2
FT 4-9
OS
OS
OS
OS
CPU
CPU
CPU
LAN
LAN
CPU
LAN
LAN
4-9 FT
FT Kemari Kemari
FT I/O
LAN
OS
OS
LAN
4-10 Kemari
4-12 Kemari
No.
FT
OS CPU
4.3.1.2 FT
Kemari Hadoop
Hadoop
Hadoop
Hadoop
Hadoop
(1) NameNode
HDFS HDFS
1 200byte
8GB HDFS 40TB
NameNode
(2) JobTracker
JobTracker MapReduce
TaskTracker
Hadoop Kemari
FT
4.3.2
Kemari
Hadoop FT
4-13
4-13 FT
No.
()
bonding
()
STONITH
()
()
Hadoop 1
Hadoop
NNBench
Kemari
268tps
Map :1
Kemari
30tps
:5000
Kemari
17.5
:10GB
Kemari
16
2
3
4
Terasort
4.4
4.4.1 Hadoop
Hadoop
Hadoop HA
FT Kemari
4.4.2
Hadoop Hadoop
Hadoop 0.21.0 SecondaryNameNode StandbyNode
NameNode
4-11
Hadoop
DRBD
Active
NameNode
SNN Registration
RPC: NameNodeProtocol
Standby
NameNode
Journal
Spool
Edit OutputStream
Edit OutputStream
4-11 StandbyNameNode
Hadoop
Hadoop
11 12 13
5.1
Hadoop
Hadoop
5.1.1
5-1 Hadoop
5-1
No.
OS
Hadoop
5-1
5.1.2
Hadoop
Hadoop
Hadoop
5.1.3
Hadoop
5-1
5-2
5-1
11 12 13
5-1
5-2
No
5.1.4
Hadoop
Hadoop
5.1.5
Hadoop
CPU
5.1.6
5.1.1
5.1.2
11 12 13
5.1.3, 5.1.4, 5.1.5
5-3
5-3
No
5.3.1.3
5.3.1.4
5.3.1.5
5.3.2.3
No
5.3.2.4
10
5.3.2.5
5.3.3.3
11
12
13
14
5.3.3.4
15
5.3.3.5
5.3.4.3
16
17
18
19
5.3.4.4
20
5.3.4.5
5.2
5.2.1
5-2
Job
L3
L3
Hadoop (DataNode/TaskTracker)
L2
L2
L2
L2
L2
NameNode
Namenode
Hadoop 100
JobTracker
JobTracker
5-2
Core2 Duo
40
Hadoop
Core2 Duo
10
5.3
Hadoop
5-3
Hadoop
5.3.1
Hadoop Hadoop
5.3.1.1
Hadoop 96
Hadoop
12 5-1
Hadoop 96
Kickstart
DHCP/TFTP/HTTP
DHCP/DNS
96
5-3 11
5.3.1.2
96 90
10 Hadoop
5-4 CPU
5-4 CPU
5.3.1.3
96
10
4 380
CPU 25%
4 400 Hadoop
HTTP
CPU 5%
20 2000 Hadoop
Puppet
5-4
5-4
No.
10
4 380
90 96
HTTP CPU
90 400
Hadoop
5.3.1.4
Hadoop
5.3.1.5
96 Hadoop 5-5
5
Table 5-5
No.  Vendor  CPU                       Memory  HDD
1    HP      Xeon QuadCore/2.33GHz x2  8GB     SAS 146GB x 2
2    HP      Xeon QuadCore/3.16G       6GB     SAS 146GB x 2
3    HP      Xeon DualCore/2.33G       2GB     SAS 72GB x 2
4    HP      Xeon QuadCore/2G          6GB     SAS 300GB x 2
5    NEC                               2GB     SATA 250GB x 2
CPU
Hadoop
5.3.2
Hadoop Hadoop
5.3.2.1
Hadoop
11 Ganglia
12 Ganglia
5-5
[Figure 5-5 Ganglia: diagram with twelve gmond instances]
5.3.2.2
Ganglia Hadoop
CPU Hadoop
5-6 4 r4-1-0-01
4
5-6 Ganglia
4
5-7 Ganglia
5-7 Ganglia
5-8 Hadoop Ganglia
1.
2.
3.Heap
4.swap-inout
5.
6.
5.3.2.3
96
Web
1 Hadoop
100 Hadoop
CPU 35%
300
CPU 10%
10 1000
0.25%WAIT CPU
WAIT CPU 10% CPU
90 / 0.25 = 360 CPU Idle
13
5-6
No.
Web
100 1
CPU
300
5.3.2.4
Hadoop
Hadoop
Hadoop
3
5.3.2.5
5.3.3
Hadoop Hadoop Hadoop
5.3.3.1
Hadoop
11
Puppet
5-9
Hadoop
OS
(puppetrun)
Puppet
-Hadoop NameNode
-Hadoop DataNode
-Hadoop
Ganglia
-Hadoop
CPU/
5-9 Puppet
5.3.3.2
Puppet Hadoop
1
Hadoop Puppet facter
Hadoop
5.3.3.3
Hadoop
Hadoop
3 1 Hadoop
1 10
CPU 1
10
5-7
5-7
No.
100
100 3
5.3.3.4
Hadoop
Hadoop
5.3.3.5
96 Hadoop 5-5
Puppet facter
5.3.4
Hadoop
5.3.4.1
3 70
5.3.4.2
5-10
[Figure 5-10: time-series plot; x-axis 0:00 to 1:40, y-axis 0 to 30; series labels SAS 46 and SATA 24]
2 Hadoop
CPU 5-11
5-11 CPU
11
5-8
No.
11/27
HDD RAID
OS
BIOS
12/1 RAID
PhysicalVolume
12/13
HDD RAID
OS
BIOS
12/15 HDDHDD
PhysicalVolume
1/ 26
swap
No.
1/ 28
OS
5.3.4.3
10
1 5
2 70
90
5-9
No.
70 1
40
30
5
2
5.3.4.4
5-10 5-11
5-10
No..
70
10
91
No..
HDFS
5-11
No.
32
OS
2
OS
Ganglia,
OS
Hadoop
5.3.4.5
Hadoop 5-5
Puppet
5.4
Hadoop
5.4.1
Hadoop
5-12
Hadoop
5-12
No.
96 Hadoop
380
90 96
100 1
Hadoop
No.
10
11
100
12
13
100 3
14
15
Puppet facter
16
70 1
17
18
5 2
19
20
5.4.2
5 Hadoop
MapReduce
6.1
6.1.1
2 5 MapReduce Hadoop
Hadoop
2 3
4 Hadoop
FT
Kemari
Hadoop
6-1
(1 )Kemari
5 Hadoop
6.1.2
6.1.2.1
5
5
0 5 10 55
5
5
2
6-1
6-1
()
No.
1
22
(:
730 )
25
900
(:
30000 )
93
900
(:
30000 )
()
3
100
2
3
6.1.2.2
1
6.1.2.1
( 10 )
2
6-2
6-2
93
()
No.
1
2.1TB
140
6.2
2 5 6-1
Hadoop/
Job
L3
L2
NameNode
Namenode
JobTracker
JobTracker
r2
L2
L2
L2
L2
L2
L2
Hadoop
(DataNode/TaskTracker)
r6
10
r5
18
r4
16
r3
12
r7
40
6-1
6.3 MapReduce
2 6-1
6-2
6.3.1
6-1
6-3
6-3
()
No.
1
22
(:
730 )
25
900
(:
30000 )
93
900
(:
30000 )
()
6.3.1.1
()
6-4
6-4 1
No.
122
()
(300 ) 122
6-2
6-2
6.3.1.2
900
( 1 )
25 ()
6-5
6-5 2
No.
1,223
25
240
()
1 1223
240
6-6
6-6
No.
()
()
22 ( 1 )
12,215
900 ( 2 )
295,745
24
6-3 6-4
6-3 1
6-4 2
6.3.1.3
25 2
93 ( 3 ) 6-7
6-7 3
No.
25
548
93
221
()
( 25 )548
221
3
6-8
6-8
No.
( 2 )
295,745
( 3 )
457,816
154%
6-5 6-6
6-5 2
6-6 3
6.3.2
6-2
9.6 MapReduce
9.6.2 MapReduce
6-9
6-9
93
()
No.
1
2.1TB
140
6.3.2.1
9.6.2 MapReduce
3.5
9.6.2
6-10
6-10
No.
Map
546
492
119
Reduce
572
5349
128
Map
850MB
173GB
7.8GB
Reduce
750MB
217GB
21GB
Map
1386
2782
2600
Reduce
1300
1300
50
Map
260
260
260
Reduce
260
260
260
Hadoop
MapReduce
6-11
6-11
No.
4.6GB
1.9TB
85GB
Map
3900
31840
2600
Reduce
1300
1300
100
Map
260
260
260
Reduce
260
260
260
9.6
MapReduce
9.6.2 MapReduce
6-12
6-12 MapReduce
No.
Map
Reduce
Total
51 2
46 37
56 11
1 33 48
16 59 43
17 25
21 37
41 26
42 31
6.3.2.2
6-9 1 1
MapReduce
6-13
6-13
No.
93
21 53 10
10
MapReduce
10
Hadoop Map Map
HDFS
Map
Map
HDFS Map
6-14
6-14
No.
(-)
43 45
51 2
-14.5%
44 8
46 37
-17.7%
46 41
56 11
-16.9%
21.5%
Map
54 1
33 48
20
16
Reduce
40 11
59 43
20
17
Total
43 24
25
11 2
21 37
-48.9%
20 51
41 26
-49.6%
23 5
42 31
-45.7%
21 53
18 43 7
16.9%
10
Map
Reduce
Total
5
6
7
21.6%
21.3%
Map
8
Reduce
Total
10
Total
16.9%9
MapReduce
MapReduce
10
Map Reduce
Map 1.7%
6.4
4
FT Kemari
(1)
(2)
(3) Kemari
(4) Kemari
(1) (6.4.1)
(2) (6.4.2)
(3) (JobTracker/NameNode)(6.4.3)
(4) L2 (6.4.4)
6.4.1
6.4.2
1 (r4-1-0-01)
Reduce
6-15
(r5-1-0-12)
Map
6-15 Reduce
No
r4-1-0-01
11:10:22
35
r5-1-0-12
6.4.3
11:11:41
10 9
10 21
FT
JobTracker NameNode
FT Gratuitous ARP
6.4.3.1 JobTracker
JobTracker
4 JobTracker
Hadoop
6-16
6-16 MapReduce
No
Map
184
Reduce
50
6.4.3.2 NameNode
NameNode
2 NameNode
Hadoop
6-17
6-18
6-17 MapReduce
No
Map
184
Reduce
268
FT
6.4.4
1
(r6) 10
1 6-18
6.4.1
1
6-18
No
11:17:01
1 46
11:21:21
20 1
11:41:24
28
6-19
HDFS
I/O
6-7 1 CPU
6.4.5 Kemari
FT Kemari
I/O Kemari
6-1
Kemari 6-19
6-19 Kemari
No.
Kemari
180
256
29%
25
249
485
48%
93
258
553
53%
6-20
Kemari
Kemari 6-8
4
40Mbps
25
93
6-8
6.5
5 6-1
(1)
(2)
6-21
6.5.1
12
6.5.1.1
Hadoop 25
100
6.5.1.2
6-20
LAN
6-20
No
70
10
96
70
5 (
TaskTracker
Hadoop HDFS
6-22
No
1,2
6-9
[Figure 6-9 (chart): time axis [:] from 0:00 to 1:40; vertical axis [] from 0 to 30; series labelled SAS 46 and SATA 24]
6-9
6-21
6-23
CPU
(Mbyte/s1 )
seq read / seq write
1
2
SAS (
2.33GHz2 8
SAS 146GB
SATA
SATA 250GB
()
2.53GHz 2
60.939 / 62.358
95.419 / 26.693
6-10
JobClient
6-10 CPU
6-24
Ganglia IO
JobClient
6.5.1.3
6-22
No
233 ( 8 )
149 ( 19 )
98
( 9 )
CPU
CPU
19
4
Ganglia 4
OS
6-12 4 CPU
OS
Hadoop
6.5.1.4
Hadoop JobClient
Hadoop
6-13
6-26
JobClient
TaskTracker
TaskTracker
TaskTracker
TaskTracker
TaskTracker
6-13
6-10 CPU
JobClient CPU
CPU
Linux rsync
rsync SSH
CPU Hadoop
RSH
CPU
5GB
6-27
2GB
JobClient CPU
6.5.2
Hadoop
5 10
6.5.2.1
6-23
6-23
No.
r4-1-0-09
Hadoop OS
6.5.2.2
11
6-24
6-24
No.
16:50
r4-1-0-09
16:52
Ganglia ( 6-14)
16:55
( 6-15)
17:00
6-28
No.
32
17:02
17:03
OS
17:35
Ganglia,
OS
Hadoop
6.5.2.3
6-24 No2
Ganglia 6-14
6-14 Ganglia
Ganglia
6-29
Ganglia 6-15
4
4
6-15
Nagios, Ganglia
Ganglia
6-16
6-16
6-30
Ganglia
6.5.2.4
6-25
6-25
No
85.6 ( 4 )
84.5 ( 2 )
86.3 ( 7 )
86.8 ( 5 )
() 5
6.6
FT Kemari Hadoop
Hadoop
Kemari
6-31
Hadoop
Hadoop
Hadoop 10 2,3
8 (3 25 )
5 1(1223 240 )
Hadoop
Hadoop
FT Kemari 93 53%
FT
6-32
100
MapReduce
MapReduce
Map Reduce
Map Reduce
Map Reduce
Map Reduce
Hadoop
Map Reduce CPU
GB
MapReduce GB
MapReduce
Hadoop
Hadoop Hadoop
Hadoop Hadoop L3
Hadoop HA FT
FT Kemari
Hadoop
HA
Kemari
Hadoop 3 3 100
Hadoop
Hadoop
Hadoop
Kickstart Puppet
100 90
Hadoop
Ganglia
100 Hadoop
PC 400
Hadoop 10 2,3
8 (3 25 )
5 1(1223 240 )
Hadoop
Hadoop
FT Kemari 93 53%
FT
8 MapReduce
MapReduce
MapReduce
MapReduce MapReduce
Hadoop
MapReduce PageRank
Hadoop MapReduce Hadoop
MapReduce
8.1 MapReduce
MapReduce
8.1.1 MapReduce
MapReduce Google MapReduce
Map
Shuffle
<key1, Value1>
<key2, Value2>
...
Reduce
<key1, Value1>
<key1, Value3>
...
<key2, Value2>
<key1, Value3>
<key3, Value4>
...
<key3, Value4>
...
<Key, Value>
Key <Key, Value>
Map,Reduceworker
8-1 MapReduce
8.1.2 MapReduce
MapReduce
MapReduce
Map Reduce
Map Reduce
8.2 MapReduce
MapReduce
8.2.1 MapReduce
MapReduce 3
(1)
(2)
(1)
MapReduce
MapReduce
MapReduce
MapReduce
MapReduce
MapReduce
Key
MapReduce KeyValue Key
Key MapReduce
8.2.2 MapReduce
MapReduce 2
(1) Web
Web 8-2
MapReduce Web Map <
, 1>Shuffle
Reduce <, 1><, >
Welcome to My HomePage.
Thank you.
Where is your house? ....
Web
Map
<homepage, 1>
<homepage, 1>
<house, 1>
<welcome, 1>
<homepage, 1>
<you, 1>
<go, 1>
<where, 1>
<your, 1>
<house, 1>
<homepage, 1>
<, 1>
Shuffle
<welcome, 1>
<welcome, 1>
<where, 1>
<you, 1>
<your, 1>
<your, 1>
Reduce
<go, 2>
<homepage, 10>
<house, 3>
<welcome, 8>
<where, 7>
<you, 4>
<your, 5>
8-2 Web
MapReduce
Web :
: Web
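The Map, Shuffle and Reduce steps of the word count example in figure 8-2 can be sketched in plain Java without Hadoop. This is only an illustration of the data flow: the class name `WordCountSketch`, the tokenizing rule and the in-memory shuffle are assumptions of this sketch, not the Hadoop implementation.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone illustration; it does not use the Hadoop API.
final class WordCountSketch {
    // Map: emit one <word, 1> pair for every word found in a line of page text.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String token : line.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(token, 1));
            }
        }
        return pairs;
    }

    // Shuffle: collect all values that share a key, with keys in sorted order.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce: sum the values of one key to get <word, occurrences>.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Run the three steps over a list of lines.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : shuffle(pairs).entrySet()) {
            counts.put(group.getKey(), reduce(group.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList(
                "Welcome to My HomePage.", "Thank you.", "Where is your house?")));
    }
}
```

In a real job the shuffle is performed by the framework between the Map and Reduce tasks; here it is a sorted in-memory map so that the grouping by key is visible.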
(2)
abc@example.com hello...
def@example.net adadafa
<abc@example.com, 10>
<def@example.net, 200>
<ghi@example.org, 0>
<aaa@example.jp, 0>
<abc@example.com, 0>
<def@example.net, 100>
<def@example.net, 50>
Map
Shuffle
Reduce
<, >
<aaa@example.jp, false>
<abc@example.com, false>
<def@example.net, true>
<ghi@example.org, false>
8-3
Map Reduce
Map Reduce
8.2.3 MapReduce
8.2.1 8-4
PageRank MapReduce PageRank
Web Web
Web PageRank
Web Web
PageRank
PageRank
Web
(1)
(2) (1/)
(3)
Web
example.com
100.5
example.net
200.5
hogehoge.jp
300.5
fugafuga.jp
0.25
example.jp
8-4 MapReduce
PageRank
(1) Web 1 1
(2) Web 1
(1/)
8.2.4 MapReduce
8.2.3 PageRank MapReduce PageRank
PageRank 8-5 3
8-5 PageRank
8-5
8-6
Web
Web
URL
8-6 PageRank
8-6 PageRank
8-7
Web
URL
Web
Web
URL
Web
8-7 PageRank
Map Reduce
Reduce
Map
PageRank
Key Reduce
Map 8-8
Reduce Map
MapReduce
Web
Web
URL
Map
Reduce
Web
Web
URL
8-8 PageRank
Map Reduce PageRank
8-9 MapReduce
: 10
My Homepage Link.
example.net : example
example.com : example
<example.net, 1/10>
<example.com, 1/10>
Web
Map
: 25
example.net Link.
My Homepage Link :
example.com : example
<mypage.html, 1/25>
<example.com, 1/25>
Shuffle
1Web
Web1
Reduce
<example.com, 100.25>
<example.net, 50.66>
<mypage.html, 0.2>
Web
PageRank
MapReduce
PageRank
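The Map and Reduce steps of figure 8-9 can be sketched as follows: a page with n outgoing links emits <target, 1/n> for each link, and Reduce sums the shares received by each target. This is a simplified single pass under that reading of the figure; the class `LinkScoreSketch` and its method names are hypothetical, and any iteration or normalization used by the full PageRank calculation is not reproduced.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone illustration; it does not use the Hadoop API.
final class LinkScoreSketch {
    // Map: a page with n outgoing links emits <target, 1/n> for each link.
    static List<Map.Entry<String, Double>> map(List<String> outLinks) {
        List<Map.Entry<String, Double>> pairs = new ArrayList<>();
        if (outLinks.isEmpty()) {
            return pairs; // a page without links contributes nothing
        }
        double share = 1.0 / outLinks.size();
        for (String target : outLinks) {
            pairs.add(new AbstractMap.SimpleEntry<>(target, share));
        }
        return pairs;
    }

    // Reduce: add up all shares that were sent to one target page.
    static double reduce(List<Double> shares) {
        double sum = 0.0;
        for (double share : shares) {
            sum += share;
        }
        return sum;
    }

    // One pass over all pages: map every page, group by target, reduce each group.
    static Map<String, Double> run(Map<String, List<String>> pages) {
        Map<String, List<Double>> grouped = new TreeMap<>();
        for (List<String> outLinks : pages.values()) {
            for (Map.Entry<String, Double> pair : map(outLinks)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
            }
        }
        Map<String, Double> scores = new TreeMap<>();
        for (Map.Entry<String, List<Double>> group : grouped.entrySet()) {
            scores.put(group.getKey(), reduce(group.getValue()));
        }
        return scores;
    }

    public static void main(String[] args) {
        Map<String, List<String>> pages = new LinkedHashMap<>();
        pages.put("mypage.html", Arrays.asList("example.net", "example.com"));
        pages.put("other.html", Arrays.asList("example.com"));
        System.out.println(run(pages));
    }
}
```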
PageRank MapReduce
(: X (X-1) )
Web
(1)
Map
Reduce
Shuffle
(7)
(1) MapReduce
(2)
(3)
(4) Map
(5) Shuffle
(6) Reduce
(7)
(3)
<Key, Value>
(2)
(4)
(5)
(6)
8.3.1 MapReduce
Hadoop MapReduce MapReduce MapReduce
8.3.7 Map Reduce
MapReduce
MapReduce Job
(org.apache.hadoop.mapreduce.Job)
Map Reduce
Job MapReduce
Job
Job.getConfiguration().set()
MapReduce
8.3.1.1
MapReduce
(1)
FileInputFormat setInputPaths /
(2)
FileOutputFormat setOutputPath (
)
FileAlreadyExistsException MapReduce
(3)
8.3.6 8.3.7
(4) Map Reduce
8.3.2 Map
8.3.3 Reduce
8.3.1.2
MapReduce
( 200MB)
(2) Map Reduce
Map Reduce TaskTracker Child
Child TaskTracker
Failed
mapred.task.timeout
(
60000 )
(3) Map Reduce
OS
mapred.child.ulimit
OS
8.3.1.3
MapReduce
(1) Reduce
Job setNumReduceTasks
Reduce
0 Map Hadoop Reduce
Reduce Hadoop
Hadoop
Reduce >= Reduce
(2) Map
Map Map
mapred.max.split.size Map 1
1 Map
Map
16MB(16777216)
Job job = new Job(); // org.apache.hadoop.mapreduce.Job
job.getConfiguration().set("mapred.max.split.size", "16777216"); // 16MB, passed as a string value
Hadoop
Hadoop
Map >=
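To make the effect of the 16MB `mapred.max.split.size` setting concrete, the following sketch estimates how many Map tasks one input file produces. It assumes the file is cut purely by the maximum split size; the real FileInputFormat also takes the HDFS block size, the minimum split size and some slack on the last split into account, so the result is approximate. The class and method names are hypothetical.

```java
// Hypothetical helper; a rough estimate, not Hadoop's actual split calculation.
final class SplitCountSketch {
    // Number of splits (and so Map tasks) for one file, assuming it is cut
    // into pieces of at most maxSplitBytes each.
    static long estimateMapTasks(long fileBytes, long maxSplitBytes) {
        if (fileBytes <= 0 || maxSplitBytes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return (fileBytes + maxSplitBytes - 1) / maxSplitBytes; // ceiling division
    }

    public static void main(String[] args) {
        long oneGiB = 1L << 30;
        long sixteenMiB = 16777216L; // value used in the example above
        System.out.println(estimateMapTasks(oneGiB, sixteenMiB));
    }
}
```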
(3) MapReduce
HDFS
dfs.replication
HDFS 3 HDFS
MapReduce
3
8.3.1.4
MapReduce Map
Job.getConfiguration().set() MapReduce
MapReduce
Map
8.3.1.5 MapReduce
MapReduce
MapReduce
MapReduce
org.apache.hadoop.conf.Configured
org.apache.hadoop.util.Tool
main : run
run : MapReduce
MapReduce : setJobName
: setInputFormatClass
: setOutputFormatClass
Map : setMapperClass
Reduce : setReducerClass
: FileInputFormat.setInputPaths (HDFS
)
: FileOutputFormat.setOutputPath(HDFS
)
Key : setOutputKeyClass
Value : setOutputValueClass
Map Key : setMapOutputKeyClass
Map Value : setMapOutputValueClass
configuration :
execute : MapReduce
main MapReduce run
configuration, execute
8-11 MapReduce
Hadoop MapReduce
Map Reduce
8.3.2 Map
Map Map KeyValue Map
setup Map cleanup
(: Map )
Map
Map
org.apache.hadoop.mapreduce.Mapper
setup : Map
map : Map
cleanup : Map
run : setup, map, cleanup Map
()
Map 8.3.5
(: TextInputFormat Key
LongWritable Value Text ) KeyValue
8.3.1.5 KeyValue
Map KeyValue
8.2.4 PageRank Map PageRank
Web
Map
8-12
PageRank Map
8-12 Map
PageRankMapper Map setup Key
Web countTotalLinks
map
Map
Hadoop
Map
8.3.3 Reduce
8.2 Reduce Hadoop Reduce
Reduce setup Reduce
cleanup
Reduce
Reduce
org.apache.hadoop.mapreduce.Reducer
setup : Reduce
reduce : Reduce
cleanup : Reduce
run : setup, reduce, cleanup Reduce
()
Reduce Reduce
8-13 Reduce
PageRankReducer reduce Key
Reduce Reduce
Key Iterable reduce
Reduce Map
8.3.4
Hadoop MapReduce KeyValue
(1) : Map
(2) : Map Reduce (Shuffle )
(3) : Reduce
Hadoop
Hadoop
8.3.4.1
Hadoop
()
java.util.Map
(1) (Text )
(2) (IntWritable , DoubleWritable )
(3) (BytesWritable )
(4) (BooleanWritable )
(5) (ArrayWritable , TwoDArrayWritable )
(6) Map (MapWritable )
8.3.4.2
Key Hadoop
Key Value WritableComparable
implement Value Writable
implement
org.apache.hadoop.io.Writable (Value )
org.apache.hadoop.io.WritableComparable (Key Value
)
set : (: )
get : ( : )
write : (:
)
readFields : (:
)
compareTo : (: Object
, : -1:, 0:, 1:
)WritableComparable
toString :
equals : (: Object
, : true / false)
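The method list above (set, get, write, readFields, compareTo, toString, equals) can be illustrated with a stand-alone class. The sketch below mirrors the method shapes of Writable and WritableComparable but deliberately does not import Hadoop so that it runs on its own; the class `WebDataSketch` and its two fields (a URL and a score) are assumptions made for illustration and are not the document's WebDataWritable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Objects;

// Hypothetical record type with the same method shapes as a WritableComparable.
final class WebDataSketch implements Comparable<WebDataSketch> {
    private String url = "";
    private double score;

    void set(String url, double score) { this.url = url; this.score = score; }
    String getUrl() { return url; }
    double getScore() { return score; }

    // write: serialize the fields in a fixed order.
    void write(DataOutput out) throws IOException {
        out.writeUTF(url);
        out.writeDouble(score);
    }

    // readFields: restore the fields in exactly the order write() used.
    void readFields(DataInput in) throws IOException {
        url = in.readUTF();
        score = in.readDouble();
    }

    // compareTo: order by URL and return -1, 0 or 1.
    @Override
    public int compareTo(WebDataSketch other) {
        return Integer.signum(url.compareTo(other.url));
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof WebDataSketch)) return false;
        WebDataSketch other = (WebDataSketch) o;
        return url.equals(other.url) && Double.compare(score, other.score) == 0;
    }

    @Override
    public int hashCode() { return Objects.hash(url, score); }

    @Override
    public String toString() { return url + "\t" + score; }

    // Serialize to bytes and read back, as happens when a value crosses tasks.
    static WebDataSketch roundTrip(WebDataSketch original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            WebDataSketch copy = new WebDataSketch();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        WebDataSketch data = new WebDataSketch();
        data.set("example.com", 100.25);
        System.out.println(roundTrip(data));
    }
}
```

The point of the round trip is that write and readFields must agree on field order; a mismatch would not fail at compile time and only shows up as corrupted values.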
Map Reduce
Key
Comparator Key
Comparator :
org.apache.hadoop.io.WritableComparator
compare :
Comparator
Hadoop MapReduce Shuffle Key
Key
8.2.4 PageRank
(1) Web
(2)
(1) Web Web
(2)Hadoop
DoubleWritable 8-14 Web
WebDataWritable
8-14
WebDataWritable WebData 8-15
Web
WebData WebDataWritable
WebData MapReduce
8-15 WebData
Map Reduce Hadoop
MapReduce MapReduce
8.3.5 MapReduce
8.3.5 Shuffle
Shuffle Map Key Reduce Hadoop
Shuffle 3
Partition : Reduce Key Value
Grouping
Partitioner
Partitioner
org.apache.hadoop.mapreduce.Partitioner
getPartition : Key Value Reduce Key
Reduce
Hadoop Partition Key Value
8-16 Partitioner
8-16 PageRank Web Key
Web Partitioner
getPartition
Key int float Partition
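For comparison with a custom getPartition, the sketch below shows the hash-based rule, which is the approach taken by Hadoop's stock HashPartitioner: mask off the sign bit of the key's hash code and take the remainder by the number of Reduce tasks. The class name `HashPartitionSketch` is hypothetical and the code runs without Hadoop.

```java
// Hypothetical stand-alone illustration of hash partitioning.
final class HashPartitionSketch {
    // Returns the index of the Reduce task (0 .. numReduceTasks-1) for a key.
    static int getPartition(Object key, int numReduceTasks) {
        if (numReduceTasks <= 0) {
            throw new IllegalArgumentException("numReduceTasks must be positive");
        }
        // Clearing the sign bit keeps the result non-negative even when
        // hashCode() is negative.
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        System.out.println(getPartition("example.com", 4));
        System.out.println(getPartition("example.net", 4));
    }
}
```

Because the rule depends only on the key, every <Key, Value> pair with the same Key reaches the same Reduce task, which is the property the Shuffle relies on.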
Grouping
Grouping
org.apache.hadoop.io.RawComparator
compare : KeyValue Key
8.3.6
Hadoop Map
Hadoop
8.3.6.1
Hadoop
(1)
1 TextInputFormat
TextInputFormat KeyValue
Key LongWritable
Value Text Key Value
(2)
Hadoop SequenceFile
SequenceFileInputFormat
8.3.6.2
FileInputFormat Hadoop
org.apache.hadoop.mapreduce.lib.input.FileInputFormat
createRecordReader : (RecordReader)
isSplitable :
1
(RecordReader)
RecordReader
org.apache.hadoop.mapreduce.RecordReader
initialize : KeyValue KeyValue
nextKeyValue : Map KeyValue
getCurrentKey : Key
getCurrentValue : Value
close :
8.2.4 PageRank TextInputFormat
PageRank
8-17
8-17
8.3.7
Hadoop Reduce
Hadoop
8.3.7.1
Hadoop
(1) : TextOutputFormat
KeyValue
Key Value
(2) : SequenceFileOutputFormat
Hadoop SequenceFile
MapReduce SequenceFile
8.3.7.2
FileOutputFormat (TextOutputFormat
)TextOutputFormat
(1)
(2) KeyValue
(3) ()
8.3.8
MapReduce
MapReduce
(1) Map Reduce
MapReduce Hadoop
(2)
Key
8.4 MapReduce
Hadoop MapReduce
MapReduce Hadoop
MapReduce
Key
MapReduce
Hadoop MapReduce MapReduce
Hadoop MapReduce
8.4.1 Pig
Pig Pig Latin Hadoop MapReduce
8.4.1.1 Pig
Pig
Key
Key
(AND, OR, NOT)
Key (MAX)(MIN)(AVG)(SUM)(COUNT)
(CONCAT)
Key
LOG = LOAD 'hogehoge.csv' AS (id:chararray, score:int);
HDFS
Pig HDFS Hadoop
cat,cd,copyFromLocal,copyToLocal,cp,ls,mkdir,mv,pwd,rm,rmf
Pig exec, run
Job kill
set
grunt > cat hogehoge.txt
aaa bbb
ccc ddd
grunt > run pigscript.pig
# Pig
Pig
KEY Java
Pig REGISTER Java
(jar )
8.4.1.2 Pig
Pig (Hadoop ) Pig
Java Java Pig
Pig Pig
Pig Java
Java Pig
# Pig
user $ java -cp $PIGDIR/pig.jar:$HADOOP_CONF_DIR org.apache.pig.Main
grunt > Pig
# Pig
user $ java -cp $PIGDIR/pig.jar:$HADOOP_CONF_DIR org.apache.pig.Main script.pig
script.pig MapReduce
8.4.2 Hive
Hive Pig HiveQL SQL
MapReduce Hive RDBMS
HiveQL
HiveQL
Hive MapReduce
Hadoop MapReduce
8.4.2.1 Hive
Hive
Hive SQL
CREATE TABLE sample(id STRING, score INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY 0A
STORED AS TEXTFILE;
SQL
UPDATE DELETE Hive
SELECT SELECT
GROUP BY DISTINCT SORT BY
ORDER BY
JOIN UNION
8.4.2.2 Hive
Hive Hive (Hadoop ) Hive
Hive
Hive metadata
metadata Hive
user $ cd $HIVE_HOME
user $ ./bin/hive
Hive
MapReduce MapReduce
Key Value MapReduce
8.5 MapReduce
Pig Hive MapReduce
8.5.1
Map Reduce
Hadoop MapReduce MapReduce
Hadoop Hadoop
8.5.1.1 Map
Hadoop Map
64MB 1
Map 64MB
FileInputFormat setMaxSplitSize
mapred.max.split.size 1
Map
mapred.max.split.size
HDFS 1 Map
HDFS
dfs.block.size
1 HDFS
HDFS HDFS
8.5.1.2 Reduce
Reduce
mapred.reduce.tasks MapReduce
Job setNumReduceTasks
Reduce 1
Reduce 1
8.5.1.3
Map Reduce
JobTracker Web
Web Map Reduce
JobTracker
JobTracker MapReduce
MapReduce API
Job getTaskCompletionEvent MapReduce
getTaskCompletionEvent
8.5.2
Map Reduce
3
stdout : Map Reduce System.out.println
JavaVM stdout
stderr : Map Reduce System.err.println
Failed
MapReduce 24
8.5.3
MapReduce
Hadoop 1 MapReduce
Hadoop
Job Counters
Map Reduce Map
Map Map
Map
FileSystemCounters
HDFS
Map-Reduce Framework
Map Reduce
(2)FileSystem Counters
Hadoop
Context getCounter
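/* Example: inside Map */
void map(Key key, Value value, Context context) {
    // "counterName" is a placeholder: the original counter name did not survive extraction
    context.getCounter(TestMapper.class.getSimpleName(), "counterName").increment(1);
}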
JobTracker Web Job
8.5.4 MapReduce
MapReduce
2
: MB MB- MB, GB
:
(1) :
MapReduce
8-1
8-1
MB
MB MB
GB
()
()
()
memcached
RDBMS
KV
MapReduce-AP
No.
MapReduce
RDBMS KV
memcached KV
RDBMS
KV
RDBMS
Reduce
Partition
Reduce Key Value
Reduce
8.5.1 Reduce
8.6
MapReduce
Hadoop MapReduce
PageRank MapReduce Pig
Hive MapReduce MapReduce
Hadoop MapReduce
MapReduce
MapReduce
Hadoop
9 Hadoop
Hadoop
Hadoop
MapReduce Hadoop
Hadoop
Hadoop MapReduce
MapReduce Hadoop Hadoop
Hadoop
MapReduce Hadoop
MapReduce Hadoop
Hadoop MapReduce Map
Reduce
MapReduce
MapReduce
MapReduce
Hadoop
Hadoop
9.1
9.1.1
2
(1)
(2)
100%
9.1.2
CPU
9-1
9-1
No.
CPU
OS
OS
9-1
20
A
A
30
9-1
9-2
15
A
A
20
9-2
9-2
9-2
No.
9.1.3
9.1.3.1
9-3
9-3
No.
CPU (
)
2
()
CPU
()
()
9.1.3.2
9.1.3.1
9.1.3.3
9-4
9-4
No.
9.2
9-2
9.2.1
9-5
9-5
No.
CPU HDD
No.1 9-3
No.2
9-6 5
9-6
No.
CPU
NIC
S1
S2
S3
Xeon E5504
6GB
SAS
2GHz
300GB
Xeon E5345
8GB
SAS
2.33GHz
146GB
4 2
Xeon 5148
2GB
SAS
1Gbps
1Gbps
1Gbps
18
16
HP
2009
DL360G6
HP
2008
DL380G5
HP
2006
10
2.33GHz
72GB
DL360G5
AH480A
No.
CPU
NIC
S4
Core 2 Duo
2GB
SATA
T9400
250GB
2.53GHz
1Gbps
50
NEC
2009
Express5800
HP
2008
2
2
5
S5
9.2.2
Xeon X5460
6GB
SAS
1Gbps
3.16GHz
146GB
DL360G5
AK839A
Hadoop Hadoop
9.2.3 Hadoop
9.1.1 Hadoop MapReduce
(1)
1 MapReduce MapReduce
(2)
1 MapReduce
9.3 Hadoop
Hadoop
MapReduce 9-7 3
9-7 MapReduce
No.
Map
Shuffle
Map
Key
Reduce
Shuffle
Map
Reduce
TaskTracker
Map
TaskTracker
Shuffle
Reduce
JobTracker
TaskTracker
DataNode TaskTracker
DataNode TaskTracker
NameNode
Reduce
Map
Shuffle
Reduce
TaskTracker
JobTracker
(MapReduce)
(HDFS)
9.3.2 MapReduce
MapReduce Hadoop
9-7 9-3 9-4 MapReduce
Map
Map 9-5
HDFS
Map
Map
Hadoop
DataNode
Map
DataNode
Map
JobTracker
T T T
JobTracker
HDFS
(DataNode
DataNode)
Map
Map
JobTracker
()
Map
HDFS
9-5 Map
Map 9-8
9-8 Map
No.
CPU
Map
Map
Map
Map
Shuffle
Shuffle Reduce 9-4 9-6
Map
Key
JavaVM
IO
JobTracker
T T T
Sort
Map
Map
Map
ShuffleJobTracker
1
9-6 Shuffle
Shuffle 9-9
9-9 Shuffle
No.
Map
Map
Reduce
Reduce 9-7 Shuffle
Reduce Reduce
(HDFS )
JobTracker
Reduce
ReduceHDFS
Reduce
Reduce
HDFS
HDFS
9-7 Reduce
Reduce 9-10
9-10 Reduce
No.
CPU
Reduce
Reduce
Reduce
Reduce
Reduce
Hadoop MapReduce
CPU
MapReduce
Hadoop
9.3.3
Hadoop
Hadoop
9.3.3.1 CPU
CPU Map Reduce
MapReduce Map Reduce
CPU
MapReduce CPU
9-8
CPU
Map
Reduce
CPU
Map
Reduce
CPU
CPU
Map
CPU
Reduce
Map
Reduce
CPU
Map
CPU
Reduce
Map
Reduce
M
Map
MMMMMM
RRRRRR
CPU
M
CPU
Reduce
CPU
9.3.3.2
9-10
()
Reduce
Reduce
Reduce
Reduce
Reduce
Map
Map
()
Map
Map
Map
TaskTracker
DataNode
OS(
)
Java VM
Java
9-10
9-13
:
( 200MB)
9 Hadoop
Hadoop Map
Reduce JavaVM
MapReduce
GC java.lang.OutOfMemoryError
Record Buffer
index Buffer
MapBuffer
Data Buffer
MapBuffer
9-11 Map
Reduce Map Reduce
Map Reduce
MapReduce
CPU1
MapReduce
CPU1
Map Reduce
CPU
9.3.3.3
MapReduce
Map Reduce
Hadoop Map Reduce ( 10%)
JobTracker 1
JobTracker Map Reduce
JobTracker
# Map - JobTracker
2010-02-02 17:14:24,230 WARN org.apache.hadoop.mapred.JobInProgress: No room for map task. Node slave001 has 14135296 bytes free; but we expect map to take 109193991
# Reduce - JobTracker
2010-02-02 17:14:24,231 WARN org.apache.hadoop.mapred.JobInProgress: No room for reduce task. Node tracker_slave001:localhost.localdomain/127.0.0.1:44741 has 14135296 bytes free; but we expect reduce input to take 1091939917
JobTracker 1
# -
2010-02-02 16:51:37,581 FATAL org.apache.hadoop.mapred.TaskRunner: Task attempt_201001061158_0367_m_000003_0 failed : org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device
at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:192)
()
Caused by: java.io.IOException: No space left on device
... 8 more
MapReduce
Map Reduce ( 4
) mapred-site.xml MapReduce
mapred.map.max.attempts, mapred.reduce.max.attempts
Hadoop MapReduce
MapReduce
( 4)
mapred.max.tracker.blacklists
9.3.3.4
9-6
9.3.4
Hadoop
Hadoop
(CPU IO)
9.3.4.1 CPU
CPU CPU (top sar
%user %system ) 100%
(%iowait) CPU 100%CPU
Hadoop Map
Reduce CPU 100%
CPU
Map Reduce
Map Reduce CPU
CPU
()
CPU
Hadoop (PiEstimator)
PiEstimator 9-12 PiEstimator
CPU
CPU
PiEstimator CPU
Map
Shuffle
Reduce
9-12 PiEstimator
9.3.4.2
IO
iostat
await( IO ) avgqu-sz(IO
)%util( IO CPU )
MapReduce Map Reduce
MapReduce
Hadoop TeraSort
TeraSort 9-13 TeraSort KeyValue
MapReduce
TeraSort
Map
Shuffle
Reduce
9-13 TeraSort
9.3.4.3
sar
(sar -n DEV) (rxbyt/s, txbyt/s)
Hadoop
Shuffle Map
Reduce HDFS
Shuffle Reduce
Shuffle
Shuffle
Hadoop
Shuffle
Map Shuffle
Map
Shuffle
Reduce
Reduce Reduce
9-14
9.4 Hadoop
MapReduce Hadoop
9.4.1 Hadoop
Hadoop Map Reduce
Shuffle Reduce
Map
9.3.2 Map 2
(1) Map CPU
(2) Map IO
(1) Map top sar
CPU 100%
CPU
9-15 Map
CPU
(CPU4)
M M M M
MapCPUMap
(CPU4)
M M M M
M M
M M
M M
MapCPUMap
M Map
CPU
9-15 Map CPU
(2)Map
IO 9-16 CPU
Map
(2)
mmm
mm
Map1Map
(2)
mmm
mm
Map2Map
M Map
CPU
m Map
9-16 Map
Reduce
9-9 9-10 Reduce
(CPU4)
R R R R
ReduceCPUReduce
(CPU4)
R R R R
ReduceCPUReduce
R Reduce
CPU
9-17 Reduce CPU
(2)Reduce HDFS
HDFS Reduce
MapReduce
9-18 HDFS
IO
(2)
r r r
r r
Reduce1Reduce
(2)
r r r
r r
Reduce2Reduce
R Reduce
CPU
r Reduce(HDFS)
9-18 Reduce
(3)Shuffle Map
Map
m
m
R
m m m
m
m
m
m
mm
R
m
9-19 Map
Hadoop MapReduce
9-11
9-11 Hadoop
No.
Map
CPU
Map
Map
Map
Reduce
CPU
Reduce
Reduce
Reduce
Reduce
Map
Hadoop
9.4.2 Hadoop
Hadoop
9-11 Hadoop
9-12
9-12 MapReduce Hadoop
No.
Map
CPU
mapred.tasktracker.map.tasks.maximum
Map
mapred.local.dir
Map
Map
Reduce
CPU
mapred.tasktracker.reduce.tasks.maximum
Reduce(Shuffle)
Reduce
dfs.data.dir
Reduce
HDFS
tasktracker.http.threads
Shuffle Map
Reduce
mapred.reduce.parallel.copies
Map
CPU
9.4.2.1
Hadoop
9-6
9-13 TeraSort
S1
18
S2
S4
S4
19
S5
( 1 )  58 (100%)  96 (165.5%)  107 (184.5%)  116 (200%)
( 1 )  2531 (100%)  1961 (77.5%)  1811 (71.6%)  1775 (70.1%)
1 4 MapReduce
(: 2531 4: 1775 )
Hadoop MapReduce
Hadoop
Hadoop
9-14 Hadoop
No.
mapred.local.dir
dfs.data.dir
9.4.2.2
Map Reduce Map Reduce
88 (S1 17 , S2 4 , S3 16 , S4 43 , S5 8 )
Map CPU 1, 1.5, 2, 3, 4
5 5
PiEstimator 9-15
9-15 PiEstimator
No.  Map      ()   ()
CPU 1    483  487  3.35
CPU 1.5  481  483  1.27
CPU 2    485  497  6.68
CPU 3    500  505  2.93
CPU 4    481  511  17.65
Reduce TeraSort
8 (S1 8 ), CPU4
40GB TeraSort
Map : 4(CPU 1) , 6 (CPU 1.5)
Reduce : 2(CPU 0.5) 6(CPU 1.5)
TeraSort 9-20
[Chart: series Map4 and Map6; horizontal axis Reduce; vertical axis (), 0 to 900]
9-20 TeraSort
Map Reduce 35
(CPU
9-21
No
Ma
Reduce 3
Reduce 4
Reduce5
9-21 TeraSort
CPU (Map Reduce
4 )
Idle CPUCPU
CPU
100%Reduce
+1
9-16
9-16
No.
mapred.tasktracker.map.tasks.maximum
CPU 1
1.5
mapred.tasktracker.reduce.tasks.maximum
CPU
CPU + 1
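The sizing rule in table 9-16 can be written as a small calculation. The sketch reads the table as "Map slots: CPU cores x 1 up to x 1.5" and "Reduce slots: CPU cores up to CPU cores + 1" and returns the upper ends of those ranges; rounding down for an odd number of cores, and the class and method names, are assumptions of this sketch.

```java
// Hypothetical helper for choosing TaskTracker slot counts from the core count.
final class SlotSizingSketch {
    // Upper end of the range for mapred.tasktracker.map.tasks.maximum:
    // 1.5 x CPU cores, rounded down.
    static int maxMapSlots(int cpuCores) {
        requirePositive(cpuCores);
        return cpuCores * 3 / 2;
    }

    // Upper end of the range for mapred.tasktracker.reduce.tasks.maximum:
    // CPU cores + 1.
    static int maxReduceSlots(int cpuCores) {
        requirePositive(cpuCores);
        return cpuCores + 1;
    }

    private static void requirePositive(int cpuCores) {
        if (cpuCores <= 0) {
            throw new IllegalArgumentException("cpuCores must be positive");
        }
    }

    public static void main(String[] args) {
        int cores = 4; // the 4-core case used in the TeraSort measurement
        System.out.println("map slots: " + maxMapSlots(cores));
        System.out.println("reduce slots: " + maxReduceSlots(cores));
    }
}
```

The values are starting points for measurement on the actual hardware, not fixed optima; memory per JavaVM also limits how many slots a node can run.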
Map Reduce
(1) Hadoop(DataNodeTaskTracker) OS 1GB
(4) (3)Reduce
Map
Map Reduce 9-17
1 JavaVM 200MB
(3)1 JavaVM
450MB
9-17
No.
S1
S2
S3
S4
S5
Map
Reduce
JavaVM MapReduce
MapReduce
9.5 Hadoop
MapReduce Hadoop
Map 9-18
Map
Map
Map
M
M
M Map
CPU
Map
9-22 Map
9-18 Map
No.
Hadoop
Map
MapReduce
(0.19 mapred.map.tasks
)
mapred.child.java.opts
1 Map
( 200MB)
Map Partition
io.sort.mb
Key
( 100)
4
io.sort.record.percent
Map
( 0.05)
1 Map Map
2 Hadoop Map
Map 100MB
R Reduce
Reduce Reduce
Reduce
CPU
9-23 Reduce
9-19 Reduce
No.
Hadoop
Reduce
mapred.reduce.tasks
Reduce (
1)
mapred.child.java.opts
1 Reduce
(
200MB)
3
4
5
Map
mapred.job.shuffle.merge.percent
Map
( 0.66)
mapred.job.shuffle.input.buffer.percent
Shuffle
( 0.70)
mapred.reduce.parallel.copies
1 Shuffle Map
( 5)
9.5.3 MapReduce
9.5.2
9.5.2 Map
Reduce Map Reduce
9.5.3.1 Map
Map CPU
MapReduce Map
CPU PiEstimator Map
TeraSort
GB Hadoop
Map
(1) Map 1 (
) Map
Map
(2) 1 Map JavaVM 5 1
Map
(1)
(3) MapReduce Map
30
(1)(3) Map
CPU Map
CPU Map PiEstimator 1Map
1Map ()
1 4 : 2.5 PiEstimator
2 : 5.0 PiEstimator
3 : 7.5 PiEstimator
13 : 88 (S1 17 , S2 4 , S3 16 , S4 43 , S5 8 ),
250
4 : 44 (S1 9 , S2 2 , S3 8 , S4 21 , S5 4 ),
126
1Map PiEstimator 9-24
[Chart: horizontal axis Map/Map, 0 to 80; vertical axis (), 0 to 1600]
[Chart: horizontal axis Map/Map, 0 to 80; vertical axis Map(), 0 to 180000]
Map Map
30
Map
Map TeraSort Map
1Map () Map
Reduce
1 : 8 (S4 : CPU2 ) , 16, 40GB
2 : 8 (S4 : CPU2 ) , 16, 80GB
3 : 16 (S4 : CPU2 ) , 32, 40GB
1Map JavaVM 200MB( 100MB )
9-20 Map
(MB)   1 Map ()     2 Map ()     3 Map ()
500    80 (5)       160 (10)     80 (2.5)
250    160 (10)     304 (19)     160 (5)
130    304 (19)     608 (38)     304 (9.5)
100    400 (25)     800 (50)     400 (12.5)
83.3   480 (30)     960 (60)     480 (15)
65.8   608 (38)     1200 (75)    608 (19)
33.3   1200 (75)    2400 (150)   1200 (37.5)
21.9   1824 (114)   3648 (228)   1824 (57)
9-26
Map
[Chart: series 1-, 2-, 3-; horizontal axis Map/Map, 0 to 160; vertical axis (), 0 to 4000]
Map
1( 9-27 ) 9-28
3
CPU
(1) Map
(2) Map
(3) Map
Map
(1) (2)
(3)
[Chart: series 1-; horizontal axis Map/Map, 0 to 160; vertical axis (), 0 to 4000]
9-27 TeraSort ( 1)
(1) CPU
(2)
(3)
9-28 TeraSort
2
(1) Map CPU WAIT CPU
IO
(2) Map
MapReduce Shuffle Shuffle
(1)(2)Hadoop 1
(1) 1Map ()
9-29 Map
9-30 Map Spill Records
[Chart: horizontal axis Map/Map, 10 to 40; left axis (), 0 to 1800; right axis 1Map (B), 0.0E+00 to 2.5E+09; annotation 250MB]
9-29 Map ( 1)
[Chart: horizontal axis Map/Map, 10 to 40; left axis (), 0 to 1800; right axis 1Map Spill Records, 0.0E+00 to 1.2E+09]
9-30 Map Spill Records ( 1)
Map Spill Records
IO
Record Buffer
io.sort.mb io.sort.record.percent
16 JavaVM (
330KB)Record Buffer Map (Hadoop
io.sort.spill.percent)Spill Data Buffer
Record Buffer
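The relationship between io.sort.mb, io.sort.record.percent and the 16 bytes mentioned above can be checked with a short calculation. The sketch assumes that the Record Buffer is the io.sort.record.percent share of io.sort.mb and that each record occupies 16 bytes of it; reading the resulting limit as a number of records is this sketch's interpretation of the passage, and the class and method names are hypothetical.

```java
// Hypothetical calculation of how many records fit into the Record Buffer.
final class SortBufferSketch {
    // Bytes of Record Buffer used per record (the figure of 16 from the text).
    static final int BYTES_PER_RECORD = 16;

    static long recordCapacity(int ioSortMb, double recordPercent) {
        if (ioSortMb <= 0 || recordPercent <= 0.0 || recordPercent >= 1.0) {
            throw new IllegalArgumentException("unexpected buffer settings");
        }
        long bufferBytes = (long) ioSortMb * 1024L * 1024L;           // whole sort buffer
        long recordBytes = Math.round(bufferBytes * recordPercent);   // share kept for record metadata
        return recordBytes / BYTES_PER_RECORD;
    }

    public static void main(String[] args) {
        // Defaults listed in table 9-18: io.sort.mb = 100, io.sort.record.percent = 0.05
        System.out.println(recordCapacity(100, 0.05));
    }
}
```

With the defaults this gives a few hundred thousand records, so a Map task whose records are small can fill the Record Buffer long before the Data Buffer and spill earlier than the data volume alone suggests.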
(3) Map Reduce
(1)(2) Spill
Map Reduce
1 (Hadoop io.sort.factor)
1
Map
1 Map Data Buffer Map
Data Buffer(io.sort.mb)JavaVM
1 Record Buffer
MapReduce 1
Record Buffer
9-30 Map 2 Spill Records
(3) Spill Records
10 9 Spill
Out (3) Spill Out Record Buffer
1 10 Map
1 100 260MB
9-28(3) Map
Reduce
1Reduce Map
Reduce
1 1Reduce Shuffle 9-31
Shuffle Map
Hadoop Map
Map Shuffle
[Chart: horizontal axis Map/Map, 0 to 160; left axis (), 0 to 4000; right axis 1ReduceShuffle(), 0 to 80]
[Figure 9-32 (chart): series 1-, 2-, 3- and 1-1Map, 2-1Map, 3-1Map; horizontal axis Map/Map, 0 to 160; left axis (), 0 to 4000; right axis 1Map(MB), 0 to 900; annotation 40MB]
Map
IO
1Map 1Reduce JavaVM
(5 1 )
Shuffle
Hadoop Map HDFS
Map
1 Map
HDFS Map
9.5.3.2 Reduce
Reduce Map
TeraSort
Reduce Map
Reduce
Reduce Reduce
Reduce
(1) Shuffle Reduce
(1)(2) Reduce
Reduce
Reduce
Map Reduce Reduce
1 JavaVM 200MB
1: 8 (S4: CPU2 ), 16, 40GB
2: 8 (S4: CPU2 ), 16, 80GB
3: 16 (S4: CPU2 ), 32, 40GB
Reduce 9-21
9-21 Reduce
No.
Reduce
0.5
0.5
0.25
16
0.5
32
64
128
256
16
16
512
32
32
16
1192
74.5
74.5
37.25
9-33
2
Reduce
[Chart: series 1-, 2-, 3-; horizontal axis Reduce/Reduce, 0 to 100; vertical axis (), 0 to 4000]
9-33 Reduce
Reduce TeraSort
Map CPU 3
9-34 9-35
3
Reduce Reduce ( (1) )
Reduce Reduce ( (4) )
Reduce Reduce ( (2),(3) )
[Chart: regions marked (1), (2), (3), (4); horizontal axis Reduce/Reduce, 0 to 80; vertical axis (), 0 to 1600]
9-34 Reduce ( 3)
(1):(2):
(3):(4):
9-35 Reduce ( 3)
CPU
Reduce CPU
(Idle CPU)
Reduce 9-36
1Reduce 9-35
CPU WAIT CPU
Map Spill Records
Reduce Spill Records 9-37
[Chart: series -40GB/32 and 1Reduce-40GB/32; horizontal axis Reduce/Reduce, 0 to 80; left axis (), 0 to 1600; right axis 1Reduce(MB), 0 to 6000]
9-36 Reduce ( 3)
[Chart: series -40GB/32; horizontal axis Reduce/Reduce, 10 to 80; left axis (), 0 to 1600; right axis ReduceSpill Records, 0.0E+00 to 8.0E+08]
Spill Records
Map
Reduce 1 Map
Spill Records CPU WAIT CPU
Reduce
Spill Records
mapred.job.reduce.input.buffer.percent Reduce
Map
0.0 Reduce Map
Spill Records
Reduce
Reduce Reduce Shuffle
Shuffle
3 Reduce
Shuffle 9-38
Shuffle Map
Shuffle
[Chart: horizontal axis Reduce/Reduce, 0 to 80; left axis (), 0 to 1600; right axis Shuffle(), 0 to 30000]
Map Hadoop
Reduce Shuffle
Reduce Reduce
2
Reduce
Reduce 1 Map Shuffle
Map
IO
9.6 MapReduce
MapReduce
MapReduce
9-39
(Map,Reduce)
(Map,Reduce)
(Map,Reduce)
(Map,Reduce)
MapReduce
(Map,Reduce)
(Map)
Reduce
(Map,Reduce)
9-39 MapReduce
(Map)
(Reduce)
()
9.6.1
( 10MB GB )MapReduce
9-39
Map : Map
Reduce : Reduce
Map :
Reduce : Reduce
Map : Map
Reduce : Reduce
Map : Map
Reduce : Reduce
Map Reduce Map
Reduce MapReduce Web
9.6.2 MapReduce
MapReduce
Map : Map
Map : Map
Reduce : Reduce
Map : Map
Reduce : Reduce
Map
Map Map
[]1Map = []Map ([]Map [
]Map )
[]Map = []1Map []Map
[] Map
[][]1Map = ([]Map [
]Map ) ([]Map []Map )
Map []Map 1Map
Reduce
Reduce
[]Reduce = [] ([]
[]Reduce )
Reduce
[]1Reduce = []Reduce ([]Reduce
[]Reduce )
[]1Reduce = []Reduce []Reduce
[]1Reduce = []Reduce [
]Reduce
[]1Reduce = []1Reduce [
]1Reduce []1Reduce
Reduce = []1Reduce []Reduce
[]Reduce
MapReduce
Hadoop MapReduce Reduce
Map
Reduce Map
MapReduce
Map
mapred.reduce.slowstart.completed.maps
0.05
9.6.3 MapReduce
MapReduce
TeraSort
40GB TeraSort
8 (S4 :8 )
Map : 571
Reduce : 1364
Map : 4.0×10^10 Byte
Reduce : 4.0×10^10 Byte
Map : 608
Reduce : 256
Map : 16
Reduce : 16
Map
Map
[]1Map = []Map ÷ ([]Map ÷ []Map)
= 571 ÷ (608 ÷ 16)
≒ 15.03 ()
= 430.9 ÷ 0.98
≒ 439.7 ()
Reduce
Reduce
[]Reduce = [] ([]
[]Reduce )
= 5.0×10^11 × (4.0×10^10 ÷ 4.0×10^10)
= 5.0×10^11 (Byte)
Reduce Reduce
[]1Reduce = []Reduce ([]Reduce
[]Reduce )
= 1364 ÷ (256 ÷ 16)
= 85.25 ()
[]1Reduce = []Reduce []Reduce
= 4.0×10^10 ÷ 256
≒ 1.56×10^8 (Byte)
[]1Reduce = []Reduce ÷ []Reduce
= 5.0×10^11 ÷ 1300
≒ 3.85×10^8 (Byte)
[]1Reduce = []1Reduce × []1Reduce ÷ []1Reduce
= 85.25 × 3.85×10^8 ÷ 1.56×10^8
= 209.85 ()
Reduce = []1Reduce × ([]Reduce ÷ []Reduce)
= 209.85 × (1300 ÷ 260)
= 1049.25 ()
MapReduce
Reduce Map 0.05(5%)
MapReduce Map Reduce Map
+ Reduce
≒ 439.7 × 0.05 + 1049.25
≒ 1071.2 ()
500GB TeraSort
Map : 588 ( 34%)
Reduce : 1325 ( 26%)
MapReduce : 1353 ( 26%)
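The Reduce-side estimate worked through above can be reproduced with a few lines of arithmetic. The sketch uses only numbers that appear in the text (Reduce time 1364 with 256 tasks on 16 slots for 4.0×10^10 Byte; target of 5.0×10^11 Byte with 1300 tasks on 260 slots; Map estimate 439.7; slowstart 0.05). The class and method names are hypothetical, and the linear scaling by input size per task is the model assumed by the estimate, not a guarantee of how a cluster behaves.

```java
// Hypothetical re-computation of the Reduce and whole-job time estimate.
final class ReduceTimeEstimateSketch {
    // Average time of one task: phase time divided by the number of waves (tasks / slots).
    static double secondsPerTask(double phaseSeconds, int tasks, int slots) {
        return phaseSeconds / ((double) tasks / slots);
    }

    // Scale a per-task time linearly with the amount of input per task.
    static double scaleByInput(double secondsPerTask, double targetBytesPerTask,
                               double measuredBytesPerTask) {
        return secondsPerTask * targetBytesPerTask / measuredBytesPerTask;
    }

    // Phase time for the target run: per-task time multiplied by the number of waves.
    static double phaseSeconds(double secondsPerTask, int tasks, int slots) {
        return secondsPerTask * ((double) tasks / slots);
    }

    // Whole job: Reduce starts after the slowstart fraction of the Map phase.
    static double jobSeconds(double mapSeconds, double reduceSeconds, double slowstart) {
        return mapSeconds * slowstart + reduceSeconds;
    }

    public static void main(String[] args) {
        double measuredPerReduce = secondsPerTask(1364, 256, 16);
        double targetPerReduce = scaleByInput(measuredPerReduce, 5.0e11 / 1300, 4.0e10 / 256);
        double reduceSeconds = phaseSeconds(targetPerReduce, 1300, 260);
        double total = jobSeconds(439.7, reduceSeconds, 0.05);
        System.out.printf("per Reduce: %.2f, Reduce phase: %.2f, job: %.1f%n",
                targetPerReduce, reduceSeconds, total);
    }
}
```

Small differences from the figures in the text come from rounding of the intermediate values.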
3
9.7 Hadoop
Hadoop
9.7.1 Hadoop MapReduce
Hadoop MapReduce
Hadoop
Hadoop
9.7.2 Hadoop
MapReduce Hadoop CPU
Hadoop
Hadoop PiEstimator TeraSort
TaskTracker Map Reduce
Map CPU CPU
1.5 Reduce CPU CPU +1
Hadoop
Map Reduce
9.7.3
MapReduce Hadoop
MapReduce
MapReduce
Hadoop Map Reduce
PiEstimator TeraSort
9.7.4 MapReduce
GB MapReduce GB
MapReduce
MapReduce TeraSort
TeraSort
9.6
10 Hadoop
Hadoop Hadoop
Hadoop
()
Hadoop
2
HA
FT
10.1
(1)
HDFS
client
(2)DataNode
namenode
(5)
(4)
(3)
(3)
datanode
(4)
(3)
datanode
10-1 HDFS
10-1
(4)
datanode
10 Hadoop
HDFS
(1) HDFS NameNode
NameNode
(2)
HDFS
client
Job
client
HDFS
(4)Job
(1)JobID
(3)Job
jobtracker
(9)Job
(5)
HDFS
client
(7)JAR
(8)Task
(6)Task
HDFS
client
tasktracker
HDFS
client
tasktracker
10-2 Map/Reduce
10-2
HDFS
client
tasktracker
10 Hadoop
Job
(1) Job JobTracker JobID
JobTracker
JobID
(2) (Job )HDFS Job
JobTracker Job
Job JAR
10-1 NameNode
(3) Job JobTracker Job Job
JobTracker
(4) JobTracker Job Job
Hadoop
10.2
10.1
Hadoop
(10.3)
(10.4)
(10.5)
10.3
Hadoop
10-3 NameNode
LAN
namenode
Hadoop
10-3
10.1 HDFS namenode
LAN DataNode NameNode
namenode LAN
3 1 HDFS
3
JobTracker 3
10.3.1
Hadoop
Hadoop
10.3.2
Hadoop
10.3.3
-
10.4
10.2
3 10-1
10-1
No.
HA
No.
FT
CPU
OS
FT
10.2 FT
HA FT
2
Heartbeat DRBD HA
HA Kemari FT
10.4.1
HA FT
10.4.1.1 Heartbeat
Heartbeat HA Heartbeat
Heartbeat
1 1 N 1
Heartbeat 10-4 /
heartbeat
(NIC)
Heartbeat
LAN
Heartbeat
heartbeat
heartbeat
LAN
10-4 Heartbeat
10.4.1.2 Kemari
Kemari
Kemari
HA
Kemari
Xen 2
LAN
I/O I/O
CPU
Kemari 10-5
Heartbeat
LAN
OS
OS
LAN
10-5 Kemari
10.4.1.3 DRBD
DRBD
DRBD
/
DRBD 10-6
DRBD
/ RAID
LAN
drbd()
drbd()
LAN
10-6 DRBD
10.2
(3)Heartbeat+
LAN
namenode
(1)Heartbeat
Heartbeat
Heartbeat
(2)DRBD
drbd
Hadoop
drbd
Hadoop
Heartbeat/LAN
10-7 HA
10.3
(1) Heartbeat Hadoop
NameNode, JobTracker 10-7
10.4.3 HA Kemari FT
10.4.2 HA Kemari FT
HA
FT
Kemari FT Hadoop
Kemari FT 10-8
(1)
()
namenode
Hadoop
Hadoop
(5)Heartbeat+
OS
OS
LAN
namenode
(2)Kemari
OS
OS
(3)Heartbeat
Heartbeat
Heartbeat
(4)DRBD
drbd
(VM)
Heartbeat//
Kemari LAN
drbd
(VM)
10-8 FT
10.3
(1) Hadoop
(2) Kemari
(4) DRBD
Hadoop
NameNode, JobTracker 10-8
10.5
10.4 HA FT
10.5.1 HA
Heartbeat DRBD HA
10.5.1.1
HA 10.3
10-2
10-2 (HA )
No.
Hadoop Heartbeat
DRBD
2
DRBD
(
)
IP
Heartbeat
Bonding()
() Bonding
DRBD Heartbeat /
DRBD
Heartbeat 10-9
IP
IP
Hadoop
Hadoop
DRBD()
DRBD()
10-9 (HA )
10.5.1.2
HA
Hadoop
Heartbeat Hadoop
(RA)
HA LAN
Heartbeat 3
VIPCheck IP
SFEX
STONITH
STONITH
STONITH
OS HP
iLO2, IBM IMM
10.2
OS ssh STONITH
STONITH
STONITH 10-10
LAN
4.
1.
2.
Heartbeat
heartbeat
5.
heartbeat
LAN
3.
10-10 STONITH
(STONITH )
10.5.1.3
HA
DRBD
10-3 HA DRBD
Protocol C
Hadoop
DRBD
Heartbeat
3 ProtocolC
ProtocolC
ProtocolA:TCP
ProtocolB:
ProtocolC:
10.5.1.4
HA
LAN
Heartbeat LAN
LAN
IP
LAN
LAN 2
bonding
Heartbeat
Heartbeat
10.5.1.5
HA
10.5.1.6
10-4
10-11
(3)Heartbeat+
LAN
4
3
namenode
(1)Heartbeat
heartbeat
heartbeat
(2)DRBD
drbd
drbd
5
Hadoop
Heartbeat/LAN
Hadoop
10-11 (HA )
10-4 (HA )
No.
LAN
LAN
Heartbeat LAN
10.5.1.7
HA
Hadoop TeraSort
10-5
10-5 (HA )
No.
LAN
LAN
Heartbeat LAN
10.5.2 FT
HA Kemari FT
10.5.2.1
FT 10.3
10-6
10-6 ( FT )
No.
Kemari
Heartbeat
Kemari
DRBD
Heartbeat
Bonding
10.5.2.2
FT
Kemari
Kemari FT Kemari
( Kemari RA)Heartbeat Kemari RA
(1) Kemari RA
(2) Heartbeat RA Kemari
(3) Kemari
pause
(4) Kemari
(5) NameNode/JobTracker
NameNode
OS (Dom-U)
Xen
Kemari
xc_kemari_save
Kemari
xc_kemari_restore
Kemari FT 10-12
NameNode
OS (Dom-U)
Xen
Kemari RA
Heartbeat
OS (Dom-0)
DRBD
Heartbeat
DRBD
OS (Dom-0)
(VM)
(VM)
10-12 FT
(1) Heartbeat
(2) Kemari RA
(3) Gratuitous ARP MAC
(4) Kemari
(5) Gratuitous ARP MAC
(6)
Kemari FT 10-13
OS (Dom-U)
Xen
Kemari
xc_kemari_save
NameNode
Kemari
xc_kemari_restore
NameNode
OS (Dom-U)
Xen
Kemari RA
Kemari RA
Heartbeat
Heartbeat
OS (Dom-0)
DRBD
DRBD
(VM)
OS (Dom-0)
(VM)
10-13 FT
IP
ARP Heartbeat
Gratuitous ARP
10.5.2.3
FT
DRBD
Kemari Xen Xen
()/
()
10.2
DRBD DRBD
10-7
10-7 DRBD ( FT )
No.
Protocol C
Heartbeat //
DRBD
DRBD Heartbeat
OS
10.5.2.4
FT
LAN
Heartbeat LAN
LAN
LAN
Kemari FT 3
Heartbeat:
DRBD:
Kemari:
Hadoop NameNode/JobTracker I/O I/O
LAN 10-14
10-14
LAN
LAN LAN
LAN 10G NIC
1G NIC
LAN Bonding
3 LAN
Kemari RA
LAN 3
10.5.2.5
HA
Xen
Kemari Xen Xen Kemari
FT Xen
(1)
Kemari FT
CPU (Intel-VT, AMD-V)
PV PV
I/O
(2)
NameNode/JobTracker CPU
I/O
Kemari CPU
CPU 1
NameNode HDFS HDFS
JobTracker MapReduce
NameNode/JobTracker 2.5GB
8G
10.5.2.6
10-8
10-15
(5)Heartbeat+
LAN
4
3
namenode
(2)Kemari
OS
OS
(3)Heartbeat
heartbeat
heartbeat
(4)DRBD
drbd
drbd
(VM)
Heartbeat//
Kemari LAN
(VM)
10-15 ( FT )
10-8 ( FT )
No.
LAN
LAN
Heartbeat LAN
10.5.2.7 Kemari
FT
Hadoop TeraSort
10-9
10-9 ( FT )
No.
LAN
LAN
Heartbeat LAN
10.6
HA Kemari FT
10.6.1 HA
Heartbeat NameNode JobTracker /
JobTracker
NameNode HDFS DataNode
Safemode Safemode
NameNode
10.6.2 FT
Kemari
Hadoop
FT
10-10 Kemari
Kemari 10-10
Netperf: LAN
NNBench: NameNode
TeraSort: Hadoop
Kemari
10-10 Kemari
No.
1
Kemari
LAN
Kemari
Kemari
Netperf
750 (Mbps)
20 (Mbps)
NNBench
203(s)
2393(s)
17.5
16
NameNode
[Map :4
:1[byte]
:1[byte]
:5000]
3
Hadoop
TeraSort
[:10G]
Kemari FT I/O
LAN
10-10 1/40
NNBench
10 Kemari
64MB (Hadoop ) TeraSort
Hadoop MB
Hadoop
10.6.3
FT OS
Hadoop
Hadoop
FT
Hadoop
FT CPU
Kemari FT
11 Hadoop
Hadoop Hadoop
Hadoop
12
13
11.1
Hadoop
Hadoop
Hadoop
11.1.1 Hadoop
Hadoop
11-1
11-1 Hadoop
Hadoop
Hadoop Hadoop
Hadoop NameNode, JobTracker Hadoop
Hadoop
Hadoop
11-2
11-2 Hadoop
Hadoop 11-1
11-1 Hadoop
No.
Hadoop
Hadoop
Hadoop
11.1.2 Hadoop
Hadoop Hadoop
Hadoop
Hadoop
Hadoop
11.3 11.4
11.5
11.1.3
11-2
11-2
No.
OS
Hadoop
Hadoop
12
Hadoop
Hadoop HDD NIC
Hadoop
Hadoop
Hadoop
Hadoop
Hadoop
Hadoop
Hadoop
Hadoop
Hadoop
Hadoop
11.2
11.2.1
Hadoop 11-3
11-3
No.
24H/365D.
9:00-17:00
11.2.2 Hadoop
11-3 Hadoop
[Figure 11-3 Hadoop: diagram not reproduced. Labels: Job, L3 switches, L2 switches, NameNode, JobTracker, Hadoop (DataNode/TaskTracker), Hadoop 100, Core2 Duo 40, Core2 Duo 10]
11.3
Hadoop
Hadoop
11.3.1
Hadoop Hadoop
Hadoop
11.3.2
OS
11-4
11-4
No.
L3
Hadoop Hadoop
L2
Hadoop
Hadoop
NameNode, JobTracker
Hadoop
DataNode, TaskTracker
Hadoop
Hadoop Hadoop
Hadoop
Hadoop
11.3.3
Hadoop
Hadoop Hadoop
Hadoop
Hadoop
11-5
11-5
No.
Hadoop
Hadoop
Hadoop
Hadoop
Hinemos, Nagios
SNMP SNMP-TRAP
Hadoop
Hadoop Hadoop
20 60
Hadoop 120
1 40
Hadoop
Hadoop
11-6
11-6
No.
Hadoop
MapReduce
3
4
HDFS
DataNode
HDFS
TaskTracker
11.3.4
Hadoop Hadoop
CPU
Hadoop
/
11-7
11-7
No.
Hadoop
10 100 Hadoop
1000
Hadoop
11-8
11-8
No.
Hadoop
Hadoop
Hadoop
Hadoop
HDFS
HDFS HDFS
Hadoop
10
11
HDFS
HDFS
11.3.5
Hadoop
Hadoop
Hadoop
Hadoop
Hadoop
Hadoop
11-9
11-9
No.
L3
2
3
L2
4
5
Hadoop
OS
NIC
Hadoop
OS
NIC
OS
NIC
10
11-10
11-10
No.
11.4
11.4.1
10
11-11
11-11 Hadoop
No.
Hadoop
Hadoop
Hadoop
10
Hadoop
11-12
11-12
No.
( 11-11 No)
Hadoop
1,9,10
Hadoop
Hadoop
Hadoop
1,4,5,6
Hadoop
7,8,
11.4.2 Hadoop
Hadoop
11.4.3 Hadoop
Hadoop Hadoop Hadoop
Hadoop
Ganglia Hadoop
12
11.4.4 Hadoop
Hadoop Hadoop Hadoop
Hadoop
Hadoop
Hadoop
11-13
11-13 Hadoop
No.
Hadoop
Hadoop
Ganglia
Ganglia 12
Ganglia Ganglia
gmond
gmond
Ganglia
Ganglia 11-13
Ganglia
gmond (XML)
11-4 11-14
[Figure 11-4 Ganglia: diagram not reproduced. Labels: one gmond per node]
11-14 Ganglia
No.
Ganglia metric
HDD
metric gmond
Ganglia
3
4
gmond
gmond
No.
Hadoop Ganglia
Hadoop
11-5
11-5 Ganglia
r7-1-0-01
7
11.4.5 Hadoop
Hadoop
Hadoop
OS Hadoop 30
Hadoop
10 100 Hadoop
100
Puppet
Puppet
11-6 Puppet
[Figure 11-6 Puppet: diagram not reproduced. Labels: Hadoop, OS, (puppetrun), Puppet, Ganglia, Hadoop NameNode, Hadoop DataNode, CPU]
Puppet 11-5
11-15 Puppet
No.
Puppet
push
md5
Puppet Puppet
Hadoop
MapReduce HDFS ...
Hadoop Puppet
facter, puppetrun
Puppet
15.4.9
11.4.6 Hadoop
Hadoop
Puppet Puppet
Hadoop
11-16
11-16
No.
Puppet
Puppet
Hadoop
Puppet
11-7
11.5
Hadoop
11.3 Hadoop 11.4
11.5.1
11-17
No6
11-17
No.
L3
L2
Hadoop
Hadoop
11.4 Hadoop
11.5.2
11-18
11-18
No
1
SNMP
2
3
Hadoop
Hadoop
14.4 Hadoop
11.4 Hadoop
JobTracker
No
NameNode
11-8 Hadoop
11-9 r7-1-0-01
7
11-9 Hadoop
11.5.3
11-19
11-19
No.
3
4
11.4 Hadoop
No.
5
11.4 Hadoop
11.4 Hadoop
11-20
11-20
No.
Hadoop
Hadoop
Hadoop
Hadoop
HDFS
NameNode HDFS
OS
OS
11.4
Hadoop
10
11
11.4 Hadoop
HDFS
SecondaryNameNode JobTracker
11.5.4
11-21 11-22
11-21
No.
1
L3
OS
2
3
L2
4
5
Hadoop
Hadoop
OS
OS
10
11-22
No.
11.6
11.6.1
Hadoop
Hadoop
Hadoop
Hadoop
Hadoop
Hadoop
11.6.2
Hadoop
Hadoop Hadoop
Hadoop
Hadoop
11-23
12 Hadoop
12 Hadoop
Hadoop
Hadoop
RedHat Enterprise
Linux Kickstart
Puppet
12.1
Hadoop Hadoop
12.1.1 Hadoop
Hadoop Hadoop NameNode, JobTracker
Hadoop DataNode, TaskTracker 12-1
Hadoop
12-1 100 Hadoop
No.
Hadoop
NameNode , JobTracker
Hadoop
DataNode, TaskTracker
96
Hadoop
IP
Hadoop
Hadoop
Hadoop 96
Hadoop
1000 Hadoop
80
Hadoop 1000
Hadoop L2
Hadoop Hadoop
Hadoop
12.1.2 Hadoop
Hadoop
Hadoop
OS
Hadoop IA
Hadoop
Hadoop
Hadoop
12-2
12-2 Hadoop
No.
Hadoop
12.2
12.2.1
12-1
[Figure 12-1: diagram not reproduced. Labels: Job, L3 switches, L2 switches, NameNode, JobTracker, Hadoop (DataNode/TaskTracker), Hadoop 100, Core2 Duo 40, Core2 Duo 10]
L3
L3 DHCP
12.2.2
Hadoop 12.1
Hadoop
Hadoop
Hadoop
12.2.3
OS
12.3
12.3.1 Hadoop
Hadoop Hadoop
12-2
Hadoop
Hadoop
Hadoop
12-3
12-3
No.
No.
12.3.2
HPC
HPC
Hadoop HPC
Hadoop
Kickstart
Kickstart+ Puppet
Kickstart
Puppet
rocks
Kickstart
OSCAR
Kickstart GUI
Kickstart
12-4
12-4
No.
Kickstart+Puppet
rocks
OSCAR
OS
GUI
GUI
GUI
Roll
Kickstart
rocks OSCAR
Rocks, OSCAR
Kickstart Puppet
12.4
Kickstart puppet
KickStart
DNS DHCP
TFTP
HTTP
Puppet
12.4.1 Kickstart
Kickstart
Kickstart
OS
Kickstart
OS
MAC
IP ,
96 MAC
Kickstart
12.4.2 Kickstart
Kickstart 12-5 12-6
DHCP DNS 12.4.8
12-5 Kickstart
No.
ON
PXE
DHCP IP
DHCPDISCOVER
DHCP
TFTP
OS
OS
IP .
HTTP
(Kickstart )
Kickstart
OS
Kickstart
12-6 Kickstart
No.
IP
PXE
DHCP
PXE
TFTP
OS
1
3
OS
OS
HTTP
4
5
OS
Kickstart
OS
HTTP
HTTP
No.
12.4.3 Puppet
Puppet
Ruby Puppet
12-2
[Figure 12-2 Puppet: diagram not reproduced. Labels: Hadoop, OS, (puppetrun), Puppet, Ganglia, Hadoop NameNode, Hadoop DataNode, CPU]
12-7 Kickstart
No
1
1-1
1-2
2
2-1
Hadoop
1-1 Hadoop
Kickstart
IP IP
1-2 IP
IP 192.168.3.40
2-1 Hadoop
CPU
12.4.5
1-1
Kickstart IP MAC
IP
NIC MAC
MAC Kickstart IP
DNS
MAC
NIC MAC
NIC
MAC IP
DHCP hosts
Kickstart IP
IP
IP
DNS
12-8 MAC
DDNS
12-8
No.
1
2
MAC
DDNS
hosts
DNS A
DHCP
BIND A
hosts
3
MAC
No.
5
MAC
DDNS
MAC
DNS
12.4.6
1-2
IP Hadoop
12-9
No.
1
rack1-13u.example.net
rack1switch-port13.example.net
Hadoop
12-10
12-10
No.
MAC
Hadoop L2 mac-address-table(MAC
)
MAC
IP DNS
3
1/0/12
DNS IP A
# /root/scripts/myhostname
r3-1-0-12.example.net
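The example above pairs switch port 1/0/12 on rack 3 with the host name r3-1-0-12.example.net. A minimal sketch of that mapping follows. It assumes the name is built as r<rack>-<stack>-<module>-<port> with a two-digit port number; the function name and the exact rule are illustrative and are not taken from the /root/scripts/myhostname script itself.

```python
import re

DOMAIN = "example.net"  # domain used in the document's examples


def hostname_from_port(rack: int, port: str, domain: str = DOMAIN) -> str:
    """Build a host name such as r3-1-0-12.example.net from rack 3 and port 1/0/12.

    The r<rack>-<stack>-<module>-<port> layout is an assumption inferred
    from the examples in this chapter.
    """
    match = re.fullmatch(r"(\d+)/(\d+)/(\d+)", port)
    if match is None:
        raise ValueError(f"unexpected port format: {port!r}")
    stack, module, number = (int(part) for part in match.groups())
    return f"r{rack}-{stack}-{module}-{number:02d}.{domain}"


print(hostname_from_port(3, "1/0/12"))
```

Ports on a non-stacked switch (written with two fields, such as 0/12) would need their own rule; the sketch rejects them rather than guessing.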
12-11
12-11
No.
12.4.7
12-7 2-1
12-12
No.
CPU Hadoop
MapReduce
12-13 OS Kickstart
12-14
Puppet facter
12-13 OS
No.
OS
(%pre )
OS
(%include
)
12-14
No.
(CPU //)
Puppet facter
Hadoop
12.4.8
12-7 1-11-22-1
Kickstart DNS
DHCP TFTP HTTP Puppet
DNS
NW
example.net.NW
DHCP DNS
DNS
Hadoop
DHCP DynamicDNS
DHCP
DHCP
TFTP DHCP
dhcpd DHCP
DHCP Hadoop
L3 DHCP
DHCP IP
Hadoop IP
IP DHCP
DNS DHCP
IP IP
DHCP DNS A
TFTP
tftpd TFTP
RedHat Enterprise Linux syslinux
TFTP
HTTP
HTTP Apache
HTTP
OS
Kickstart
Kickstart Kickstart
Linux
Kickstart %pre
%post puppet
Puppet
Puppet
Ruby Puppet
Hadoop
Hadoop
12-15
No.
DNS
BIND
OSS
DHCP
dhcpd
OSS
TFTP
tftpd
OSS
HTTP
Apache
OSS
Puppet
Puppet
OSS
12-3
12-3 Hadoop
ON
IP1
OS
DHCP
TFTP
2(OS)
HTTP
DHCP
Puppet
DNS
Puppet Kickstart
Kickstart
12-16 Kickstart
No.
1
2
3
OS
OS
Hadoop Puppet OS
No.
DNS
Puppet
Puppet
Puppet
Puppet
6
Puppet
12.4.10 Puppet
Puppet
12-7 2-1
Puppet
Puppet
12-17 Puppet
No.
common
common
OS
NTP cron
OS
facter
hadoop
hadoop
Hadoop
namenode/jobtracker
ganglia
gmond
gmond
No.
gmetad
gmetad
web
ganglia
Puppet
Puppet
manifest
facter
facter
12-18 facter
facter
No.
racknum
diskcount
mygmetad metad
disklist
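Table 12-18 lists a racknum fact, and elsewhere the host r7-1-0-01 is associated with rack 7. The sketch below shows one way such a value could be derived from the host name. Real Facter custom facts are written in Ruby, and the surviving text does not say how racknum is actually computed, so both the parsing rule and the function name here are assumptions.

```python
import re
from typing import Optional


def racknum_from_hostname(hostname: str) -> Optional[int]:
    """Return the rack number encoded in names like r7-1-0-01, or None.

    Assumes the r<rack>-<stack>-<module>-<port> naming pattern seen in
    the document's host lists; other hosts (for example mg1) yield None.
    """
    match = re.match(r"r(\d+)-\d+-\d+-\d+", hostname)
    return int(match.group(1)) if match else None


print(racknum_from_hostname("r7-1-0-01.example.net"))
print(racknum_from_hostname("mg1.example.net"))
```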
12.5
12.5.1 Hadoop
100 Hadoop 50
12-19
12-19
No.
1
46
0.75
1 4
SAS
(72300GBx2)
50
1.75
1 5
SATA
(250GBx2)
12-19 No2
12-20 50
12-20
Host        Start   End     Elapsed
r6-1-0-01   11:33   12:42   1:09
r6-1-0-02   11:30   12:42   1:12
r6-1-0-03   11:33   12:42   1:09
r6-1-0-04   11:30   12:42   1:12
r6-1-0-05   11:33   12:42   1:09
r6-1-0-06   11:30   12:42   1:12
r6-1-0-07   11:33   12:42   1:09
r6-1-0-08   11:30   12:42   1:12
r6-1-0-09   11:33   12:42   1:09
r6-1-0-10   11:30   12:42   1:12
r7-1-0-01   11:57   13:03   1:06
r7-1-0-02   11:54   13:03   1:09
r7-1-0-03   11:57   13:03   1:06
r7-1-0-04   11:54   13:02   1:08
r7-1-0-05   11:57   13:03   1:06
r7-1-0-06   11:54   12:57   1:03
r7-1-0-07   11:57   12:57   1:00
r7-1-0-08   11:54   12:57   1:03
r7-1-0-09   11:57   12:57   1:00
r7-1-0-10   11:54   12:56   1:02
r7-2-0-01   11:45   13:06   1:21
r7-2-0-02   11:42   13:05   1:23
r7-2-0-03   11:45   13:06   1:21
r7-2-0-04   11:42   13:05   1:23
r7-2-0-05   11:45   13:05   1:20
r7-2-0-06   11:42   13:00   1:18
r7-2-0-07   11:45   13:00   1:15
r7-2-0-08   11:42   13:00   1:18
r7-2-0-09   11:45   13:00   1:15
r7-2-0-10   11:42   13:00   1:18
r7-2-0-11   11:39   12:48   1:09
r7-2-0-12   11:36   12:46   1:10
r7-2-0-13   11:39   12:48   1:09
r7-2-0-14   11:36   12:45   1:09
r7-2-0-15   11:39   12:48   1:09
r7-2-0-16   11:36   12:46   1:10
r7-2-0-17   11:39   12:49   1:10
r7-2-0-18   11:36   12:45   1:09
r7-2-0-19   11:39   12:49   1:10
r7-2-0-20   11:36   12:46   1:10
r7-1-0-11   11:51   12:54   1:03
r7-1-0-12   11:48   12:50   1:02
r7-1-0-13   11:51   12:54   1:03
r7-1-0-14   11:48   12:51   1:03
r7-1-0-15   11:51   12:55   1:04
r7-1-0-16   11:48   12:52   1:04
r7-1-0-17   11:51   12:55   1:04
r7-1-0-18   11:48   12:51   1:03
r7-1-0-19   11:51   12:55   1:04
r7-1-0-20   11:48   12:52   1:04
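The last value recorded for each node in Table 12-20 is the difference between its two clock times, for example 11:33 to 12:42 gives 1:09. The sketch below recomputes that difference for three nodes as a consistency check. It assumes the two times are the installation start and end on the same day and that the values pair with the host names in the order listed; the helper name is illustrative.

```python
from datetime import datetime


def elapsed_minutes(start: str, end: str) -> int:
    """Return the minutes between two same-day HH:MM clock times."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


# (start, end) pairs copied from Table 12-20.
samples = {
    "r6-1-0-01": ("11:33", "12:42"),
    "r7-2-0-02": ("11:42", "13:05"),
    "r7-1-0-07": ("11:57", "12:57"),
}

for host, (start, end) in samples.items():
    minutes = elapsed_minutes(start, end)
    print(f"{host}: {minutes // 60}:{minutes % 60:02d}")
```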
[Figure 12-4: chart for the 50 nodes, not reproduced. Vertical axis 0 to 16; horizontal axis 0:50 to 1:40]
12.5.2
Hadoop
Hadoop
12-21 12-2
12-21 Hadoop
No.
Hadoop
12-21 No.2
Kickstart Puppet
Hadoop
Hadoop
Hadoop Hadoop
100
1000
10
12-22
13 Hadoop
13 Hadoop
Hadoop
Hadoop
Ganglia JobTracker
WebUI
100 Hadoop
13.1
Hadoop
13.1.1
Hadoop
Hadoop
Hadoop
13.1.2
Hadoop 3
(1)
(2)(3)
(1)~(3)
(1)
Hadoop
Hadoop
(2)
Hadoop
(3)
1 1
Hadoop 1 1
13.2
13.2.1
[Figure 13-1: diagram not reproduced. Labels: NameNode, JobTracker, Hadoop (DataNode/TaskTracker)]
13-1
Hadoop
13.3
13.1
13.3.1 1Hadoop
Hadoop
13-1
13-1
No.
1
2
3
Hadoop
Hadoop
13-1
13.3.1.1 Hadoop
Hadoop MapReduce MapReduce
3 MapReduce
MapReduce MapReduce
MapReduce
MapReduce Map Reduce
4
MapReduce Map Reduce
MapReduce Map Reduce Map
Reduce MapReduce Map
MapReduce Reduce
Hadoop MapReduce Hadoop
speculative execution MapReduce
MapReduce
Hadoop
13-2
13-2 Hadoop
No.
MapReduce
MapReduce
MapReduce
MapReduce Map
MapReduce Reduce
MapReduce Map
7
8
MapReduce
MapReduce
Reduce
MapReduce Map
JobTracker
MapReduce
Reduce
10
MapReduce Map
11
MapReduce
Reduce
13.3.1.2
Hadoop
CPU
CPU CPU
system, user, iowait
system
user iowait CPU I/O
I/O
swap-in swap-out
Used, Cached, Buffered, Swapped
I/O
NIC
NIC
13-3
13-3
No.
CPU
NameNode
CPU system, user
JobTracker
iowait
Hadoop
MapReduce
Client
NameNode
Used
CachedBufferedSwapped JobTracker
swap-in
Hadoop
swap-out
MapReduce
Client
No.
NameNode
JobTracker
Hadoop
MapReduce
Client
NameNode
bytes received
JobTracker
Hadoop
bytes sent
MapReduce
Client
Hadoop
NameNode, JobTracker JVM
FullGC
HDFS HDFS
HDFS
Hadoop
13-1
13-2
:
()
13-2
Hadoop 13-4
13-4 Hadoop
No.
JVM
Heap New
NameNode
Heap Old
JobTracker
Heap Permanent
FullGC
2
HDFS
HDFS
NameNode
UnderReplicatedBlocks ( NameNode
)
MissingBlocks
HDFS
CorruptBlocks (
NameNode
NameNode
13.3.1.3 Hadoop
Hadoop Hadoop
13-5 Hadoop
No.
Hadoop
Hadoop
Hadoop
13.3.1.4
MapReduce MapReduce
13-2
13-3~ 13-4
13.3.1.5 1 Hadoop
Hadoop 13-2
13-5
13.3.2 2
13.3.2.1
Hadoop
13-3
13-4
1 1
13-5
13.3.2.2 2
13.3.3 3
Hadoop
Hadoop
13.3.3.1 Hadoop
Hadoop Hadoop
Hadoop 1
Hadoop
Hadoop
Hadoop 1
Hadoop
Hadoop Hadoop
Hadoop
13-6
13-6
13-6
Hadoop 1 13.2
Hadoop MapReduce
MapReduce Hadoop
MapReduce
Hadoop Hadoop
MapReduce 2
13.3.3.2 Hadoop
Hadoop JobTracker, NameNode
Domain- 14
12.1.1
1
13.3.3.3 3
13-6
13-6
No.
Hadoop
Hadoop MapReduce
13.3.4
13.3.1 13.3.3
13.3.4.1
1Hadoop 13-2
13-5
2
3
Hadoop
MapReduce Hadoop
2
13-7
13-7
No.
13-2 13-5
Hadoop MapReduce
No.
Hadoop
13-2 13-7 JobTracker
WebUI JobTracker WebUI Hadoop
13-7 JobTracker WebUI
( 13-3
13-4)Hadoop 13-5
13.3.4.2
Ganglia, Munin, Cacti
13-7 4
1 Ganglia, Munin, Cacti
2 Ganglia
Ganglia
1
Ganglia
3 Ganglia, Munin, Cacti
MapReduce
MapReduce
Ganglia, Munin, Cacti
4 Ganglia
Cacti, Munin
13-8
13-8
No.
Ganglia
Munin
Cacti
13-2 13-5
MapReduce
Ganglia
13.3.4.3 Ganglia
Ganglia
[Figure 13-8 Ganglia: diagram not reproduced. Labels: WebFrontend, Client, gmetad, HDD, one gmond per node]
13-8 Ganglia
gmond
gmetad gmond
WebFrontend gmetad
13.3.4.4 Ganglia
Ganglia CPU
1
Ganglia 2
Hadoop Ganglia
gmetric
Hadoop Ganglia
Hadoop Ganglia MapReduce HDFS Hadoop
gmetric
gmetric ganglia-gmond rpm
gmetric
gmond gmond
13-9
[Figure 13-9 gmond: diagram not reproduced. Labels: gmetric, cron, gmond]
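The text describes feeding Hadoop-specific values into Ganglia with gmetric, run periodically from cron. The sketch below only assembles a gmetric command line and prints it, so it can be read without a Ganglia installation. The metric name and value are hypothetical; the --name, --value, --type and --units options are standard gmetric options, but how the original scripts called gmetric is not shown in the source.

```python
import shlex
from typing import List, Union


def build_gmetric_command(
    name: str,
    value: Union[int, float],
    metric_type: str = "float",
    units: str = "",
) -> List[str]:
    """Return the argument list for one gmetric invocation (not executed here)."""
    command = ["gmetric", "--name", name, "--value", str(value), "--type", metric_type]
    if units:
        command += ["--units", units]
    return command


# Hypothetical metric: the number of under-replicated HDFS blocks.
cmd = build_gmetric_command("hdfs_under_replicated_blocks", 0, "uint32", "blocks")
print(shlex.join(cmd))
```

A cron job could pass such a list to subprocess.run on each node so that the local gmond picks up the value.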
13.3.4.5 Ganglia
13.3.4.4 1
Ganglia
php conf.php $optional_graphs
php conf.php
13-10
13-10
host_extra.tpl
host_extra.tpl
13-9
13-9
No.
Heap New
Heap
No.
Heap Old
Heap Permanent
4
5
swap-inout
swap-in
swap-out
13.3.5
1 3 Ganglia
2
13.3.5.1 1 3 Ganglia
Hadoop
13.3.1
13-10 Hadoop
13-10
No.
MapReduce
MapReduce
JobTracker WebUI
MapReduce
JobTracker WebUI
MapReduce
No.
MapReduce
MapReduce
JobTracker WebUI
MapReduce
JobTracker WebUI
MapReduce
JobTracker WebUI
MapReduce
JobTracker WebUI
MapReduce
JobTracker WebUI
MapReduce
JobTracker WebUI
MapReduce
JobTracker WebUI
MapReduce
JobTracker WebUI
MapReduce
JobTracker WebUI
Ganglia
Ganglia
Ganglia
Ganglia
Ganglia
Ganglia
Ganglia
Ganglia
MapReduce
Map
MapReduce
Reduce
MapReduce
Map
MapReduce
Reduce
MapReduce
Map
MapReduce
Reduce
10
MapReduce
Map
11
MapReduce
Reduce
12
13
CPU
System, User, iowait
14
15
16
17
Used, Cached,
Buffered, Swapped
18
swap-in
19
swap-out
No.
Ganglia
Ganglia
Ganglia
20
21
bytes received
22
bytes sent
23
Heap New
Ganglia
24
Heap Old
Ganglia
25
Heap Permanent
Ganglia
26
FullGC
Ganglia
27
Ganglia
Ganglia
28
29
HDFS
Ganglia
30
UnderReplicatedBlocks
Ganglia
31
MissingBlocks
Ganglia
32
CorruptBlocks
Ganglia
33
JobTracker WebUI
( 13-3~ 13-4)Hadoop
13-5
Hadoop MapReduce
13-11
13.3.3 Hadoop
13-12
Hadoop
Hadoop
MapReduce
Ganglia
13-9 Ganglia 13-10
1.
2.
3.Heap
4.swap-inout
5.
6.
13-13 Ganglia
JobTracker WebUI Ganglia
Hadoop
1
2
3
Hadoop
13.3.5.2
Hadoop Hadoop
Hadoop
I/O
[Figure 13-14 CPU: chart not reproduced. Axis ticks: 10, 26, 48, 66, 88, 96, 105]
CPU
Hadoop 200
13-15 13-12
[Figure 13-15 CPU: chart not reproduced. Series: WAIT CPU, System CPU, Nice CPU, User CPU, Idle. Vertical axis 0 to 100; horizontal axis ticks 0, 10, 26, 48, 66, 88, 96, 105]
13-12 CPU
No.
WAIT
System
Nice
User
CPU
CPU
CPU
CPU
Idle
WAIT CPU
1.160761 1.041576
10
26
48
11.54683
66
88
22.68472 3.345284
96
105
26.91957 3.889626
0.000054
2.636881 0.000990
0.001136
0.038984
4.151413 93.64630
5.971237
6.439261
6.484171
79.84410 0.240559
67.52965 0.257781
62.66791 0.256377
WAIT CPU
0.25% WAIT CPU WAIT CPU
10% CPU 90 / 0.25 = 360 CPU
Idle
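The estimate above, 90 / 0.25 = 360, can be written as a small calculation. The sketch reads the surviving text as follows: WAIT CPU rises by about 0.25% for each additional monitored node, and 10% of the CPU is kept as Idle headroom, leaving 90% to be consumed. Both that reading and the function and parameter names are assumptions.

```python
def max_monitored_nodes(wait_cpu_per_node_pct: float, idle_reserve_pct: float) -> int:
    """Estimate how many nodes fit before the CPU budget is used up.

    wait_cpu_per_node_pct: WAIT CPU added per monitored node, in percent.
    idle_reserve_pct: share of CPU deliberately left idle, in percent.
    """
    usable_pct = 100.0 - idle_reserve_pct
    return int(usable_pct / wait_cpu_per_node_pct)


# 0.25% per node with 10% kept idle: 90 / 0.25
print(max_monitored_nodes(0.25, 10.0))
```

The per-node figure is consistent with the ratios near 0.25 that survive in Table 13-12 (0.240559, 0.257781, 0.256377).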
13.4
13.4.1
Hadoop
Hadoop
Hadoop
Ganglia
Ganglia
Hadoop Ganglia
Hadoop
Ganglia
100 Hadoop
13.4.2
3
13.4.2.1
13-10 CPU
MapReduce
Ganglia MapReduce
MapReduce
13.4.2.2 I/O
13.3.5
gmetad I/O
Hadoop Hadoop
Hadoop
gmetad I/O
gmetad I/O
SSD
RAM
PC
SSD Hadoop
I/O
RAM gmetad
I/O RAM RAM
13.4.2.3
Ganglia 1 1 1 1 1 5
2
1
13-29
6
2
I.1
I-1
Job
traffic-report.jar
traffic-report.properties
HDFS
I-1
traffic-report.jar
traffic-report.properties
MapReduce 2
I-1
traffic-report.properties
I-1
I-1
No.
InputSplit
InputSplit
Reduce
Reduce
10
Reduce
HDFS
2
I-2
I.2
I.2.1
I.2.1.1
I-2
Hadoop/
Job
L3
L2
NameNode
Namenode
JobTracker
JobTracker
r2
L2
L2
L2
L2
L2
L2
Hadoop
(DataNode/TaskTracker)
r6
10
I-2
I-3
r5
18
r4
16
r3
12
r7
40
I.2.1.2
Hadoop Hadoop
Hadoop
Hadoop I-2
1 r2
I-2 Hadoop
No.
JobTracker
Kemari FT
JobTracker
Kemari
JobTracker
jt
JobTracker
hjt1
Kemari
jt
DL380G5
QC XE5345
jt
QuadCore/2.33GHz x2
32GB
HDD 146GB x2
JobTracker
hjt2
Kemari
jt
DL380G5
QC XE5345
hjt1 jt
QuadCore/2.33GHz x2
32GB
HDD 146GB x2
NameNode
nn
NameNode
Kemari FT
NameNode
Kemari
NameNode
Kemari
hnn1
nn
DL380G5
QC XE5345
I-4
No.
nn
QuadCore/2.33GHz x2
32GB
HDD 146GB x2
6
NameNode
hnn2
Kemari
jt
DL380G5
QC XE5345
hnn nn
QuadCore/2.33GHz x2
32GB
HDD 146GB x2
JobClient
job
JobClient
DELL R410
Intel(R) Xeon(R) CPU E5506 2.13GHz x 8
8GB
HDD 13GB
Hadoop
CPU
Hadoop CPU
CPU
NameNode HDFS
NameNode Hadoop
JobTracker Job
CPU
Hadoop
I-3 NameNode
No.
250Byte
150Byte
180Byte
I-5
NameNode JobTracker
Hadoop
SAS RAID1
Hadoop
LAN FT
LAN
FT LAN
10Gbps 4
FT 10G NIC
Hadoop
OS
OS
iLO
Hadoop
Hadoop
CPU
I-4
I-4 Hadoop
No.
CPU
HDD
DL380G5
Xeon
8GB
SAS 146GB x 2
r3
XE5345
QuadCore/
I-6
2.33GHz x2
DL360G5
Xeon
XX5460
QuadCore/
6GB
SAS 146GB x 2
r3
2GB
SAS 72GB x 2
16
r4
6GB
SAS300GB x 2
18
r5
2GB
SATA 250GB x 2
10
r6
40
r7
3.16G
DL360G5
Xeon
LV DC X5148
DualCore/
2.33G
DL360G6
Xeon
XE 5504
QuadCore/2
1P4C
Express
Core2 Duo
5800
T9400
iR110a-1
Hadoop
CPU
Hadoop
(2010 1 )Intel
Xeon 5500 2GHz 4
CPU
CPU
CPU
spec.org TPC
TPC
Hadoop (Map Reduce )
Java VM
200MB()CPU 1 Map 1
Reduce 1 1
I-7
JavaVM Hadoop
(TaskTracker, DataNode) OS
CPU1 1GB
Hadoop Shuffle Reduce
1Gbps
LAN
1 Hadoop
PXE
BIOS BIOS
PXE
PXE
OS
I-2
r2
I-5
No.
E8400
Nagios, Ganglia
Intel(R) Pentium(R) 4 CPU 3.00GHz x 2
mg1
2G
HDD 120GB
2
pp1
E8400
E8400 3.00GHZ x 2
:2GB
I-8
No.
HDD:42GB x1
3
pp2
I-6
I-6
No.  Type  Model             Ports
1    L3    WS-C3750G-24TS-E  10/100/1000 x 24, SFP x 4 (EtherChannel)
           WS-C3750E-24TD-S  10/100/1000 x 24, X2 10 Gigabit uplinks (EtherChannel)
2    L2    WS-C3750G-24TS-E  10/100/1000 x 24, SFP x 4 (EtherChannel)
I-9
I-7 -1
No.
Hadoop Hadoop
Gigabit
Hadoop
Telnet SSH
L3
L3 L3
DHCP
IP
IP DHCP IP
DHCP
L3 DHCP
I-10
I-8 -2
No.
I-7
I-7
Hadoop Hadoop
1 LAN
L3
I.2.1.3
I-9 OS
CentOS 5.3
I-9
No.
JobTracker
hadoop0.20.1
ganglia-gmond3.1.2
nagios-plugin1.4.14
puppet0.24.8
NameNode
hadoop0.20.1
ganglia-gmond3.1.2
nagios-plugin1.4.14
puppet0.24.8
JobTracker xen3.0.3
JobTracker drbd8.3.2
NameNode heartbeat(2.1.4)
I-11
No.
NameNode kemari(v1)
ganglia-gmond3.1.2
nagios-plugin1.4.14
puppet (0.24.8)
4
Job
hadoop0.20.1
ganglia-gmond3.1.2
nagios-plugin1.4.14
puppet0.24.8
puppet-server0.24.8
bind-chroot(9.3.4)
bind-libs(9.3.4)
bind-utils(9.3.4)
bind(9.3.4)
caching-nameserver(9.3.4)
ganglia-gmond3.1.2
nagios-plugin1.4.14
puppet0.24.8
ypbind(1.19)
nagios-3.2.0-1
ganglia-gmetad(3.1.2)
ganglia-web(3.1.1)
ganglia-gmond3.1.2
libganglia3_1_0-3.1.2
nagios-plugin1.4.14
net-snmp(5.3.2.2)
puppet0.24.8
puppet-server0.24.8-1
bind-chroot(9.3.4)
bind-libs(9.3.4)
bind-utils(9.3.4)
bind(9.3.4)
caching-nameserver(9.3.4)
bind 9.3.4
ganglia-gmond3.1.2
nagios-plugin1.4.14
I-12
No.
puppet0.24.8
ypbind(1.19-11)
Hadoop
hadoop0.20.1
ganglia-gmond3.1.2
nagios-plugin1.4.14
puppet0.24.8
I-10
I-10
No.
hadoop-0.20.1
http://www.apache.org/dyn/closer.cgi/hadoop/core/
Hadoop
http://issues.apache.org/jira/browse/MAPREDUCE-112
http://issues.apache.org/jira/browse/MAPREDUCE-118
http://issues.apache.org/jira/browse/MAPREDUCE-1182
http://issues.apache.org/jira/browse/HADOOP-5759
https://issues.apache.org/jira/browse/HADOOP-4675
2
BIND
ypbind-1.19-11.el5.x86_64.rpm
bind-chroot-9.3.4-10.P1.el5.x86_64.rpm
bind-libs-9.3.4-10.P1.el5.x86_64.rpm
bind-utils-9.3.4-10.P1.el5.x86_64.rpm
bind-9.3.4-10.P1.el5.x86_64.rpm
CentOS5.3
caching-nameserver-9.3.4-10.P1.el5.x86_64.rpm
3
DRBD
drbd-8.3.2.tar.gz
http://oss.linbit.com/drbd/
Ganglia
ganglia-3.1.2.tar.gz
http://sourceforge.net/projects/ganglia/files/ganglia%20monitoring%20core/
Heartbeat
Heartbeat
Heartbeat
heartbeat-2.1.4-1.rhel5.x86_64.RPMS.tar.gz
http://www.linux-ha.org/wiki/Download/ja
hb-monitor-1.02-1.hb214.x86_64.rpm
http://www.linux-ha.org/wiki/Contrib/ja
Kemari
Kemari
kemari-xen-testing.tar.bz2
http://sourceforge.net/projects/kemari/files/-kemari-v1
Kemari RA
ha-tools.tar.bz2
7
Nagios
nagios-3.2.0.tar.gz
http://www.nagios.org/download/core
Net-SNMP
net-snmp-5.3.2.2-5.el5
CentOS5.3
net-snmp-utils-5.3.2.2-5.el5
net-snmp-perl-5.3.2.2-5.el5
net-snmp-libs-5.3.2.2-5.el5
9
Puppet
puppet-server-0.24.8-1.el5.1.noarch.rpm
http://download.fedora.redhat.com/pub/epel/5/x86_64/repoview/letter_p.group.html
puppet-0.24.8-1.el5.1.noarch.rpm
facter-1.5.2-2.el5.noarch.rpm
10
I.2.2
Xen
kemari-xen-testing.tar.bz2
http://sourceforge.net/projects/kemari/files/
I.2.2.1
I-3
I-15
Hadoop
r2
eth1
:
192.168.10.0/24
: 192.168.102.0/24
: 192.168.102.1
: 192.168.102.2 - 192.168.102.50
(IP)
()
eth0
/()
()
eth0
()
Hadoop
10Gbps
eth2
eth2
eth3
eth3
eth0
Hadoop
eth1
JobClient
JobTracker
U27-26
eth1
eth0
JobTracker
U25-24
eth1
eth0
NameNode()
U22-21
eth1
eth0
10Gbps
eth2
eth2
eth3
eth3
eth0
eth0
eth0
:
:
IP:
NameNode
U20-19
eth1
eth0
Member
WS-C3750G-24TS-E
stack
WS-C3750G-24TS-E
: 192.168.107.0/24
: 192.168.107.1
Master
IP:
192.168.107.16 - 192.168.107.128
(DHCP)
WS-C3750G-24TS-E
L3SW
WS-C3750E-24TD-S
L3SW
DL380G5(M8G)
U11-10
DL380G5(M8G)
U13-12
DL380G5(M8G)
U16-15
DL380G5(M8G)
U18-17
DL360G5(QC3.16)
U23
DL360G5(QC3.16)
U25
DL360G5(QC3.16)
U27
DL360G5(QC3.16)
U29
DL360G5(QC3.16)
U31
DL360G5(QC3.16)
U33
DL360G5(QC3.16)
U35
DL360G5(QC3.16)
U37
eth0
eth0
eth0
eth0
eth0
eth0
eth0
eth0
eth0
eth0
eth0
eth0
WS-C3750E-24TD-S
L3SW
DL360G5(DC2.33)
U12
DL360G5(DC2.33)
U13
DL360G5(DC2.33)
U15
DL360G5(DC2.33)
U16
DL360G5(DC2.33)
U18
DL360G5(DC2.33)
U19
DL360G5(DC2.33)
U21
DL360G5(DC2.33)
U22
DL360G5(DC2.33)
U27
DL360G5(DC2.33)
U28
DL360G5(DC2.33)
U30
DL360G5(DC2.33)
U31
DL360G5(DC2.33)
U33
DL360G5(DC2.33)
U34
DL360G5(DC2.33)
U36
DL360G5(DC2.33)
U37
eth0
eth0
eth0
eth0
eth0
eth0
eth0
eth0
eth0
eth0
eth0
eth0
eth0
eth0
eth0
eth0
WS-C3750G-24TS-E
L3SW
DL360G5(DC2.33)
U12
DL360G5(DC2.33)
U13
DL360G5(DC2.33)
U15
DL360G5(DC2.33)
U16
DL360G5(DC2.33)
U18
DL360G5(DC2.33)
U19
DL360G5(DC2.33)
U21
DL360G5(DC2.33)
U22
DL360G5(DC2.33)
U27
DL360G5(DC2.33)
U28
DL360G5(DC2.33)
U30
DL360G5(DC2.33)
U31
DL360G5(DC2.33)
U33
DL360G5(DC2.33)
U34
DL360G5(DC2.33)
U36
DL360G5(DC2.33)
U37
eth0
eth0
eth0
eth0
eth0
eth0
eth0
eth0
eth0
eth0
WS-C3750E-24TD-S
L3SW
Express5800
U30F
Express5800
U30B
Express5800
F31F
Express5800
U31B
Express5800
U32F
Express5800
U32B
Express5800
U33F
Express5800
U33B
Express5800
U34F
Express5800
U34B
eth0
eth1
eth0
eth1
eth0
eth1
eth0
eth1
eth0
eth1
eth0
eth1
eth0
eth1
eth0
eth1
eth0
eth1
eth0
eth1
eth0
eth1
eth0
eth1
eth0
eth1
eth0
eth1
eth0
eth1
eth0
eth1
Network: 192.168.103.0/24
Gateway: 192.168.103.1
Network: 192.168.104.0/24
Gateway: 192.168.104.1
Network: 192.168.105.0/24
Gateway: 192.168.105.1
Network: 192.168.106.0/24
Gateway: 192.168.106.1
eth1
IP (DHCP): 192.168.103.16 - 192.168.103.128
IP (DHCP): 192.168.104.16 - 192.168.104.128
IP (DHCP): 192.168.105.16 - 192.168.105.128
IP (DHCP): 192.168.106.16 - 192.168.106.128
eth1
r3
r4
r5
r6
eth1
eth1
Member
WS-C3750E-24TD-S
stack
Express5800
U28F
Express5800
U28B
Express5800
U29F
Express5800
U29B
Express5800
U30F
Express5800
U30B
Express5800
U31F
Express5800
U31B
Express5800
U32F
Express5800
U32B
Express5800
U33F
Express5800
U33B
Express5800
U34F
Express5800
U34B
Express5800
U35F
Express5800
U35B
Express5800
U36F
Express5800
U36B
Express5800
U37F
Express5800
U37B
Master
eth1
eth1
eth1
eth1
eth1
eth1
eth1
eth1
eth1
eth1
eth1
eth1
eth1
eth1
eth1
eth1
eth1
eth1
eth1
eth1
r7
Hadoop
I-3
I.2.2.2
I-11
I-16
Express5800
U18F
Express5800
U18B
Express5800
U19F
Express5800
U19B
Express5800
U20F
Express5800
U20B
Express5800
U21F
Express5800
U21B
Express5800
U22F
Express5800
U22B
Express5800
U23F
Express5800
U23B
Express5800
U24F
Express5800
U24B
Express5800
U25F
Express5800
U25B
Express5800
U26F
Express5800
U26B
Express5800
U27F
Express5800
U27B
I-11
r2
SWNo
r2
r2
U19-20
r2
U21-22
r2
r2
U24-25
r2
U26-27
r2
r2
r2
r2
r3
r3
r3
r3
r3
r3
r3
r3
r3
r3
r3
r3
r3
r4
r4
r4
r4
r4
r4
r4
r4
r4
r4
r4
r4
r4
r4
r4
r4
r4
r5
r5
r5
r5
r5
r5
r5
r5
r5
r5
r5
r5
r5
r5
r5
r5
r5
r5
r5
U11-10
U13-12
U16-15
U18-17
U23
U25
U27
U29
U31
U33
U35
U37
U13
U15
U16
U18
U19
U21
U22
U27
U28
U30
U31
U33
U34
U36
U37
Gi0/1
Gi0/2
Gi0/3
Gi0/4
Gi0/5
Gi0/6
Gi0/7
Gi0/8
Gi0/9
Gi0/10
Gi0/11
Gi0/12
Gi0/1
Gi0/2
Gi0/3
Gi0/4
Gi0/5
Gi0/6
Gi0/7
Gi0/8
Gi0/9
Gi0/10
Gi0/11
Gi0/12
Gi0/13
Gi0/14
Gi0/15
Gi0/16
-
U10
U11
U13
U14
U16
U17
U20
U21
U23
U24
U27
U30
U31
U33
U34
U36
U37
U37
WS-C3750G-24TS-E x2
Gi1/0/21(eth0)
DL380G5(M32G)
Gi2/0/21(eth1)
Gi1/0/22(eth0)
DL380G5(M32G)
Gi2/0/22(eth1)
Gi1/0/23(eth0)
DL380G5(M32G)
Gi2/0/23(eth1)
Gi1/0/24(eth0)
DL380G5(M32G)
Gi2/0/24(eth1)
Gi1/0/12
Compaq dc7800 SFF
Gi2/0/12
MATE ME-8 MY30A/E-8
Gi1/0/13
Compaq dc7800 SFF
Gi1/0/17(eth0)
DELL R410
Gi1/0/17(eth1)
U12
Gi0/1
Gi0/2
Gi0/3
Gi0/4
Gi0/5
Gi0/6
Gi0/7
Gi0/8
Gi0/9
Gi0/10
Gi0/11
Gi0/12
Gi0/13
Gi0/14
Gi0/15
Gi0/16
Gi0/17
Gi0/18
WS-C3750G-24TS-E
DL380G5 XE5345
DL380G5 XE5345
DL380G5 XE5345
DL380G5 XE5345
DL360G5 XX5460
DL360G5 XX5460
DL360G5 XX5460
DL360G5 XX5460
DL360G5 XX5460
DL360G5 XX5460
DL360G5 XX5460
DL360G5 XX5460
WS-C3750E-24TD-S
DL360G5 LV DC X5148
DL360G5 LV DC X5148
DL360G5 LV DC X5148
DL360G5 LV DC X5148
DL360G5 LV DC X5148
DL360G5 LV DC X5148
DL360G5 LV DC X5148
DL360G5 LV DC X5148
DL360G5 LV DC X5148
DL360G5 LV DC X5148
DL360G5 LV DC X5148
DL360G5 LV DC X5148
DL360G5 LV DC X5148
DL360G5 LV DC X5148
DL360G5 LV DC X5148
DL360G5 LV DC X5148
WS-C3750E-24TD-S
DL360G6 XE 5504 1P4C
DL360G6 XE 5504 1P4C
DL360G6 XE 5504 1P4C
DL360G6 XE 5504 1P4C
DL360G6 XE 5504 1P4C
DL360G6 XE 5504 1P4C
DL360G6 XE 5504 1P4C
DL360G6 XE 5504 1P4C
DL360G6 XE 5504 1P4C
DL360G6 XE 5504 1P4C
DL360G6 XE 5504 1P4C
DL360G6 XE 5504 1P4C
DL360G6 XE 5504 1P4C
DL360G6 XE 5504 1P4C
DL360G6 XE 5504 1P4C
DL360G6 XE 5504 1P4C
DL360G6 XE 5504 1P4C
DL360G6 XE 5504 1P4C
nn
NameNode
IP
192.168.102.1
192.168.103.1
192.168.104.1
192.168.105.1
192.168.106.1
192.168.107.1
192.168.102.10
r2
L3
MEMGB
r2.example.netvlan102
vlan103/vlan104/vlan105/vlan107
CISCO
VLANGWIP
IPDHCP
hnn1
NameNode
192.168.102.11
ILO: 192.168.102.201(Gi1/0/19)
HP
32
hnn2
NameNode 192.168.102.12
ILO: 192.168.102.202(Gi1/0/20)
HP
32
jt
JobTracker
192.168.102.20
hjt1
JobTracker
192.168.102.21
ILO: 192.168.102.203(Gi2/0/19)
HP
32
hjt2
JobTracker 192.168.102.22
ILO: 192.168.102.204(Gi2/0/20)
HP
32
pp1
pp2
mg1
192.168.102.2
/( 192.168.102.3
192.168.102.5
puppet/DNS/DHCP/TFTP
puppet/DNS/DHCP/TFTP
Nagios/Ganglia
HP
NEC
HP
4
4
4
2
2
2
job2
Job
r3
r3-1-0-01
r3-1-0-02
r3-1-0-03
r3-1-0-04
r3-1-0-05
r3-1-0-06
r3-1-0-07
r3-1-0-08
r3-1-0-09
r3-1-0-10
r3-1-0-11
r3-1-0-12
r4
r4-1-0-01
r4-1-0-02
r4-1-0-03
r4-1-0-04
r4-1-0-05
r4-1-0-06
r4-1-0-07
r4-1-0-08
r4-1-0-09
r4-1-0-10
r4-1-0-11
r4-1-0-12
r4-1-0-13
r4-1-0-14
r4-1-0-15
r4-1-0-16
r5
r5-1-0-01
r5-1-0-02
r5-1-0-03
r5-1-0-04
r5-1-0-05
r5-1-0-06
r5-1-0-07
r5-1-0-08
r5-1-0-09
r5-1-0-10
r5-1-0-11
r5-1-0-12
r5-1-0-13
r5-1-0-14
r5-1-0-15
r5-1-0-16
r5-1-0-17
r5-1-0-18
I-17
DELL
192.168.103.254 IP
CISCO
HP
HP
HP
HP
HP
HP
HP
HP
HP
HP
HP
HP
CISCO
HP
HP
HP
HP
HP
HP
HP
HP
HP
HP
HP
HP
HP
HP
HP
HP
CISCO
HP
HP
HP
HP
HP
HP
HP
HP
HP
HP
HP
HP
HP
HP
HP
HP
HP
HP
192.168.104.254 IP
192.168.105.254 IP
8
-
8
-
8
8
8
8
4
4
4
4
4
4
4
4
8
8
8
8
6
6
6
6
6
6
6
6
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
SWNo
r6
U30F Gi0/1
r6
U30B Gi0/2
r6
U31F Gi0/3
r6
U31B Gi0/4
r6
U32F Gi0/5
r6
U32B Gi0/6
r6
U33F Gi0/7
r6
U33B Gi0/8
r6
U34F Gi0/9
r6
U34B Gi0/10
r6
r7
U18F Gi1/0/1
r7
U18B Gi1/0/2
r7
U19F Gi1/0/3
r7
U19B Gi1/0/4
r7
U20F Gi1/0/5
r7
U20B Gi1/0/6
r7
U21F Gi1/0/7
r7
U21B Gi1/0/8
r7
U22F Gi1/0/9
r7
U22B Gi1/0/10
r7
U23F Gi1/0/11
r7
U23B Gi1/0/12
r7
U24F Gi1/0/13
r7
U24B Gi1/0/14
r7
U25F Gi1/0/15
r7
U25B Gi1/0/16
r7
U26F Gi1/0/17
r7
U26B Gi1/0/18
r7
U27F Gi1/0/19
r7
U27B Gi1/0/20
r7
U28F Gi2/0/1
r7
U28B Gi2/0/2
r7
U29F Gi2/0/3
r7
U29B Gi2/0/4
r7
U30F Gi2/0/5
r7
U30B Gi2/0/6
r7
U31F Gi2/0/7
r7
U31B Gi2/0/8
r7
U32F Gi2/0/9
r7
U32B Gi2/0/10
r7
U33F Gi2/0/11
r7
U33B Gi2/0/12
r7
U34F Gi2/0/13
r7
U34B Gi2/0/14
r7
U35F Gi2/0/15
r7
U35B Gi2/0/16
r7
U36F Gi2/0/17
r7
U36B Gi2/0/18
r7
U37F Gi2/0/19
r7
U37B Gi2/0/20
r7
WS-C3750G-24TS-E
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
WS-C3750E-24TD-S x2
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
Express5800
r6
r6-1-0-01
r6-1-0-02
r6-1-0-03
r6-1-0-04
r6-1-0-05
r6-1-0-06
r6-1-0-07
r6-1-0-08
r6-1-0-09
r6-1-0-10
r7
r7-1-0-01
r7-1-0-02
r7-1-0-03
r7-1-0-04
r7-1-0-05
r7-1-0-06
r7-1-0-07
r7-1-0-08
r7-1-0-09
r7-1-0-10
r7-1-0-11
r7-1-0-12
r7-1-0-13
r7-1-0-14
r7-1-0-15
r7-1-0-16
r7-1-0-17
r7-1-0-18
r7-1-0-19
r7-1-0-20
r7-2-0-01
r7-2-0-02
r7-2-0-03
r7-2-0-04
r7-2-0-05
r7-2-0-06
r7-2-0-07
r7-2-0-08
r7-2-0-09
r7-2-0-10
r7-2-0-11
r7-2-0-12
r7-2-0-13
r7-2-0-14
r7-2-0-15
r7-2-0-16
r7-2-0-17
r7-2-0-18
r7-2-0-19
r7-2-0-20
I-18
IP
192.168.106.254
IP
192.168.107.254
IP
MEMGB
CISCO
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
CISCO
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
NEC
2
2
II
II
No.
GPS
Hadoop Hadoop
Hadoop Hadoop
/
IT PC
IA
Open
Source Initiative
10
GPL, Apache
Hadoop
II-1
II
No.
MapReduce
Map Reduce 2
Map
Reduce Map
Map HDFS
Reduce HDFS
MapReduce
Map Reduce
MapReduce
JobTracker TaskTracker 2
HDFS
Hadoop
64MB
1
1 3
HDFS
NameNode
DataNode 2
Hadoop
Hadoop
JobTracker NameNode
Hadoop
Hadoop
TaskTracker
DataNode
FT
II-2
II
No.
Kemari
FT
I/O I/O
Kemari
HA
http://www.osrg.net/kemari/
HA
2
2
Heartbeat
HA
http://linux-ha.org
Kickstart
Kickstart
Kickstart
Puppet
Puppet Puppet
http://reductivelabs.com/products/puppet/
II-3
II
No.
Ganglia
CPU
Web
http://ganglia.sourceforge.net/
II-4