Copyright 2010-2014 Cloudera. All rights reserved. Not to be reproduced without prior written consent.
Chapter Topics
Writing a MapReduce Program in Java
  Basic MapReduce API Concepts
  Writing MapReduce Applications in Java
    The Driver
    The Mapper
    The Reducer
  Speeding up Hadoop Development by Using Eclipse
  Differences Between the Old and New MapReduce APIs
[Figure: MapReduce data flow. The Input Format produces Input Split 1, Input Split 2, and Input Split 3. Each split is read by a Record Reader and passed to a Mapper, followed by a Partitioner. Reducers write through an Output Format to Output Files.]
[Figure: Map and Reduce phases, with the keys aardvark, cat, mat, on, sat, sofa, the.]
Example: TextInputFormat
TextInputFormat
  The default
  Creates LineRecordReader objects
  Treats each \n-terminated line of a file as a value
  Key is the byte offset of that line within the file
[Figure: LineRecordReader producing key/value pairs, with keys 0, 23, and 52.]
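To make the byte-offset keys concrete, here is a minimal plain-Java sketch. It does not use Hadoop's TextInputFormat or LineRecordReader; the class name, the method name, and the third sample line are assumptions made for illustration. It only mimics the rule stated above: each \n-terminated line is a value, and its key is the byte offset at which the line starts.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java illustration only; not Hadoop's LineRecordReader.
public class LineOffsetDemo {

    // Returns (byte offset -> line) for every \n-terminated line in the input.
    public static Map<Long, String> toRecords(String fileContents) {
        Map<Long, String> records = new LinkedHashMap<>();
        byte[] bytes = fileContents.getBytes(StandardCharsets.UTF_8);
        int lineStart = 0;
        for (int i = 0; i < bytes.length; i++) {
            if (bytes[i] == '\n') {
                String line = new String(bytes, lineStart, i - lineStart, StandardCharsets.UTF_8);
                records.put((long) lineStart, line);
                lineStart = i + 1; // the next line starts right after the newline byte
            }
        }
        return records;
    }

    public static void main(String[] args) {
        String sample = "the cat sat on the mat\n"
                + "the aardvark sat on the sofa\n"
                + "the end\n"; // hypothetical third line
        // Prints the keys 0, 23, and 52, each followed by its line.
        toRecords(sample).forEach((offset, line) -> System.out.println(offset + "\t" + line));
    }
}
```

The first line is 22 bytes plus a newline, so the second line starts at offset 23; the second line is 28 bytes plus a newline, so the third starts at offset 52.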
What is Writable?
The Writable interface makes serialization quick and easy for Hadoop
Any value's type must implement the Writable interface
Hadoop defines its own 'box classes' for strings, integers, and so on
  IntWritable for ints
  LongWritable for longs
  FloatWritable for floats
  DoubleWritable for doubles
  Text for strings
  Etc.
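The serialization contract behind a box class can be sketched without Hadoop on the classpath. The interface below is a stand-in that declares the same two methods as Hadoop's Writable (write(DataOutput) and readFields(DataInput)); the names SimpleWritable, IntBox, and roundTrip are hypothetical and exist only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class WritableSketch {

    // Stand-in with the same two methods as Hadoop's Writable interface.
    public interface SimpleWritable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    // Hypothetical box class for an int, in the spirit of IntWritable.
    public static class IntBox implements SimpleWritable {
        private int value;
        public IntBox() { }
        public IntBox(int value) { this.value = value; }
        public int get() { return value; }
        @Override public void write(DataOutput out) throws IOException { out.writeInt(value); }
        @Override public void readFields(DataInput in) throws IOException { value = in.readInt(); }
    }

    // Serializes a box to bytes, then reads the bytes back into a fresh object.
    public static int roundTrip(int n) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            new IntBox(n).write(new DataOutputStream(buffer));
            IntBox copy = new IntBox();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
            return copy.get();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42)); // prints 42
    }
}
```

In a real job you would use the Hadoop box classes listed above rather than writing your own for primitive types.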
What is WritableComparable?
A WritableComparable is a Writable which is also Comparable
  Two WritableComparables can be compared against each other to determine their 'order'
  Keys must be WritableComparables because they are passed to the Reducer in sorted order
  We will talk more about WritableComparables later
Note that despite their names, all Hadoop box classes implement both Writable and WritableComparable
  For example, IntWritable is actually a WritableComparable
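The reason keys need an ordering can be shown with a short plain-Java sketch. WordKey is a hypothetical key type that implements only Comparable; the serialization half of a real WritableComparable is left out to keep the example small. Sorting works only because compareTo defines how two keys relate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SortedKeysSketch {

    // Hypothetical key type: comparable, as a WritableComparable key would be.
    public static final class WordKey implements Comparable<WordKey> {
        private final String word;
        public WordKey(String word) { this.word = word; }
        public String get() { return word; }
        @Override public int compareTo(WordKey other) { return word.compareTo(other.word); }
    }

    // Wraps each word in a key, sorts the keys, and returns the words in key order.
    public static List<String> sortedWords(List<String> words) {
        List<WordKey> keys = new ArrayList<>();
        for (String w : words) {
            keys.add(new WordKey(w));
        }
        Collections.sort(keys); // relies on WordKey.compareTo
        List<String> result = new ArrayList<>();
        for (WordKey k : keys) {
            result.add(k.get());
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [aardvark, cat, sofa, the]
        System.out.println(sortedWords(Arrays.asList("sofa", "the", "aardvark", "cat")));
    }
}
```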
The Driver
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.Job;

public class WordCount {
  public static void main(String[] args) throws Exception {
    // Expect two arguments: the input directory and the output directory.
    if (args.length != 2) {
      System.out.printf("Usage: WordCount <input dir> <output dir>\n");
      System.exit(-1);
    }

    Job job = new Job();
    job.setJarByClass(WordCount.class);
    job.setJobName("Word Count");

    // Where to read input from and where to write final output.
    FileInputFormat.setInputPaths(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

    job.setMapperClass(WordMapper.class);
    job.setReducerClass(SumReducer.class);

    // Intermediate (Mapper output) types, then final (Reducer output) types.
    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(IntWritable.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);

    // Run the job and wait for it to finish.
    boolean success = job.waitForCompletion(true);
    System.exit(success ? 0 : 1);
  }
}
Next, specify the input directory from which data will be read, and the output directory to which final output will be written (FileInputFormat.setInputPaths and FileOutputFormat.setOutputPath in the driver).
[Figure: example records passing through map() and reduce(). The entry mahout ("an elephant driver") passes through map() and an IdentityReducer; the entry bow ("a knot with two loops and two loose ends") passes through reduce().]
Specify the types for the intermediate output keys and values produced by the Mapper (job.setMapOutputKeyClass and job.setMapOutputValueClass).
Specify the types for the Reducer's output keys and values (job.setOutputKeyClass and job.setOutputValueClass).
The Mapper
[Figure: the Record Reader passes each key/value pair to the Mapper's map() method. The values shown are the lines "the cat sat on the mat" and "the aardvark sat on the sofa"; the keys shown include 23 and 52.]
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Input key/value types: LongWritable, Text. Intermediate output key/value types: Text, IntWritable.
public class WordMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
  @Override
  public void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    String line = value.toString();
    // Split on runs of non-word characters and skip empty tokens.
    for (String word : line.split("\\W+")) {
      if (word.length() > 0) {
        context.write(new Text(word), new IntWritable(1));
      }
    }
  }
}

Emit a (key, value) pair by calling the write method of the Context object. The key will be the word itself, the value will be the number 1.
Recall that the output key must be a WritableComparable, and the value must be a Writable.
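The splitting step can be tried on its own in plain Java, without Hadoop. This is a sketch under stated assumptions: TokenizeSketch and emit are hypothetical names, and a tab-separated string stands in for the (word, 1) pair that a Mapper would write to the Context. It shows why the length check is needed: when a line starts with a non-word character, split("\\W+") returns an empty first token.

```java
import java.util.ArrayList;
import java.util.List;

public class TokenizeSketch {

    // Returns one "word<TAB>1" string per word found in the line.
    public static List<String> emit(String line) {
        List<String> pairs = new ArrayList<>();
        for (String word : line.split("\\W+")) {
            if (word.length() > 0) { // drops the empty token produced by leading separators
                pairs.add(word + "\t1");
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // The leading space yields an empty first token, which is skipped.
        for (String pair : emit(" the cat, sat!")) {
            System.out.println(pair);
        }
    }
}
```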
The Reducer
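[Figure: SumReducer. reduce() is called once per key with that key's list of values, for example on with 1,1; sat with 1,1; the with 1,1,1,1,1,1. Each call produces one output record for its key (on, sat, sofa, the).]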
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Type parameters: intermediate key and value types, then output key and value types.
public class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
  @Override
  public void reduce(Text key, Iterable<IntWritable> values, Context context)
      throws IOException, InterruptedException {
    int wordCount = 0;
    // Add up every value received for this key.
    for (IntWritable value : values) {
      wordCount += value.get();
    }
    context.write(key, new IntWritable(wordCount));
  }
}
We use the Java for-each syntax to step through all the elements in the collection. In our example, we are merely adding all the values together. We use value.get() to retrieve the actual numeric value each time.
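The same for-each summation can be run in plain Java. In this sketch an Iterable of Integer stands in for the Iterable of IntWritable that a Reducer receives, so unboxing takes the place of value.get(); SumSketch and sum are hypothetical names.

```java
import java.util.Arrays;

public class SumSketch {

    // Steps through every element with for-each and adds the values together.
    public static int sum(Iterable<Integer> values) {
        int wordCount = 0;
        for (int value : values) {
            wordCount += value;
        }
        return wordCount;
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 1, 1, 1))); // prints 4
    }
}
```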
[Figure: how the driver code relates to the Mapper and the Reducer. job.setMapOutputKeyClass(Text.class) and job.setMapOutputValueClass(IntWritable.class) correspond to the Mapper; job.setOutputKeyClass(Text.class) and job.setOutputValueClass(IntWritable.class) correspond to the Reducer.]
Differences Between the Old and New MapReduce APIs
New API                                  Old API
import org.apache.hadoop.mapreduce.*     import org.apache.hadoop.mapred.*
Driver code:                             Driver code:
Mapper:                                  Mapper:
Reducer:                                 Reducer:
setup(Context c) (see later)             configure(JobConf job)
cleanup(Context c)                       close()
[Table: Old API and New API against MapReduce v1 and MapReduce v2.]
Summary: Code using either the Old API or the New API will run under MRv1 and MRv2
Key Points
InputFormat
  Parses input files into key/value pairs
WritableComparable, Writable classes
  'Box' or 'wrapper' classes to pass keys and values
Driver
  Sets InputFormat and input and output types
  Specifies classes for the Mapper and Reducer
Mapper
  map() method takes a key/value pair
  Call Context.write() to output intermediate key/value pairs
Reducer
  reduce() method takes a key and iterable list of values
  Call Context.write() to output final key/value pairs
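The step between the Mapper and the Reducer, where intermediate pairs are grouped by key and the keys are put in sorted order, can be simulated in memory. The sketch below is plain Java with no Hadoop classes: a TreeMap plays the role of the framework's sort and grouping, and WordCountSimulation, mapAndShuffle, and reduce are hypothetical names chosen for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WordCountSimulation {

    // Emits a 1 for every word and groups the 1s under their word, keys in sorted order.
    public static Map<String, List<Integer>> mapAndShuffle(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (String word : line.split("\\W+")) {
                if (word.length() > 0) {
                    grouped.computeIfAbsent(word, k -> new ArrayList<>()).add(1);
                }
            }
        }
        return grouped;
    }

    // Calls the summing logic once per key, as reduce() would be called once per key.
    public static Map<String, Integer> reduce(Map<String, List<Integer>> grouped) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            int sum = 0;
            for (int value : entry.getValue()) {
                sum += value;
            }
            counts.put(entry.getKey(), sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("the cat sat on the mat", "the aardvark sat on the sofa");
        // prints {aardvark=1, cat=1, mat=1, on=2, sat=2, sofa=1, the=4}
        System.out.println(reduce(mapAndShuffle(lines)));
    }
}
```

On a cluster the grouping happens across machines and is handled by the framework; only the per-record and per-key logic is written by the developer.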
Bibliography
The following offer more information on topics discussed in this chapter
References for InputFormats: TDG 3e pages 237 and 245-251; TDG 3e page 247
Configuring jobs: See TDG 3e pages 65-66 for more explanation and examples.
Running jobs: For more information, see page 190 of TDG 3e.
Old API vs. new API: See TDG 3e pages 27-30.
Chapter Topics
Writing a MapReduce Program Using Streaming
  Writing Mappers and Reducers with the Streaming API
Key    Values
sat    1 1
the    1 1 1 1

Streaming: stdin
sat    1
sat    1
the    1
the    1
the    1
the    1
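Because a Streaming Reducer sees one key/value pair per line rather than a key with a list of values, it has to detect for itself where one key ends and the next begins. The following is a minimal sketch of that logic, written in Java only to stay consistent with the rest of the chapter; a Streaming script could be in any language. It assumes tab-separated lines already sorted by key. The class and method names are hypothetical, and main feeds a fixed sample instead of stdin so the example can be run as is; a real script would wrap System.in in the reader.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class StreamingSumSketch {

    // Reads "key<TAB>value" lines sorted by key and returns one "key<TAB>sum" line per key.
    public static List<String> reduce(BufferedReader in) {
        List<String> out = new ArrayList<>();
        String currentKey = null;
        int sum = 0;
        try {
            String line;
            while ((line = in.readLine()) != null) {
                String[] parts = line.split("\t", 2);
                if (parts.length < 2) {
                    continue; // skip lines without a tab separator
                }
                if (currentKey != null && !currentKey.equals(parts[0])) {
                    out.add(currentKey + "\t" + sum); // key changed: emit the finished total
                    sum = 0;
                }
                currentKey = parts[0];
                sum += Integer.parseInt(parts[1].trim());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (currentKey != null) {
            out.add(currentKey + "\t" + sum); // emit the total for the last key
        }
        return out;
    }

    public static void main(String[] args) {
        String sample = "sat\t1\nsat\t1\nthe\t1\nthe\t1\nthe\t1\nthe\t1\n";
        // Prints "sat<TAB>2" and then "the<TAB>4".
        reduce(new BufferedReader(new StringReader(sample))).forEach(System.out::println);
    }
}
```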
Key Points
The Hadoop Streaming API allows you to write Mappers and Reducers in any language
  Data is read from stdin and written to stdout
Other Hadoop components (InputFormats, Partitioners, etc.) still require Java
Bibliography
The following offer more information on topics discussed in this chapter
Streaming:
  TDG 3e, pages 36-40
  http://hadoop.apache.org/common/docs/r0.20.203.0/streaming.html
  http://code.google.com/p/hadoop-stream-mapreduce/wiki/Performanc
  http://stackoverflow.com/questions/1482282/javavs-python-on-hadoop
  http://hadoop.apache.org/common/docs/r0.20.203.0/streaming.html#Hadoop+Comparator+Class
Chapter Topics
Unit Testing MapReduce Programs
  Unit Testing
  The JUnit and MRUnit Testing Frameworks
  Writing Unit Tests with MRUnit
  Running Unit Tests
The JUnit and MRUnit Testing Frameworks
Why MRUnit?
JUnit is a popular Java unit testing framework
Problem: JUnit cannot be used directly to test Mappers or Reducers
  Unit tests require mocking up classes in the MapReduce framework
  A lot of work
MRUnit is built on top of JUnit
  Works with the mockito framework to provide required mock objects
Allows you to test your code from within an IDE
  Much easier to debug
Writing Unit Tests with MRUnit
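import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.junit.Before;
import org.junit.Test;

// The class name and the setUp() body are assumed; only the test method survives in the source.
public class TestWordCount {
  MapDriver<LongWritable, Text, Text, IntWritable> mapDriver;

  // Runs before every test.
  @Before
  public void setUp() {
    mapDriver = new MapDriver<LongWritable, Text, Text, IntWritable>();
    mapDriver.setMapper(new WordMapper());
  }

  @Test
  public void testMapper() {
    // One input record, then the output records expected in order.
    mapDriver.withInput(new LongWritable(1), new Text("cat dog"));
    mapDriver.withOutput(new Text("cat"), new IntWritable(1));
    mapDriver.withOutput(new Text("dog"), new IntWritable(1));
    mapDriver.runTest();
  }
}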
The @Before method sets up the test. This method will be called before every test, just as with JUnit.
Running Unit Tests
Hands-On Exercise: Writing Unit Tests with the MRUnit Framework
Key Points
Unit testing is important
MRUnit is a framework for MapReduce programs
  Built on JUnit
You can write tests for Mappers and Reducers individually, and for both together
Run tests from the command line, Eclipse, or other IDE
Best practice: always write unit tests!
Bibliography
The following offer more information on topics discussed in this chapter
JUnit reference: http://pub.admc.com/howtos/junit4x/
MRUnit was developed by Cloudera and donated to the Apache project. You can find the MRUnit project at the following URL: http://mrunit.apache.org