
MapReduce Programming Model and Design Patterns

Andrea Lottarini January 17, 2012

Contents
1 Introduction
  1.1 Functional Programming Heritage
2 MapReduce Programming Model
  2.1 The Word Count Example
  2.2 Advanced Features
    2.2.1 Combiner
    2.2.2 Partitioner
3 Design Patterns
  3.1 Matrix Vector multiplication
    3.1.1 Secondary Sort
    3.1.2 Generic Objects and Sequence Files
    3.1.3 In Memory Multiplication
  3.2 Relational Algebra
    3.2.1 Selection
    3.2.2 Projection
    3.2.3 Union
    3.2.4 Intersection
    3.2.5 Difference
    3.2.6 Natural Join
    3.2.7 Group by and aggregation functions
4 Conclusions

1 Introduction

This document is based on Dr. Nicola Tonellotto's lectures of the Complements of Enabling Platforms course. The objective of these notes is to introduce the MapReduce programming model and its most used design patterns. MapReduce serves mainly to process large data sets in a massively parallel manner and is based on the following fundamental concepts [4]:

1. Input elements are accessed in a sequential fashion.

2. Each piece of input is treated and processed as a key/value pair.

3. Intermediate values are grouped using the key.

4. Each group is reduced using a specific function.

The programming model is very simple (Figure 1). Apache Hadoop [6][9][8] is the de facto standard implementation of the MapReduce programming model. On a large, distributed network of commodity computers, the Hadoop framework handles most of the non-functional problems, such as distribution of input data/tasks and fault tolerance; this is in fact one of the biggest strengths of the MapReduce model and of the Hadoop implementation. We will not consider how the framework implements the model; instead, we will analyze only the programming model and present several examples implemented in Hadoop.

Figure 1: Schema of a MapReduce computation: the input splits I1-I5 are processed by independent map tasks, intermediate values are aggregated by key, and the reduce tasks produce the outputs O1-O3. Notice how the input and output data are accessed and written in a massively parallel fashion.


1.1 Functional Programming Heritage

The MapReduce model is strongly influenced by, if not derived from, the map/fold primitives which can be found in many functional languages. We will now examine an example of a map and fold computation in Haskell. The map function multiplies every number in a list by 2:
map ((*) 2) [1,2,3]
[2,4,6]

Here, ((*) 2) denotes multiplication by 2. Expressed in lambda notation this is equivalent to λx. (2 * x). Then, the Reduce (fold) function sums all the numbers in the list:
foldl (+) 0 [2,4,6]
12

Here, the expression (+) denotes addition and 0 is the initial value of the sum. foldl is left associative, so the operation performed is (((0 + 2) + 4) + 6) = 12. In the next section we are going to show how the actual MapReduce model is an extension of this scheme.

2 MapReduce Programming Model

Conceptually, MapReduce programs create an output list of elements from an input list of elements (Figure 1); this is done by using the Map and Reduce functions in a way similar to Section 1.1. The computation is divided in three steps:

1. Map: In this phase, the input files are divided in chunks called InputSplits (by default every file is divided in chunks of 64 MB, because the default block size of the Hadoop distributed file system is 64 MB; if the file is smaller, padding is added). Every InputSplit is then assigned to a worker using Rack Awareness, i.e. data which is stored locally on a given machine is usually elaborated on the same node in order to avoid communication. This mechanism is implicitly implemented by the framework without the programmer's intervention. Every element of an InputSplit is elaborated by the assigned mapper (a new instance of Mapper is instantiated in a separate Java process for each map task, i.e. for each InputSplit that makes up part of the total job input) and (in general) a new element is emitted.

2. Shuffle & Sort: This phase is entirely handled by the framework; the output of every mapper is sorted by means of the key.

3. Reduce: Finally, all runs of sorted elements, associated to the different keys, are assigned to different reducers, each of which applies a specific function and produces the output elements.

At this point, we should formally define the phases just introduced. In the Map phase, the input is a list of pairs of the form <Key, Value> and another list of pairs <Key1, Value1> is produced. Consider that every element is immutable and of type String [3], so pairs of strings are read from the input and pairs of strings are produced as intermediate output. We are not implying that the same number of pairs is produced as output. During the Shuffle & Sort stage, the framework which implements MapReduce sorts all the values using the key and assigns every sorted run to a specific machine; this is transparent to the programmer. Consider that this is the only phase where communication is performed; communication is not allowed between mapper or reducer instances. In the Reduce stage, a single key with a list of values is received, <Key, [v1, v2, v3, ...]>, and a multiset of the form <Key, Value1>, <Key, Value2>, ... is produced by applying a specific function. We are not imposing that the reduce function is associative or commutative, as in the case of the foldl command (Section 1.1) or the MPI_Reduce command [1]. Notice that every phase can be easily parallelized, but the three phases that compose a computation are indeed sequential: there is no possibility to start the Reduce phase before all elements are sorted.

It may seem that the model imposes some excessive constraints. We already stated that every element in MapReduce is immutable. This is necessary to avoid synchronization among a large number of nodes, an operation which would affect the scalability of the whole application. Similarly, we want to be able to split the input in chunks and process them separately. Moreover, we want to be oblivious to data types and consider everything as a String. All these constraints state that a MapReduce computation should be purely functional and statically typed (exactly like a computation in Haskell). The Hadoop implementation, however, permits the user to use and define different datatypes under the constraint that they can be serialized as a string.
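The input/output behavior of the two user-defined functions can be summarized compactly as follows (a standard formulation consistent with the description above; k and v denote key and value types):

map: <k1, v1> -> list of <k2, v2>
reduce: <k2, [v2, v2, ...]> -> list of <k2, v3>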

2.1 The Word Count Example

This is possibly one of the simplest examples of a computation implemented in the MapReduce model. The program just computes how many times different words appear in a set of files. Given the input files:

dog.txt: this is the dog file
cat.txt: this is the cat file

we should expect the output file to look like this:

this 2
is 2
the 2
dog 1
cat 1
file 2

and the code which actually implements it will have this form:
mapper(filename, file_contents):
    for each word in file_contents:
        emit(word, 1)

reducer(word, values):
    sum = 0
    for each value in values:
        sum = sum + value
    emit(word, sum)

Listing 1: Pseudo code of the wordcount application

We can analyze the input/output behavior: every mapper will receive a list of key/value pairs. We can assume that the key is the filename while the value is the content of a whole line of text. For every word in the file the mapper will produce a pair <word, 1>. Notice how the mapper changes datatypes in this process, i.e. starting from pairs of type <filename, string> it produces pairs of type <string, int>. In the shuffle phase the output of every mapper is collected and sorted. A reducer is then instantiated for every different key. The reducer associated to the word "this" will receive the pair <this, [1, 1]> and produce the pair <this, 2>, while the reducer for the word "dog" will receive the pair <dog, [1]> and produce <dog, 1>. Notice how there is no state associated to mapper or reducer processes; the whole computation is purely functional and the input files can be split without affecting the correctness of the computation. The corresponding Hadoop code for this example is the following:
public class WordCount {

    public static class MyMapper extends Mapper<Object, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    public static class MyReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "wordcount");
        job.setJarByClass(WordCount.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setMapperClass(MyMapper.class);
        job.setReducerClass(MyReducer.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Listing 2: Code of the wordcount application.

Notice how the association of the files (or InputSplits) to the mappers is performed by the framework. No relevant computation is performed in the main method apart from specifying the configuration details of the job.

2.2 Advanced Features

We have already seen how Hadoop permits the use of datatypes in order to relieve the programmer from the burden of working with text only. Hadoop has other advanced features that somewhat break the purely functional characteristic of MapReduce in order to obtain better performance and usability.

2.2.1 Combiner

By analyzing the word count example in Section 2.1, we can notice that producing a pair <word, 1> for every word in the text can be inefficient. The mapper could perform a preliminary reduction of the values of its input split in order to reduce the size of communications. A possible solution is to keep a dictionary of words with their associated number of occurrences and flush it when the input is completely scanned. However, this breaks the functional constraint by adding a state to the mapper. This operation is instead efficiently performed by combiners (Figure 2).

Figure 2: Schema of a MapReduce computation comprising a combiner and a partition phase: each map task is followed by a local combine and partition step before values are aggregated by key and reduced.

As an example, consider the word count application and an InputSplit which contains three instances of the word "this". Without the combiner, these pairs will be produced: <this, 1>, <this, 1>, <this, 1>. By using the combiner, the output of the mapper instances running on the same node will be redirected to the local combiner, which will directly produce the pair <this, 3>. This is obviously advantageous, considering that the amount of data to be transferred is greatly reduced. It is enough to add a single instruction to the previous code to instruct the framework to perform the combiner phase (Listing 3).
job.setMapperClass(MyMapper.class);
job.setCombinerClass(MyReducer.class);
job.setReducerClass(MyReducer.class);

Listing 3: Combiner setup in the job configuration.

Notice that a reference to the reducer class is given as the combiner: the same operation performed by the reducer is performed by the combiner, in memory. It is important to notice that the programmer has no control over the execution of the combiner. The framework will keep in memory sorted runs of the mapper output and will periodically flush these sorted runs using the combiner; it is left to the framework to decide whether it is convenient to perform the local reduction or not. Nonetheless, the programmer can overcome this by performing the reduction by hand, which is possible in Hadoop. This might be convenient for applications where the mappers produce lots of data with many repetitions. In this case, it is necessary to rewrite the mapper code (Figure 3).

Figure 3: Pseudocode of stateful in-mapper combining: a custom combiner implemented as a local aggregator whose state is kept in memory. Adapted from Data-Intensive Text Processing with MapReduce, Jimmy Lin and Chris Dyer, Morgan & Claypool Publishers, 2010, p. 44.
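Since only the caption of Figure 3 survives in these notes, the following is a minimal sketch of what such a stateful in-mapper combiner could look like for the word count application (the class and variable names are assumptions; imports and the enclosing class are omitted, as in the other listings):

public static class InMapperCombiningMapper extends Mapper<Object, Text, Text, IntWritable> {

    // Per-task state: word -> partial count accumulated for this InputSplit.
    private Map<String, Integer> counts = new HashMap<String, Integer>();

    @Override
    public void map(Object key, Text value, Context context) {
        StringTokenizer itr = new StringTokenizer(value.toString());
        while (itr.hasMoreTokens()) {
            String word = itr.nextToken();
            Integer count = counts.get(word);
            // Aggregate locally instead of emitting <word, 1> for every occurrence.
            counts.put(word, count == null ? 1 : count + 1);
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        // Flush the local aggregates once the whole split has been scanned.
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            context.write(new Text(e.getKey()), new IntWritable(e.getValue()));
        }
    }
}

The dictionary kept in the mapper is exactly the state that breaks the functional assumption discussed below.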

This gives the programmer the ability to control the combine phase and to minimize the creation and destruction of objects during execution, but it has two major drawbacks:

1. It breaks the functional programming assumption (the mapper process now has a state).

2. It breaks the streaming behavior; moreover, attention must be paid to the memory footprint, as the auxiliary data structures used to represent the state may grow in size very rapidly.

Even more important is the case when combiners cannot simply be reducers executed in memory; the user has to define a case-specific combiner. Consider a modification of the word count example where the output consists of pairs <word, boolean>, where the boolean value indicates whether more than ten occurrences of the associated word are present in the set of analyzed documents. This requires a modification in the reducer:
public static class NewReducer extends Reducer<Text, IntWritable, Text, BooleanWritable> {

    private BooleanWritable result = new BooleanWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get();
        }
        result.set(sum > 10);
        context.write(key, result);
    }
}
Listing 4: Reducer of the new wordcount application

The mapper is the same presented in Listing 2. The reducer instead takes a key of type string with a list of integers and outputs a pair <string, boolean>. The constraints on the combiner are that it should perform a commutative and associative operation and, obviously, it should not change the data types between input and output. The operation performed by the reducer in Listing 4 is not associative and it modifies the data types; therefore, an ad hoc combiner must be implemented. In this case, the combiner has to compute the sum of the values produced by the mapper instances, which is an operation that is both commutative and associative, so it is possible to use the reducer class of the first word count application (Listing 2) as the combiner. Notice that it is necessary to make a small modification to the main method and specify the datatypes output in each phase.
public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "wordcount");
    job.setJarByClass(WordCountNew.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(BooleanWritable.class);
    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(IntWritable.class);
    job.setMapperClass(MyMapper.class);
    job.setCombinerClass(MyReducer.class);
    job.setReducerClass(NewReducer.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
}

Listing 5: Job configuration of the new wordcount application

2.2.2 Partitioner

Another important feature of the Hadoop framework is the partitioner. Partitioning is the process of determining which reducer instance will receive which intermediate keys and values [5]. The default behavior of Hadoop is to use the HashPartitioner class as the partitioner. It uses the hashCode() function of the key in order to divide the keys evenly among the different reducers, applying the function

j = hashCode(Ki) mod numberOfReducers

Thus the reducer Rj is associated with the key Ki. The user, however, can define a specific partition function. The default behavior is perfectly fine for evenly distributed keys, but consider a more realistic case where the number of values per key is unbalanced

and you have statistical information about the key/value distribution. In order to define a partitioner, the user has to implement the Partitioner interface (MyPartitioner implements Partitioner<K, V>) and specify, in the job configuration, its implementation as the partitioner class using job.setPartitionerClass(MyPartitioner.class).
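As an illustration, here is a minimal sketch of a custom partitioner (the class name and the routing policy are assumptions; in the org.apache.hadoop.mapreduce API used by the listings above, Partitioner is an abstract class to extend):

public static class MyPartitioner extends Partitioner<Text, IntWritable> {

    @Override
    public int getPartition(Text key, IntWritable value, int numReduceTasks) {
        if (numReduceTasks == 1) {
            return 0;
        }
        // Example policy: give a known "heavy" key a dedicated reducer and
        // hash the remaining keys over the other reducers.
        if (key.toString().equals("the")) {
            return 0;
        }
        return 1 + (key.hashCode() & Integer.MAX_VALUE) % (numReduceTasks - 1);
    }
}

The class would then be registered with job.setPartitionerClass(MyPartitioner.class), as described above.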

3 Design Patterns

We have already seen some advanced features of Hadoop and we presented the in-mapper combiner, which is an important design pattern in itself. We will now present two other relevant examples of MapReduce design patterns:

- Matrix vector multiplication
- Relational algebra operations [2]

The first one is very important since we will show how many advanced features of Hadoop are used in order to produce an efficient implementation of a MapReduce computation. Examples of matrix-matrix multiplication implemented in Hadoop can also be found at [7].

3.1 Matrix Vector multiplication

Consider a very common operation in numerical methods: multiplying a sparse matrix A by a vector v. This operation is performed in the power method (among others) to compute the greatest eigenvalue of a matrix.

Only nonzero elements of the matrix are stored in order to reduce space utilization. Every nonzero element is stored as <row, column, value>. The simplest way to perform this operation is to perform two steps of MapReduce (in the following, i indicates the row index, while j indicates the column index):

Map 1: The mapper receives either a chunk of elements from the matrix or from the vector. In the first case, it emits <j, i, Ai,j>, while in the second case it emits <j, vj>. By doing so, elements with the same j, i.e. the same column index, will be sent to the same reducer.

Reduce 1: The reducer assigned to the key n will receive a single element from the vector v, precisely the element in position n. It will also receive all the nonzero elements of the n-th column of the matrix A. For every Ai,n received, it emits <i, Ai,n * vn> (Figure 5).

Figure 4: Matrix Vector multiplication.

Map 2: The identity mapper (it reads elements from the input and outputs them without modifications).

Reduce 2: Receives all temporary elements with the same i, i.e. the same row index. It performs the sum of these elements and emits <i, Σj Ai,j vj>, the final value of the output vector for row i (Figure 6).
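For concreteness, here is a small worked trace of the two phases on a hypothetical 2 x 2 example (not taken from the original lecture material). Let A have the nonzero elements <1, 1, 2>, <2, 1, 1> and <2, 2, 3> (stored as <row, column, value>), i.e. A = [2 0; 1 3], and let v = (4, 5). Map 1 emits <1, 1, 2> and <1, 2, 1> for column 1 of A, <2, 2, 3> for column 2, and <1, 4>, <2, 5> for the vector. Reduce 1 for key 1 receives v1 = 4 together with the column-1 elements and emits <1, 2 * 4> = <1, 8> and <2, 1 * 4> = <2, 4>; Reduce 1 for key 2 receives v2 = 5 and A2,2 = 3 and emits <2, 3 * 5> = <2, 15>. Reduce 2 then sums the partial products per row index and emits <1, 8> and <2, 4 + 15> = <2, 19>, which is indeed A v = (8, 19).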

Figure 5: First phase of the MapReduce computation: the reducer for column n receives the vector element v(n) and the matrix column A(:,n), and emits the partial products A(:,n)*v(n).

Figure 6: Second phase of the MapReduce computation: after shuffle & sort, the partial products A(:,1)*v(1), A(:,2)*v(2), ..., A(:,n)*v(n) are grouped by row index, and the reducer for row ni emits <ni, Σj Ani,j vj>.

This does not appear to be a complex operation to implement in Hadoop, but it requires some attention in order to be implemented correctly. The first step in particular is somewhat tricky; let us analyze it in detail.

1. The mapper should distinguish between matrix elements and vector elements. It also has to emit data structured differently in the two cases.

2. Similarly, the reducer has to distinguish between an element from the vector and one from the matrix.

3. Consider the output of the reducer, <i, Ai,n * vn> for every Ai,n ≠ 0. We would like to receive vn before the elements of the matrix, since vn is used in every multiplication performed by the reducer.

4. We are using two steps of MapReduce in sequence and the information is not inherently text (they are numbers), so we would like to minimize the size of the data transferred.

5. We want well organized and maintainable code.

The first two concerns are functional: it is necessary to distinguish elements from A and v, otherwise the computation will be incorrect. First we consider a preliminary, straightforward implementation.
public class Phase1 {

    public static class MyMapper extends Mapper<LongWritable, Text, IntWritable, Text> {

        private static boolean V = false;

        @Override
        protected void setup(Context context) throws IOException {
            String chunkName = ((FileSplit) context.getInputSplit()).getPath().getName();
            if (chunkName.startsWith("V")) {
                V = true;
            } else if (!chunkName.startsWith("A")) {
                throw new IOException("File name not correct");
            }
        }

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            /* The vector is stored one element per line in the form row#value.
               The matrix is stored one element per line in the form column#row%value. */
            String[] values = value.toString().split("#");
            if (V) {
                context.write(new IntWritable(Integer.parseInt(values[0])),
                              new Text("V" + values[1]));
            } else { /* The sparse matrix element must be emitted */
                IntWritable column = new IntWritable(Integer.parseInt(values[0]));
                context.write(column, new Text("a" + values[1]));
            }
        }
    }

    public static class MyReducer extends Reducer<IntWritable, Text, IntWritable, Text> {

        @Override
        public void reduce(IntWritable key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            ArrayList<Text> matrixValues = new ArrayList<Text>();
            Text val;
            double vectorValue = 0, outputValue;
            Iterator<Text> iter = values.iterator();
            while (iter.hasNext()) {
                val = iter.next();
                if (val.charAt(0) == 'a') {
                    matrixValues.add(new Text(val)); // copy: the Text object is reused by the framework
                } else {
                    vectorValue = Double.parseDouble(val.toString().substring(1));
                }
            }
            /* Elements are emitted by scanning the list of received matrix elements */
            iter = matrixValues.iterator();
            while (iter.hasNext()) {
                val = iter.next();
                String[] rowValue = val.toString().split("%");
                outputValue = vectorValue * Double.parseDouble(rowValue[1]);
                context.write(new IntWritable(Integer.parseInt(rowValue[0].substring(1))), // skip the leading 'a' tag
                              new Text("" + outputValue));
            }
        }
    }
}

Listing 6: Code of the first MapReduce phase. We assume that the vector is saved on multiple files stored in a folder with filenames starting with V, and similarly for the matrix A.

We solved the problem of distinguishing between A and v in the mapper using the filename of the chunk. Similarly, we solved the problem in the reducer by adding an annotation to the intermediate values: mappers and reducers can distinguish between data from the vector and from the matrix. We are also using text to save intermediate values, so we can arrange the structure of the data as we want. Now, we consider non-functional constraints. This solution is inefficient, since the reducer has to store every element received from the matrix: the order of the values received is random. A possible optimization is to buffer elements from A until the element from the vector is received and then flush the buffer and output the final elements in a streaming fashion. In the worst case, however, this solution is equivalent to the previous one (Listing 6). We can overcome this problem by using the shuffle & sort phase to ensure that the vector element will be the first in the list of values for every reduce operation. This problem of having values (not only keys) sorted is known in the literature as Secondary Sort.

3.1.1 Secondary Sort

What we want to accomplish with secondary sort is to have the run of values for a key ordered depending on our needs. There are many possible ways to implement a secondary sort; in this case a very convenient solution is to define a specific object as the key. In order to use a user-defined object as a key in Hadoop, it must implement the WritableComparable interface: the object must be comparable, so that a list of keys can be sorted using their compareTo method, and it must be writable in order to be serialized using the methods readFields(DataInput in) and write(DataOutput out). In this specific case, we can extend IntWritable (Listing 7) in order to solve the problem in a very concise way.
public class IntAndIdWritable extends IntWritable {

    private char id;

    @Override
    public void readFields(DataInput in) throws IOException {
        super.readFields(in);
        this.id = in.readChar();
    }

    @Override
    public void write(DataOutput out) throws IOException {
        super.write(out);
        out.writeChar(id);
    }

    /** Compares two IntAndIdWritables: first by the integer value, then by the id. */
    @Override
    public int compareTo(Object o) {
        int compareValue = super.compareTo(o);
        return (compareValue == 0) ? this.id - ((IntAndIdWritable) o).id : compareValue;
    }

    /** A Comparator optimized for IntAndIdWritable, working directly on the raw bytes. */
    public static class Comparator extends WritableComparator {

        public Comparator() {
            super(IntAndIdWritable.class);
        }

        @Override
        public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
            int thisValue = readInt(b1, s1);
            int thatValue = readInt(b2, s2);
            /* chars are serialized as 2 bytes: they are the last 2 bytes of the record */
            int confrontoChar = compareBytes(b1, s1 + l1 - 2, 2, b2, s2 + l2 - 2, 2);
            return (thisValue < thatValue ? -1 : (thisValue == thatValue ? confrontoChar : 1));
        }
    }

    static { // register this comparator
        WritableComparator.define(IntAndIdWritable.class, new Comparator());
    }
}

Listing 7: IntAndIdWritable class. This class implements both methods from the Writable and the Comparable interface. We omitted constructors and getter/setter methods for clarity.

There is a lot going on in this class. Consider the three phases that occur during shuffle & sort:

1. Sort: elements are sorted, based exclusively on the key.

2. Partition: elements are partitioned among workers; this is based on the key (and on the value, if necessary).

3. Group: elements are grouped together into a single reduce() invocation. The user can override the comparator and group together elements with different keys.

Notice that every object can have different comparators for the sort and group phases. The first phase sorts the output of the mappers using the key. In our class, we added an id and implemented a new compareTo method so that elements will be in the right order (the element from the vector first). To be more efficient, the framework gives the programmer the possibility to implement a compare directly on the bytes of the serialized object, in order to avoid serialization and deserialization during the shuffle & sort phase; the inner class Comparator implements such a compare on bytes. By doing so, the Sort phase will place the element coming from the vector before the elements from the matrix.

However, this is not enough, because elements from the vector and the matrix have different keys, so they could be partitioned to different machines. We must ensure that they are sent to the same reduce task. This is done by defining an ad hoc partition function, which in this particular case can be inherited from the superclass (IntWritable) without any modification: the IntAndIdWritable class inherits the hashCode method from the IntWritable class and the default HashPartitioner will be used. Therefore, only the integer part of the object will be used for partitioning. By doing so, runs of elements will be assigned to the same reducer without considering the id, i.e. without considering whether they are elements from the matrix or from the vector.

Still, this is not enough. We ensured that the elements from the matrix and the vector having the same index are sent to the same reducer, and that they are sorted in such a way that the element from the vector is the first of the run. Now, we would like to receive both in a single sorted run, i.e. in a single reduce() invocation. This is done by implementing a grouping comparator; a grouping comparator tells the framework when to collapse different runs (with different keys) into a single run. In this specific case, we can declare the comparator of IntWritable as the grouping comparator of IntAndIdWritable, so that elements with the same integer part will be grouped together, ignoring the id.

The whole implementation of secondary sort is not so simple, since it requires the redefinition of several different mechanisms. In this case, we managed to implement it with minimal modifications, using the default partitioner and inheriting other mechanisms from IntWritable. The reader may actually benefit from trying to implement the secondary sort from scratch, redefining all three mechanisms (all the methods should work on raw binary data for better performance).
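As a summary, here is a minimal sketch of how these three hooks could be wired in the job configuration (the method names are from the standard org.apache.hadoop.mapreduce.Job API; compare with the actual configuration in Listing 8):

job.setMapOutputKeyClass(IntAndIdWritable.class);
// Sort: IntAndIdWritable registers its own raw-byte Comparator as the default
// comparator (see the static block in Listing 7), so no explicit
// setSortComparatorClass(...) call is needed.
job.setPartitionerClass(HashPartitioner.class);               // hashCode inherited from IntWritable: only the integer part matters
job.setGroupingComparatorClass(IntWritable.Comparator.class); // group by integer part, ignoring the id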

3.1.2 Generic Objects and Sequence Files

Now, consider a completely different problem of the naive implementation (Listing 6): we work with integer and double values, and therefore using text for input/output and intermediate values is probably inefficient and leads to complex code. For intermediate values, Hadoop offers the possibility to use any type under the constraint that it implements the WritableComparable interface. Hadoop also offers the possibility to use serialized (binary) data as the input and output of a MapReduce job; this is done via the SequenceFile record reader and writer. This mechanism also produces more readable and efficient code, since key/value pairs are stored as objects, instead of Text, without the need for conversion. In our case, the input key will correspond to the column index of either the matrix or the vector. Similarly to what was presented in Listing 6, we can distinguish vector and matrix elements from the filename. However, we have the problem that the values read from the input and emitted by the mappers can be either a sparse element (value and coordinate) or a double value (a single vector element). Consider that it is not possible to exploit polymorphism: every object read or written by mappers and reducers must have the same exact type as specified in its type declaration (consider a Mapper<IntWritable, DoubleWritable, IntWritable, DoubleWritable>: it should be possible to read an IntAndIdWritable, which inherits from IntWritable, as a key, and similarly to write one as an output key, but the framework does not permit such operations). Even if this breaks the Liskov substitution principle (which can be quite shocking for some readers of this report), it is done in order to ensure better performance. In fact, in order to use polymorphism, every serialized element should contain a serialized id of its class, increasing the amount of data transferred. This is done in the implementation of ObjectOutputStream in Java and it greatly increases the size of the serialized data. Consider that in most cases we do not want to serialize every possible type of object, but only a small set of different types. The framework has a specific solution to this problem: GenericWritable. It provides the developer with a writable wrapper for different types of objects with a minimum overhead for serialization. In conclusion, Secondary Sort, Sequence Files and GenericWritable elements are used in order to ensure maximum performance and produce concise and maintainable code. The final implementation is the following:
public class Phase1 {

    private static boolean V = false;

    public static class MyMapper
            extends Mapper<IntWritable, GenericElement, IntAndIdWritable, GenericElement> {

        private IntAndIdWritable out = new IntAndIdWritable();

        @Override
        protected void setup(Context context) throws IOException {
            String folderName =
                ((FileSplit) context.getInputSplit()).getPath().getParent().getName();
            if (folderName.startsWith("V")) {
                V = true;
            } else if (!folderName.startsWith("A")) {
                throw new IOException("File name not correct");
            }
        }

        @Override
        public void map(IntWritable key, GenericElement value, Context context)
                throws IOException, InterruptedException {
            // set(int, char) is one of the setters omitted in Listing 7; it returns this.
            if (V) { /* A vector element must be emitted */
                context.write(out.set(key.get(), 'V'), value);
            } else { /* A sparse matrix element must be emitted */
                context.write(out.set(key.get(), 'a'), value);
            }
        }
    }

    public static class MyReducer
            extends Reducer<IntAndIdWritable, GenericElement, IntWritable, DoubleWritable> {

        @Override
        public void reduce(IntAndIdWritable key, Iterable<GenericElement> values, Context context)
                throws IOException, InterruptedException {
            DoubleWritable emit = new DoubleWritable();
            IntWritable out = new IntWritable();
            double vv = 0;
            SparseVectorElement val;
            Iterator<GenericElement> iter = values.iterator();
            if (iter.hasNext()) {
                // thanks to the secondary sort, the first generic element is the vector element;
                // its correct type can be inferred and the generic element is unwrapped
                GenericElement g = iter.next();
                DoubleWritable vectorValue = (DoubleWritable) g.get();
                vv = vectorValue.get();
            }
            while (iter.hasNext()) {
                val = (SparseVectorElement) iter.next().get();
                if (val.getValue() != 0.0) {
                    emit.set(vv * val.getValue());
                    out.set(val.getCoordinate());
                    context.write(out, emit);
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "Phase 1");
        job.setJarByClass(Phase1.class);
        job.setMapperClass(MyMapper.class);
        job.setReducerClass(MyReducer.class);
        job.setMapOutputKeyClass(IntAndIdWritable.class);
        job.setMapOutputValueClass(GenericElement.class);
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(DoubleWritable.class);
        job.setGroupingComparatorClass(IntWritable.Comparator.class);
        // these should be added in order to use sequence files
        job.setInputFormatClass(SequenceFileInputFormat.class);
        job.setOutputFormatClass(SequenceFileOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileInputFormat.addInputPath(job, new Path(args[1]));
        FileOutputFormat.setOutputPath(job, new Path(args[2]));
        job.waitForCompletion(true);
    }
}

Listing 8: First phase of matrix vector multiplication (Figure 5).
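The GenericElement wrapper itself is not shown in these notes; a possible sketch, under the assumption that it simply wraps the two value types used above, could look like this (GenericWritable is an abstract class in org.apache.hadoop.io; imports are omitted, as in the other listings):

public class GenericElement extends GenericWritable {

    // The closed set of wrapped types: only a compact type index is serialized
    // together with the wrapped object, instead of a full class name.
    @SuppressWarnings("unchecked")
    private static final Class<? extends Writable>[] TYPES =
            (Class<? extends Writable>[]) new Class<?>[] {
                    DoubleWritable.class,      // a single vector element
                    SparseVectorElement.class  // a sparse matrix element (coordinate and value)
            };

    @Override
    protected Class<? extends Writable>[] getTypes() {
        return TYPES;
    }
}

The inherited set(Writable) and get() methods are the ones used in Listing 8 to wrap and unwrap the concrete values.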

Notice how objects are reused in order to reduce the overhead of garbage collection; this overhead might be significant for very large computations (especially if JVM reuse is enabled). The second phase of the matrix vector multiplication is instead very simple: the temporary values of every row of the output vector are grouped together and summed, and the final values are emitted. We show its implementation without any further comment:
public class Phase2 {

    public static class MyReducer
            extends Reducer<IntWritable, DoubleWritable, IntWritable, GenericElement> {

        private DoubleWritable result = new DoubleWritable();
        private GenericElement wrapped = new GenericElement();

        public void reduce(IntWritable key, Iterable<DoubleWritable> values, Context context)
                throws IOException, InterruptedException {
            // Sum all the partial products A(i,j) * v(j) received for row i.
            double sum = 0.0;
            for (DoubleWritable mv : values) {
                sum += mv.get();
            }
            result.set(sum);
            // wrap the result so that the output has the same format as the input of Phase 1
            wrapped.set(result);
            context.write(key, wrapped);
        }
    }
}

Listing 9: Second phase of matrix vector multiplication (Figure 6). Only the reducer is shown, since no computation is performed in the Map phase.

3.1.3 In Memory Multiplication

We will now examine the case of a vector small enough to fit in memory.

Figure 7: In-memory multiplication with a single MapReduce phase: every reducer holds the whole vector v(:) in memory and processes entire rows A(n,:) of the matrix, emitting <n, Σj An,j vj>.


In this case, it is possible to compute the multiplication employing only one phase of MapReduce:

Map: the identity mapper.

Shuffle & Sort: elements of the matrix are sorted using the row index as the key. Therefore, every reducer will receive entire rows of the matrix A (secondary sort can be used to obtain runs sorted also on the column index).

Reduce:

1. Every reduce task stores the vector v in its memory (this can be done efficiently using HDFS and Hadoop mechanisms, e.g. the DistributedCache mechanism of the Hadoop framework).

2. Elements from the matrix are accessed in the usual streaming pattern; for every Ai,j received, Ai,j * vj is computed and added to the previous results.

3. In the cleanup phase (when every element of the same row of the matrix has been received) the final value is emitted (a sketch of such a reducer is given below).

The whole operation is similar to the second stage of the previous computation (Figure 6), but now all reducers have to access the vector. We should expect this implementation to be faster, since it requires only one phase of MapReduce. However, it is clear that the first implementation scales better, since reducers do not have to store the entire vector in memory and they do not have to access it concurrently from the file system. In conclusion, this second solution would probably perform better for a small vector v and a small number of nodes (and in both cases it is questionable to use MapReduce to solve the problem at all).
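A minimal sketch of the reduce side of this one-phase scheme follows (the class name and the VectorLoader helper are hypothetical; imports and the enclosing class are omitted, as in the other listings). Since all the nonzero elements of a row arrive in a single reduce() call, this variant emits each final value at the end of the call rather than in cleanup():

public static class InMemoryReducer
        extends Reducer<IntWritable, SparseVectorElement, IntWritable, DoubleWritable> {

    private double[] v;                          // the whole vector, kept in memory
    private DoubleWritable out = new DoubleWritable();

    @Override
    protected void setup(Context context) throws IOException {
        // Hypothetical helper that reads the vector from a file distributed to
        // every node, e.g. via the DistributedCache mechanism mentioned above.
        v = VectorLoader.load(context.getConfiguration(), "vector.seq");
    }

    @Override
    public void reduce(IntWritable row, Iterable<SparseVectorElement> elements, Context context)
            throws IOException, InterruptedException {
        // All nonzero elements A(i,j) of row i are received here; multiply each
        // by the in-memory vector entry v(j) and accumulate.
        double sum = 0.0;
        for (SparseVectorElement e : elements) {
            sum += e.getValue() * v[e.getCoordinate()];
        }
        out.set(sum);
        context.write(row, out);                 // <i, sum over j of A(i,j) * v(j)>
    }
}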

3.2 Relational Algebra

We are now going to briefly introduce how to implement relational algebra operations in Hadoop.

3.2.1 Selection

Select from a relation R the tuples satisfying a condition C.

Map: emit only the tuples that satisfy condition C.

3.2.2 Projection

For each tuple in relation R, select only specific attributes and remove duplicates.

Map: from a tuple t, create a new tuple t' containing only the selected attributes and emit (t', t').

Reduce: receive a list of identical tuples (t', [t', t', ...]) and emit t' only once.

3.2.3 Union

Given relations R and S, output the union of their tuples.

Map: for each tuple t, emit (t, R) if it comes from relation R, (t, S) otherwise.

Reduce: for each key (t, [...]), emit t.

3.2.4 Intersection

Given relations R and S, output the tuples that appear in both relations.

Map: for each tuple t, emit (t, R) if it comes from relation R, (t, S) otherwise.

Reduce: emit tuple t if and only if (t, [R, S]) is received.

3.2.5 Difference

Given relations R and S, output the tuples of R that do not appear in S.

Map: for each tuple t, emit (t, R) if it comes from relation R, (t, S) otherwise.

Reduce: emit tuple t if and only if (t, [R]) is received.

3.2.6 Natural Join

Given relations R(A, B) and S(B, C), find the tuples that have the same B attribute.

Map: given a tuple (a, b) in R, emit (b, (R, a)); given a tuple (b, c) in S, emit (b, (S, c)).

Sort: tuples with the same B attribute will be shuffled to the same reducer.

Reduce: given the received list (b, [(R, a1), (R, a2), (S, c1), ...]), produce all the (ai, cj) pairs and output them as (b, pair1), (b, pair2), (b, pair3), ... (we should expect the number of produced pairs to be limited, since we have already grouped together the tuples with the same B attribute).
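The following is a minimal Hadoop sketch of this reduce-side join (all names and the input layout are assumptions for illustration: tuples of R(A, B) arrive as comma-separated lines "a,b" from files whose name starts with R, tuples of S(B, C) as lines "b,c" from files starting with S; imports are omitted, as in the other listings):

public class NaturalJoin {

    public static class JoinMapper extends Mapper<LongWritable, Text, Text, Text> {

        private boolean fromR;

        @Override
        protected void setup(Context context) {
            String name = ((FileSplit) context.getInputSplit()).getPath().getName();
            fromR = name.startsWith("R");
        }

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",");
            if (fromR) {
                // (a, b) in R -> key b, value tagged with the relation name
                context.write(new Text(fields[1]), new Text("R#" + fields[0]));
            } else {
                // (b, c) in S -> key b
                context.write(new Text(fields[0]), new Text("S#" + fields[1]));
            }
        }
    }

    public static class JoinReducer extends Reducer<Text, Text, Text, Text> {

        @Override
        public void reduce(Text b, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            // Buffer the A-values from R and the C-values from S for this join key,
            // then emit their cross product as (b, "a c") pairs.
            List<String> as = new ArrayList<String>();
            List<String> cs = new ArrayList<String>();
            for (Text v : values) {
                String s = v.toString();
                if (s.startsWith("R#")) {
                    as.add(s.substring(2));
                } else {
                    cs.add(s.substring(2));
                }
            }
            for (String a : as) {
                for (String c : cs) {
                    context.write(b, new Text(a + " " + c));
                }
            }
        }
    }
}

The relation tag plays the same role as the "V"/"a" annotation used in Listing 6 to distinguish the two kinds of input.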


3.2.7 Group by and aggregation functions

Given a relation R(A, B, C), we group together the tuples with the same A and apply an aggregation function f to the related B elements.

Map: given (a, b, c) in R, output (a, b).

Sort: elements with the same a will be sent to the same reducer.

Reduce: given the list (a, [b1, b2, ...]), apply the function f to the list, f(b1, b2, ...) = x, and emit (a, x).

4 Conclusions

This short essay on the MapReduce model and its design patterns should have given the reader a basic understanding of the programming model and presented some of its most important design patterns. The example of matrix vector multiplication is explained in great detail in order to present many important aspects of Hadoop and show the usage of its advanced mechanisms. In particular, it is important to understand the various passages from a basic implementation to an implementation that fully exploits the potential of the framework.

References
[1] MPI_Reduce man page. http://www.open-mpi.org/doc/v1.5/man3/MPI_Reduce.3.php.
[2] Ali Dasdan, Ruey-Lung Hsiao, D. Stott Parker. Map-Reduce-Merge: simplified relational data processing on large clusters.
[3] Howard Karloff, Siddharth Suri, Sergei Vassilvitskii. A model of computation for MapReduce.
[4] Ralf Lämmel. Google's MapReduce programming model.
[5] Yahoo Developer Network. Advanced MapReduce.
[6] Yahoo Developer Network. MapReduce.
[7] John Norstad. A MapReduce algorithm for matrix multiplication.
[8] Jason Venner. Pro Hadoop.
[9] Tom White. Hadoop: The Definitive Guide.

