Hadoop
Dr. A. Bazila Banu
ASSOCIATE PROFESSOR
DEPARTMENT OF CSE
Introduction
2. Architecture in detail
• Fault-tolerance
Hadoop Framework
• Data is processed in parallel, so statistical analysis
can run over an entire large data set.
• It is a framework written in Java.
• It is designed to scale from a single server to
thousands of machines, each offering local
computation and storage.
Hadoop’s Architecture:
NameNode, DataNode
Hadoop’s Architecture:
MapReduce Engine
Hadoop’s Architecture
• Distributed, with some centralization
The key is the byte offset into the file at which the line starts.
The value is the contents of the line itself.
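A minimal sketch of how a line-oriented input could be turned into such key/value records. It is plain Python rather than Hadoop's Java API, and the function name text_records is hypothetical, chosen only for illustration.

```python
def text_records(data: bytes):
    """Yield (byte_offset, line) pairs, one per line of the input."""
    offset = 0
    for raw in data.splitlines(keepends=True):
        # Key: byte offset at which the line starts.
        # Value: the line's contents without its line terminator.
        yield offset, raw.rstrip(b"\r\n").decode("utf-8")
        # Advance by the full raw length, terminator included.
        offset += len(raw)


records = list(text_records(b"the cat\nsat on\nthe mat\n"))
for key, value in records:
    print(key, value)
```

Offsets are counted in bytes, not characters, so a line containing multi-byte characters pushes the next key further along than its character count would suggest.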
MapReduce: The Reducer
• After the Map phase is over, all the intermediate values for a
given intermediate key are combined into a list. This
list is given to a Reducer.
• The intermediate keys and their value lists are passed to the
Reducer in sorted key order. This step is known as the shuffle
and sort.
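The grouping described above can be sketched in a few lines. This is an in-memory Python illustration, not Hadoop's implementation; the names shuffle_and_sort and sum_reducer and the word-count data are assumptions made for the example.

```python
from collections import defaultdict


def shuffle_and_sort(pairs):
    """Group intermediate values by key and return the groups in sorted key order."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return [(key, grouped[key]) for key in sorted(grouped)]


def sum_reducer(key, values):
    """Example Reducer: add up all values seen for one key."""
    return key, sum(values)


# Hypothetical Map output for a word count job.
map_output = [("the", 1), ("cat", 1), ("sat", 1), ("the", 1), ("mat", 1)]

reducer_input = shuffle_and_sort(map_output)
results = [sum_reducer(key, values) for key, values in reducer_input]
print(results)
```

Here each key reaches the Reducer once, together with its complete value list. Only the keys are sorted; the order of values inside a list should not be relied on.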