
Big Data and Cisco Unified Computing System (UCS)
BRKAPP-2033
Kapil Bakshi - Technical Solution Architect
Sean McKeown - Technical Solution Architect
Raghunath Nambiar - Distinguished Engineer
2013 Cisco and/or its affiliates. All rights reserved. BRKAPP-2033 Cisco Public
Agenda
Big Data Concepts and Overview
Enterprise data management and big data
Problems, Opportunities and Use case examples
Hadoop, NOSQL and MPP Architecture concepts
Cisco UCS for Big Data
Building a big data cluster with the UCS Common Platform Architecture (CPA)
UCS Networking, Management, and Scaling for big data
Q&A

Big Data Concepts and Overview
Enterprise data management and big data
Problems, Opportunities and Use case examples
Hadoop, NOSQL and MPP Architecture concepts
What is Big Data?
For our purposes, big data refers to distributed computing architectures specifically aimed at the 3 Vs of data: Volume, Velocity, and Variety
Traditional Enterprise Data Management
Operational (OLTP) -> ETL -> EDW -> BI/Reports
OLTP: Online Transactional Processing
ETL: Extract, Transform, and Load (batch processing)
EDW: Enterprise Data Warehouse
BI: Business Intelligence
Traditional Business Intelligence Questions
Transactional Data (e.g. OLTP)
Real-time, but limited reporting/analytics
What are the top 5 most active stocks traded in the last hour?
How many new purchase orders have we received since noon?
Enterprise Data Warehouse
High value, structured, indexed, cleansed
How many more hurricane windows are sold in Gulf-area stores during hurricane season vs. the rest of the year?
What were the top 10 most frequently back-ordered products over the past year?
So what has changed?
The Explosion of Unstructured Data
More than 90% is unstructured data
Approx. 500 quadrillion files
Quantity doubles every 2 years
Most unstructured data is neither stored nor analyzed!
1.8 trillion gigabytes of data was created in 2011
[Chart: GB of data (in billions), structured vs. unstructured data, 2005 to 2015, y-axis 0 to 10,000. Source: Cloudera]
Enterprise Data Management with Big Data
[Diagram components: sources (Machine, Web, Operational (OLTP)); ETL; Big Data (Hadoop, etc.); MPP EDW; consumers (BI/Reports, Dashboards, In-memory analytics)]
Traditional Business Intelligence Questions
Transactional Data (e.g. OLTP)
Fast data, real-time
What are the top 5 most active stocks traded in the last hour?
How many new purchase orders have we received since noon?
Enterprise Data Warehouse
High value, structured, indexed, cleansed
How many more hurricane windows are sold in Gulf-area stores during hurricane season vs. the rest of the year?
What were the top 10 most frequently back-ordered products over the past year?
Big Data
Lower value, semi-structured, multi-source, raw/dirty
Which products do customers click on the most and/or spend the most time browsing without buying?
How do we optimally set pricing for each product in each store for individual customers every day?
Did the recent marketing launch generate the expected online buzz, and did that translate to sales?
Example: Web and Location Analytics
iPhone searches Amazon for Vizio TVs in Electronics
1336083635.130 10.8.8.158 TCP_MISS/200 8400 GET http://www.amazon.com/gp/aw/s/ref=is_box_?k=Visio+tv "Mozilla/5.0 (iPhone; CPU iPhone OS 5_0_1 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9A405 Safari/7534.48.3"
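A log line like the one above only becomes useful once it is split into fields. The following is a minimal Python sketch of that step. The field meanings (timestamp, client address, cache result and HTTP status, bytes, method, URL, user agent) are assumed from the layout of the sample line, and the names `PATTERN` and `parse_log_line` are hypothetical.

```python
import re
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit

# The sample record from the slide, kept verbatim.
LOG_LINE = (
    '1336083635.130 10.8.8.158 TCP_MISS/200 8400 GET '
    'http://www.amazon.com/gp/aw/s/ref=is_box_?k=Visio+tv '
    '"Mozilla/5.0 (iPhone; CPU iPhone OS 5_0_1 like Mac OS X) '
    'AppleWebKit/534.46 (KHTML, like Gecko) '
    'Version/5.1 Mobile/9A405 Safari/7534.48.3"'
)

# Assumed field layout; adjust the pattern if the real log format differs.
PATTERN = re.compile(
    r'(?P<ts>\d+\.\d+) (?P<client>\S+) (?P<result>\w+)/(?P<status>\d{3}) '
    r'(?P<bytes>\d+) (?P<method>[A-Z]+) (?P<url>\S+) "(?P<agent>[^"]*)"'
)


def parse_log_line(line):
    """Split one proxy log line into a dict of typed fields."""
    match = PATTERN.match(line)
    if match is None:
        raise ValueError("line does not match the assumed log layout")
    record = match.groupdict()
    # Epoch seconds to a timezone-aware UTC timestamp.
    record["time"] = datetime.fromtimestamp(float(record["ts"]), tz=timezone.utc)
    # The search terms travel in the "k" query parameter of this URL.
    query = parse_qs(urlsplit(record["url"]).query)
    record["search_terms"] = query.get("k", [])
    # Crude device detection from the user-agent string.
    record["is_iphone"] = "iPhone" in record["agent"]
    return record


record = parse_log_line(LOG_LINE)
print(record["client"], record["search_terms"], record["is_iphone"])
```

In a Hadoop job, a function of this shape would typically run inside the map step, once per input line.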
Big Data and Key Infrastructure Attributes
(What big data isn't)
Usually not blade servers (not enough local storage)
Usually not virtualized (hypervisor only adds overhead)
Usually not highly oversubscribed (significant east-west traffic)
Usually not SAN/NAS
Move the compute to the storage
Low-cost, DAS-based, scale-out clustered filesystem
Cost, Performance, and Capacity
Integrated Data Warehouse: ~80TB in Prod. Dual Active environment with CoD and online archive environment
Analytics Tier: DMs and environments housing relevant data extracts, potential tiered DW archive space
Common Logging, source data unload platform, and repository for other unstructured and semi-structured data sources
Platforms: Enterprise Database; Massive Scale-Out Column Store; Hadoop, NoSQL
Structured Data: Relational Database
Unstructured Data: Machine Logs, Web Click Stream, Call Data Records, Satellite Feeds, GPS Data, Sensor Readings, Sales Data, Blogs, Emails, Video
Cost per TB: $20K/TB, $10K/TB, $300-$1K/TB
HW:SW $ split 70:30
HW:SW $ split 30:70
Big Data Software Architectures
Three basic categories of big data architectures
MPP Relational Columnar Database: Scale-out BI/DW
Structured Data
Optimized for DW/OLAP, some OLTP (ACID-compliant)
Data stored via frequently-accessed columns rather than rows for faster retrieval
Rigid schema applied to data on insert/update
Read and write (insert, update) many times
Somewhat limited linear scaling
Queries often involve a smaller subset of data set vs. Hadoop
TB to low PB size
Batch-oriented Hadoop: Heavy lifting, processing
Unstructured Data: emails, syslogs, clickstream data
Optimized for large streaming reads of large blocks (128-256 MB) comprising large files
Dynamic schema effectively applied on read
Optimized to compute data locally at cost of normalization
Write once, read many
Linear scaling to thousands of nodes and tens of PB
Entire data set at play for a given query
Real-time NoSQL: Fast store/retrieve
Unstructured Data: tweets, sensor data, clickstream
Data typically stored and retrieved as key-value pairs in flexible column families
High transaction rates, many reads and writes, small block/chunk sizes (1K-1MB)
Less well-suited for ad-hoc analysis than Hadoop
TB to PB scale
Three basic big data software architectures
MPP Relational Database (Scale-out BI/DW): Greenplum DB (Pivotal DB)*, ParAccel*, Vertica, Netezza, Teradata
Batch-oriented Hadoop (Heavy lifting, processing): Cloudera*, MapR*, Intel Hadoop*, Pivotal HD*
Real-time NoSQL (Fast key-value store/retrieve): HBase (part of Apache Hadoop)*, DataStax (Cassandra)*, Oracle NoSQL*, Amazon Dynamo
*Cisco Partners
What Is Hadoop?
Hadoop is a distributed, fault-tolerant framework for storing and analyzing data.
Its two primary components are the Hadoop Distributed File System (HDFS) and the MapReduce application engine.
Hadoop Components and Operations
Hadoop Distributed File System
[Diagram: a file divided into Blocks 1 to 6]
Scalable & Fault Tolerant
Filesystem is distributed, stored across all data nodes in the cluster
Files are divided into multiple large blocks: 64 MB by default, typically 128 MB to 512 MB
Data is stored reliably. Each block is replicated 3 times by default
Types of Node Functions
Name Node - Manages HDFS
Job Tracker - Manages MapReduce Jobs
Data Node/Task Tracker - stores blocks/does work
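The block and replica bookkeeping described above can be sketched in a few lines of Python. This is only an illustration: the 128 MB block size is one of the typical values mentioned on the slide, the round-robin placement is a simplification (real HDFS placement also weighs racks and node load), and the function names are hypothetical.

```python
import math

BLOCK_SIZE_MB = 128  # assumed; the slide gives 64 MB as the default
REPLICATION = 3      # default replication factor from the slide


def split_into_blocks(file_size_mb, block_size_mb=BLOCK_SIZE_MB):
    """Return how many blocks are needed to hold a file of the given size."""
    return max(1, math.ceil(file_size_mb / block_size_mb))


def place_replicas(num_blocks, data_nodes, replication=REPLICATION):
    """Assign each block to `replication` distinct data nodes, round-robin.

    Round-robin is a stand-in for the real placement policy so that the
    bookkeeping stays easy to follow.
    """
    if replication > len(data_nodes):
        raise ValueError("need at least as many data nodes as replicas")
    placement = {}
    for block_id in range(1, num_blocks + 1):
        start = block_id - 1
        placement[block_id] = [
            data_nodes[(start + i) % len(data_nodes)] for i in range(replication)
        ]
    return placement


# A 768 MB file on a 13-node cluster, matching the node count in the diagram.
nodes = [f"datanode{i}" for i in range(1, 14)]
placement = place_replicas(split_into_blocks(768), nodes)
for block_id, holders in placement.items():
    print(block_id, holders)
```

The Name Node keeps a mapping of this kind in memory, while the Data Nodes hold the actual block contents.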

[Diagram: a file's blocks spread across 13 data nodes behind three ToR FEX/switches, with a Name Node and a Job Tracker]
HDFS Architecture
[Diagram: 15 data nodes behind three ToR FEX/switches, connected through a switch to the Name Node; replicas of blocks 1 to 4 are spread across the data nodes]
Name Node metadata:
/usr/sean/foo.txt:blk_1,blk_2
/usr/jacob/bar.txt:blk_3,blk_4
Data node 1:blk_1
Data node 2:blk_2, blk_3
Data node 3:blk_3
MapReduce Example: Word Count
Input -> Map -> Shuffle & Sort -> Reduce -> Output
Input splits:
the quick brown fox
the fox ate the mouse
how now brown cow
Map output:
Split 1: the, 1; quick, 1; brown, 1; fox, 1
Split 2: the, 1; fox, 1; ate, 1; the, 1; mouse, 1
Split 3: how, 1; now, 1; brown, 1; cow, 1
Reduce output:
Reducer 1: brown, 2; fox, 2; how, 1; now, 1; the, 3
Reducer 2: ate, 1; cow, 1; mouse, 1; quick, 1
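The word count flow can be reproduced in plain Python to make the three phases concrete. This is a single-process sketch, not Hadoop code: the function names are hypothetical, and a real job would run the map and reduce functions on separate Task Trackers.

```python
from collections import defaultdict


def map_phase(split):
    """Emit a (word, 1) pair for every word in one input split."""
    return [(word, 1) for word in split.split()]


def shuffle_and_sort(mapped_splits):
    """Group the emitted counts by word and order the keys."""
    grouped = defaultdict(list)
    for pairs in mapped_splits:
        for word, count in pairs:
            grouped[word].append(count)
    return dict(sorted(grouped.items()))


def reduce_phase(grouped):
    """Sum the grouped counts for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}


# The three input splits from the slide.
splits = ["the quick brown fox", "the fox ate the mouse", "how now brown cow"]
counts = reduce_phase(shuffle_and_sort([map_phase(s) for s in splits]))
print(counts)
```

The resulting counts match the reduce output on the slide, for example 3 for "the" and 2 for "brown" and "fox".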
MapReduce Architecture
[Diagram: 15 Task Trackers behind three ToR FEX/switches, connected through a switch to the Job Tracker; mappers (M1 to M3) and reducers (R1, R2) run on the Task Trackers]
Job Tracker assignments:
Job1:TT1:Mapper1,Mapper2
Job1:TT4:Mapper3,Reducer1
Job2:TT6:Reducer2
Job2:TT7:Mapper1,Mapper3
Design Considerations for MPP Database
[Chart: relative Compute, IO Bandwidth, and Capacity needs for MPP, Hadoop, and NoSQL]
Design Considerations
Scale-out with Shared-Nothing
Data Redundancy with Local RAID 5
Configuration Considerations
(1) High Compute (Fastest CPU)
(2) High IO Bandwidth (15K RPM HDD), Flash/SSD and In-memory
(3) Moderate Capacity
Design Considerations for NoSQL Databases
[Chart: relative Compute, IO Bandwidth, and Capacity needs for MPP, Hadoop, and NoSQL]
Design Considerations
Scale-out with Shared-Nothing
Data Redundancy Options
Key-Value: JBOD + 3-Way Replication
Document-Store: RAID or Replication
Configuration Considerations
(1) Moderate Compute
(2) Balanced IOPS (Performance vs. Cost): 10K RPM HDD, 15K RPM HDD, SSDs, PCI Flash
(3) Moderate to High Capacity
Cisco UCS and Big Data
Building a big data cluster with the UCS Common
Platform Architecture (CPA)
CPA Networking
CPA Sizing and Scaling
Life is unfair, and the unfairness is distributed unfairly.
-Russian proverb
The evolution of big data deployments
Experimental use of Big Data
Deployed into IT Ops mandated infrastructures
Skunk works
Small to medium clusters
General Purpose IT Data Center: Big Data alongside VMware, Web, and SAP on generic IT servers
Purpose built for Big Data
App team mandated infrastructure
Big Data has established business value
Performance matters
Large or small clusters
Dedicated Pod for Big Data: Big Data on x86 servers
Hadoop Hardware Evolving in the Enterprise
Typical 2009 Hadoop node ($)
1RU server
4 x 1TB 3.5" spindles
2 x 4-core CPU
1 x GE
24 GB RAM
Single PSU
Running Apache
Typical 2013 Hadoop node ($$$)
2RU server
12 x 3TB 3.5" or 24 x 1TB 2.5" spindles
2 x 8-core CPU
1-2 x 10GE
128 GB RAM
Dual PSU
Running commercial/licensed distribution
Economics favor fat nodes
6x-9x more data/node
3x-6x more IOPS/node
Saturated gigabit, 10GE on the rise
Fewer total nodes lowers licensing/support costs
Increased significance of node and switch failure
Cisco UCS Common Platform Architecture (CPA)
Building Blocks for Big Data
UCS 6200 Series Fabric Interconnects
Nexus 2232 Fabric Extenders
UCS Manager
UCS C240 M3 Servers
LAN, SAN, Management
CPA Network Design for Big Data
Cisco UCS: Physical Architecture
[Diagram: two clustered 6200 fabric switches (Fabric A and Fabric B) with uplink ports to SAN A, SAN B, ETH 1, and ETH 2, OOB management ports, and server ports down to fabric extenders (FEX A, FEX B); the fabric extenders serve Chassis 1 (B200 compute blades, half/full width, with CNA virtualized adapters) and rack-mount C240 servers with CNAs]
CPA: Topology
Single wire for data and management
8 x 10GE uplinks per FEX = 2:1 oversubscription (16 servers/rack), no port-channel (static pinning)
2 x 10GE links per server for all traffic, data and management
CPA Recommended FEX Connectivity
2 FEXs and 2 FIs
2232 FEX has 4 buffer groups: ports 1-8, 9-16, 17-24, 25-32
Distribute servers across port groups to maximize buffer
performance and predictably distribute static pinning on uplinks
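One way to follow this recommendation is to hand out FEX host ports round-robin across the four buffer groups. The Python sketch below assumes only the port ranges listed above; the function name and the round-robin policy are illustrative choices, not a Cisco-published procedure.

```python
# The four 2232 FEX buffer groups named on the slide (host ports 1-32).
BUFFER_GROUPS = [range(1, 9), range(9, 17), range(17, 25), range(25, 33)]


def assign_fex_ports(num_servers):
    """Map server numbers to FEX host ports, spreading them over all groups."""
    if not 0 <= num_servers <= 32:
        raise ValueError("a 2232 FEX has 32 host ports")
    free_ports = [list(group) for group in BUFFER_GROUPS]
    assignment = {}
    for server in range(1, num_servers + 1):
        # Rotate through the groups so no group fills up before the others.
        group_index = (server - 1) % len(free_ports)
        assignment[server] = free_ports[group_index].pop(0)
    return assignment


# 16 servers per rack, as in the CPA topology.
ports = assign_fex_ports(16)
print(ports)
```

With 16 servers this places four servers in each buffer group rather than filling ports 1-16 and leaving two groups idle.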
Cisco Virtual Interface Card (VIC)
[Diagram: PCIe x16 card with 10GbE/FCoE ports presenting user-definable Eth and FC vNICs]
Converged Network Adapter
FCoE in hardware
Bare metal and VM deployments
Virtualize in hardware
PCIe compliant
vNIC Fabric Failover
Up to 256 distinct PCIe devices
Ethernet vNIC and FC vHBA
QoS
8 queues
vNIC bandwidth guarantees

UCS Fabric Failover
Fabric provides NIC failover capabilities chosen when defining a service profile
Traditionally done using NIC bonding driver in the OS
Provides failover for both unicast and multicast traffic
Works for any OS on bare metal
(Also works for any hypervisor-based servers)
[Diagram: Cisco VIC 1225 physical adapter presenting virtual adapter vNIC 1 to the OS / Hypervisor / VM; 10GE physical cables run through two FEXs to 6200-A and 6200-B (joined by L1/L2 links), and a virtual cable connects vNIC 1 to vEth 1 on each 6200]
Recommended UCS networking with Apache Hadoop
Use 2 VNICs with Fabric Failover on opposite fabrics for internal and external traffic
VNIC 1 on Fabric A with FF to B (internal cluster)
VNIC 2 on Fabric B with FF to A (external data)
No OS bonding required
VNIC 0 (management) wiring not shown for clarity (primary on Fabric B, FF to A)
Note: cluster traffic will flow northbound in the event of a VNIC1 failover. Ensure appropriate bandwidth/topology (e.g. vPC)
[Diagram: Data Node 1 and Data Node 2, each with VNIC 0, VNIC 1, and VNIC 2, connected to 6200 A and 6200 B (both in EHM), with L2/L3 switching above for data ingress/egress]
Cisco Unified IO Grant Bandwidth
Near Wire Speed without CPU load
Dynamic bandwidth management according to SLAs
[Chart: individual Ethernets vs. prioritized QoS; the bandwidth shares of LAN Traffic (HDFS Import), Application Traffic, and Cluster Traffic (Shuffle) shift over times t1, t2, and t3, ranging from 2G/s to 5G/s per class]
HBase + Hadoop Map Reduce with QoS
[Chart: Switch Buffer Usage with Network QoS Policy to prioritize HBase Update/Read Operations; buffer used over the timeline for Hadoop TeraSort and HBase]
[Chart: Read Latency Comparison of Non-QoS vs. QoS Policy; READ average latency (us), 0 to 40,000, over time, with and without QoS]
~60% Read Improvement
Can Hadoop really push 10GE?
It can, depending on workload, so tune for it!
Analytic workloads tend to be lighter on the network
Transform workloads tend to be heavier on the network
Hadoop has numerous parameters which affect network
Take advantage of 10GE CPA:
mapred.reduce.slowstart.completed.maps
dfs.balance.bandwidthPerSec
mapred.reduce.parallel.copies
mapred.reduce.tasks
mapred.tasktracker.reduce.tasks.maximum
mapred.compress.map.output
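The parameters listed above live in Hadoop's XML site files. The Python sketch below renders them in that `<configuration>` / `<property>` layout. Every value shown is a placeholder chosen for illustration, not a recommendation from this session; the right settings depend on the workload and cluster. The split between the two dictionaries follows the property-name prefixes, and `render_site_xml` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Placeholder values only; tune against the actual workload.
MAPRED_SITE = {
    "mapred.reduce.slowstart.completed.maps": "0.80",
    "mapred.reduce.parallel.copies": "20",
    "mapred.reduce.tasks": "64",
    "mapred.tasktracker.reduce.tasks.maximum": "8",
    "mapred.compress.map.output": "true",
}
HDFS_SITE = {
    # Balancer bandwidth is expressed in bytes per second.
    "dfs.balance.bandwidthPerSec": str(100 * 1024 * 1024),
}


def render_site_xml(properties):
    """Render a dict of properties as a Hadoop-style configuration document."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


print(render_site_xml(MAPRED_SITE))
print(render_site_xml(HDFS_SITE))
```

Keeping the settings in one structure like this makes it easier to compare runs while experimenting with shuffle-related tuning.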
CPA Sizing and Scaling for Big Data
UCS Reference Configurations for Big Data
Half-Rack UCS Solutions Bundle for MPP, NoSQL
2 x UCS 6248
2 x Nexus 2232 PP
8 x C240 M3 (SFF)
2 x E5-2690
256GB
24 x 600GB 10K SAS

Full Rack UCS Solutions Bundle for Hadoop Capacity
2 x UCS 6296
2 x Nexus 2232 PP
16 x C240 M3 (LFF)
2 x E5-2640 (12 cores)
128GB
12 x 3TB 7.2K SATA

Full Rack UCS Solutions Bundle for Hadoop, NoSQL Performance
2 x UCS 6296
2 x Nexus 2232 PP
16 x C240 M3 (SFF)
2 x E5-2665 (16 cores)
256GB
24 x 1TB 7.2K SAS
Sizing
Part science, part art
Start with current storage requirement
Factor in replication (typically 3x) and compression (varies by data set)
Factor in 20-30% free space for temp (Hadoop) or up to 50% for some NoSQL systems
Factor in average daily/weekly data ingest rate
Factor in expected growth rate (i.e. increase in ingest rate over time)
If I/O requirement known, use next table for guidance
Most big data architectures are very linear, so more nodes = more capacity and better performance
Strike a balance between price/performance of individual nodes vs. total # of nodes
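The capacity part of these sizing steps reduces to simple arithmetic, sketched below in Python. The formula is an assumption about how the factors combine: raw capacity = data x replication / compression ratio / (1 - free-space fraction). The input figures (100 TB today, 0.5 TB/day ingest, 2:1 compression) are invented for illustration, ingest is held constant for simplicity, and 36 TB per node corresponds to a 12 x 3TB server.

```python
import math


def projected_data_tb(current_tb, daily_ingest_tb, days):
    """Data volume after `days` of constant daily ingest (no growth in rate)."""
    return current_tb + daily_ingest_tb * days


def nodes_required(data_tb, replication=3, compression_ratio=1.0,
                   free_space=0.25, node_capacity_tb=36.0):
    """Estimate node count from logical data size.

    replication       copies kept of every block (typically 3)
    compression_ratio e.g. 2.0 means data shrinks to half its size
    free_space        fraction of raw disk reserved for temp space
    node_capacity_tb  raw disk per node (12 x 3TB = 36 TB)
    """
    stored_tb = data_tb * replication / compression_ratio
    raw_tb = stored_tb / (1.0 - free_space)
    return math.ceil(raw_tb / node_capacity_tb)


# Hypothetical inputs: 100 TB now, 0.5 TB/day for one year, 2:1 compression.
data_tb = projected_data_tb(100, 0.5, 365)
print(data_tb, nodes_required(data_tb, compression_ratio=2.0))
```

For these inputs the estimate comes to 16 nodes, which happens to be one full rack; the I/O requirement still has to be checked separately.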
CPA sizing and application guidelines
Columns run from Best Performance (left) to Best Price/TB (right)

Server
  CPU                      2 x E5-2690       2 x E5-2665       2 x E5-2640
  Memory (GB)              256               256               128
  Disk Drives              24 x 600GB 10K    24 x 1TB 7.2K     12 x 3TB 7.2K
  IO Bandwidth (GB/Sec)    2.6               2.0               1.1
Rack-Level
  Cores                    256               256               192
  Memory (TB)              4                 4                 2
  Capacity (TB)            225               384               576
  IO Bandwidth (GB/Sec)    41.3              31.9              16.9
Applications               MPP DB, NoSQL     Hadoop, NoSQL     Hadoop
Scaling the CPA
Single Rack: 16 servers
Single Domain: Up to 10 racks, 160 servers
Multiple Domains: L2/L3 Switching
Scaling the Common Platform Architecture
Multiple domains based on 16 servers per rack and 2 x 2232 FEXs
Consider intra- and inter-domain bandwidth:

Servers per domain (pair of Fabric Interconnects) | Available northbound 10GE ports (per fabric) | Southbound oversubscription (per fabric) | Northbound oversubscription (per fabric) | Intra-domain server-to-server bandwidth (per fabric, Gbits/sec) | Inter-domain server-to-server bandwidth (per fabric, Gbits/sec)
160 | 16 | 2:1 | 5:1 | 5 | 1
144 | 24 | 2:1 | 3:1 | 5 | 1.67
128 | 32 | 2:1 | 2:1 | 5 | 2.5
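The figures in this table can be reproduced with a short calculation, sketched below in Python. It assumes a 96-port fabric interconnect (as on the UCS 6296 used in the full-rack bundles), so that any port not taken by a FEX uplink is available northbound; that assumption is not stated on the slide but yields the same numbers. The function name is hypothetical.

```python
LINK_GBPS = 10          # every link in the design is 10GE
SERVERS_PER_RACK = 16   # one server link per fabric
FEX_UPLINKS = 8         # uplinks from each FEX to its fabric interconnect
FI_PORTS = 96           # assumption: 96-port fabric interconnect


def domain_bandwidth(servers):
    """Per-fabric oversubscription and server-to-server bandwidth."""
    if servers % SERVERS_PER_RACK:
        raise ValueError("servers must fill whole racks")
    racks = servers // SERVERS_PER_RACK
    southbound_links = racks * FEX_UPLINKS
    northbound_ports = FI_PORTS - southbound_links
    if northbound_ports <= 0:
        raise ValueError("no ports left for northbound uplinks")
    south_ratio = SERVERS_PER_RACK / FEX_UPLINKS      # server links : FEX uplinks
    north_ratio = southbound_links / northbound_ports
    intra_gbps = LINK_GBPS / south_ratio
    inter_gbps = intra_gbps / north_ratio
    return {
        "northbound_ports": northbound_ports,
        "southbound_oversub": south_ratio,
        "northbound_oversub": north_ratio,
        "intra_domain_gbps": intra_gbps,
        "inter_domain_gbps": round(inter_gbps, 2),
    }


for servers in (160, 144, 128):
    print(servers, domain_bandwidth(servers))
```

The trade-off is visible directly: giving up racks inside a domain frees ports for uplinks and raises the bandwidth available between domains.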
Multi-Domain CPA Customer Example
10 Gbits/sec Intra-Domain Server to Server NW Bandwidth
5 Gbits/sec Inter-Domain Server to Server NW Bandwidth
Static pinning from FEX to FI (no port-channel)
Rack Awareness
Rack Awareness provides Hadoop the optional ability to group nodes together in logical racks
Logical racks may or may not correspond to physical data center racks
Distributes blocks across different racks to avoid failure domain of a single rack
It can also lessen block movement between racks
Can be useful to control block placement and movement in UCSM integrated environments
[Diagram: 15 data nodes in Rack 1, Rack 2, and Rack 3, with replicas of blocks 1 to 4 distributed across the racks]
Recommendations: UCS Domains and Racks
Single Domain Recommendation: Turn off or enable at physical rack level
For simplicity and ease of use, leave Rack Awareness off
Consider turning it on to limit physical rack level fault domain (e.g. localized failures due to physical data center issues: water, power, cooling, etc.)
Multi Domain Recommendation: Create one Hadoop rack per UCS Domain
With multiple domains, enable Rack Awareness such that each UCS Domain is its own Hadoop rack
Provides HDFS data protection across domains
Helps minimize cross-domain traffic
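Hadoop resolves racks by calling an administrator-supplied topology script, configured through the `topology.script.file.name` property in Hadoop 1.x, that receives host addresses as arguments and prints one rack path per address. A minimal Python sketch of a one-rack-per-UCS-domain mapping follows. The subnet prefixes and rack names are invented for illustration; a real deployment would use its own addressing plan.

```python
#!/usr/bin/env python3
import sys

# Hypothetical addressing plan: one subnet per UCS domain.
RACK_BY_PREFIX = {
    "10.1.1.": "/ucs-domain-1",
    "10.1.2.": "/ucs-domain-2",
}
DEFAULT_RACK = "/default-rack"  # Hadoop's name for unmapped hosts


def resolve_rack(host):
    """Return the Hadoop rack path for one host address."""
    for prefix, rack in RACK_BY_PREFIX.items():
        if host.startswith(prefix):
            return rack
    return DEFAULT_RACK


if __name__ == "__main__":
    # Hadoop passes one or more addresses; answer with one rack per address.
    print(" ".join(resolve_rack(host) for host in sys.argv[1:]))
```

Run as `python3 topology.py 10.1.1.5 10.1.2.17`, this prints the rack for each address in order, so every server in a UCS domain reports the same Hadoop rack.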
Further Reading
BRKCOM-2011 - UCSM Integration Hadoop Networking Best Practices
Best practices for multi-domain CPA deployment
BRKAPP-2027 - Big Data Architecture and Deployment
Deep dive on network behavior of Hadoop