
Big Data + SDN

SDN Abstractions

The Story Thus Far


Different types of traffic in clusters
Background Traffic
Bulk transfers
Control messages

Active Traffic (used by jobs)


HDFS read/writes
Partition-Aggregate traffic

The Study Thus Far


Specific communication patterns in clusters
Patterns used by Big Data Analytics
You can optimize specifically for these
Shuffle
Incast
Broadcast
[Figure: MapReduce dataflow — HDFS reads feed Map tasks, which shuffle to Reduce tasks, which write back to HDFS]

The Story Thus Far


Helios, Hedera, MicroTE, and c-Through improve utilization
Congestion leads to bad performance
Eliminate congestion
Control loop: gather network demand → determine paths with minimal congestion → install new paths
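A minimal sketch of this measure → compute → install loop; the helper functions are stand-ins, not any real controller API:

def get_demand_matrix():
    # Stand-in: Hedera/Helios estimate demand from switch counters or end-host stats.
    return {("h1", "h4"): 800e6, ("h2", "h5"): 300e6}   # flow -> demand in bytes/s

def compute_paths(demand, core_links):
    # Greedy: place the largest flows first on the currently least-loaded core link.
    load = {link: 0.0 for link in core_links}
    paths = {}
    for flow, rate in sorted(demand.items(), key=lambda kv: -kv[1]):
        best = min(core_links, key=lambda link: load[link])
        load[best] += rate
        paths[flow] = best
    return paths

def install_paths(paths):
    # Stand-in for pushing forwarding rules to the switches.
    for flow, link in paths.items():
        print("route", flow, "via", link)

install_paths(compute_paths(get_demand_matrix(), ["core1", "core2"]))
# Real systems repeat this every few seconds, assuming past demand predicts the future.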

Drawbacks
Demand gathering in the network is ineffective
Assumes that past demand will predict future demand
Clusters run many small jobs, so this prediction is often ineffective

May require expensive instrumentation to gather demand
Switch modifications
Or end-host modifications

Application-Aware Networking
Insight
The application knows everything the network needs to know
So the application can in fact instruct the network

Small number of big data paradigms
So only a small number of applications need to be changed
Map-reduce/Hadoop, Spark, Dryad

The application has a central entity controlling everything

Important Questions
What information do you need?
Size of a flow
Source+destination of the flow
Start time of the flow
Deadline of the flow

How should the application inform the network?
Reactively or proactively
Modified applications or unmodified applications

Challenges Getting Information


Flow Size
Insight
Data that is transferred is data that is stored in a file

Input data
Query HDFS for file size

Intermediate data/Output Data


Reactive methods: wait for the map to finish writing to its temp file
Ask the file system for the size
Check the Hadoop logs for the file size
Check the Hadoop web API

Proactive methods: predict size using prior history


Jobs run the same code over and over
Learn the ratio between input data and intermediate data
Learn the ratio between intermediate data and output data
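A minimal sketch of the proactive idea, keeping a per-job history of input-to-intermediate ratios (job names and sizes are made up):

class SizePredictor:
    """Predict intermediate data size from a job's historical input:intermediate ratios."""

    def __init__(self):
        self.history = {}   # job name -> list of (input_bytes, intermediate_bytes)

    def record(self, job, input_bytes, intermediate_bytes):
        self.history.setdefault(job, []).append((input_bytes, intermediate_bytes))

    def predict_intermediate(self, job, input_bytes):
        runs = self.history.get(job)
        if not runs:
            return None                     # no history yet: fall back to a reactive method
        ratios = [inter / inp for inp, inter in runs if inp > 0]
        return input_bytes * sum(ratios) / len(ratios)

# Jobs run the same code over and over, so the ratio stays stable across runs.
p = SizePredictor()
p.record("wordcount", 10_000_000, 2_500_000)
p.record("wordcount", 40_000_000, 9_800_000)
print(p.predict_intermediate("wordcount", 20_000_000))   # roughly 4.95 MB of shuffle data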

Challenges Getting Information


End points
Reactively
Job tracker places the task; it knows the locations
Check the Hadoop logs for the locations
Modify the job tracker to directly inform you of the locations

Proactively
Have the SDN controller tell the job tracker where to place the end-points
Rack-aware placement: reduce inter-rack transfers
Congestion-aware placement: reduce loss
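A minimal sketch of proactive rack-aware placement, assuming the controller knows each host's rack and where the map outputs live (hosts and racks are hypothetical):

from collections import Counter

def pick_reducer_host(map_hosts, candidate_hosts, rack_of):
    # Place the reducer in the rack holding the most map output to cut inter-rack transfers.
    rack_weight = Counter(rack_of[h] for h in map_hosts)
    best_rack, _ = rack_weight.most_common(1)[0]
    in_rack = [h for h in candidate_hosts if rack_of[h] == best_rack]
    return in_rack[0] if in_rack else candidate_hosts[0]

rack_of = {"h1": "r1", "h2": "r1", "h3": "r2", "h4": "r2"}
print(pick_reducer_host(["h1", "h2", "h3"], ["h2", "h4"], rack_of))   # -> h2 (rack r1)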

Challenges Getting Information


Flow start time
Hadoop-specific details obscure the start time
A reducer transfers data from only 5 maps at a time
Tries to reduce unfairness

Reducers randomly pick which mappers to start from
Reducers start transfers at random times
Tries to reduce incast and synchronization between flows

Logs store when transfer starts
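A minimal sketch of the reactive option, scanning simplified, hypothetical log lines for the moment a reducer starts fetching map output (real Hadoop log formats differ):

import re

LOG = """
2013-04-02 10:01:07 INFO reduce_3 fetching map_12 from host h7
2013-04-02 10:01:09 INFO reduce_3 fetching map_19 from host h2
"""

FETCH = re.compile(r"^(\S+ \S+) INFO (\S+) fetching (\S+) from host (\S+)$")

def shuffle_starts(log_text):
    # Yield (timestamp, reducer, mapper, host) for every shuffle fetch seen in the log.
    for line in log_text.strip().splitlines():
        m = FETCH.match(line.strip())
        if m:
            yield m.groups()

for start in shuffle_starts(LOG):
    print(start)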

FloxBox: Simple Approach


Insight: many types of traffic exist in the network
We care about map-reduce traffic more than other traffic

Solution: prioritize map-reduce traffic


Place it in the highest-priority queue
Other traffic can't interfere

How about control messages?


Should prioritize those too.
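A minimal sketch of the idea as OpenFlow-style match → queue rules; pushing the rules to switches is omitted, and the shuffle port is an assumption:

def shuffle_priority_rules(shuffle_ports=(50060,), high_queue=0, default_queue=1):
    # Put map-reduce shuffle traffic in the top-priority queue; everything else goes lower.
    rules = []
    for port in shuffle_ports:
        rules.append({"match": {"tcp_dst": port}, "actions": {"set_queue": high_queue}})
    rules.append({"match": {}, "actions": {"set_queue": default_queue}})   # catch-all
    return rules

for rule in shuffle_priority_rules():
    print(rule)          # a real controller would install these on every switch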

Reactive Approach: FlowComb
A reactive attempt to integrate big data + SDN
No changes to the application
Learns file sizes and end-points by looking at logs
Learns start times by running agents on the end hosts

FlowComb: Architecture
Agents on servers

Detect start/end of maps
Detect start/end of transfers

[Diagram: agents on the Hadoop cluster feed FlowComb (predictor → scheduler → controller)]

Predictor
Determines the size of intermediate data
Queries maps via the API
Aggregates information from the agents and sends it to the scheduler

Figure 1: FlowComb consists of three modules: flow prediction, flow scheduling, and flow control.
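A minimal sketch of what the predictor might aggregate: agent reports of finished maps become pending (src, dst, bytes) flows for the scheduler (the structure and field names are assumptions, not FlowComb's actual code):

def aggregate_reports(reports, reducer_host):
    # Turn per-map agent reports into pending shuffle flows the scheduler can act on.
    flows = []
    for r in reports:
        if r["map_done"]:                                 # agent saw the map finish writing
            flows.append({"src": r["host"],
                          "dst": reducer_host,
                          "bytes": r["spill_bytes"]})     # intermediate data size
    return flows

reports = [{"host": "h1", "map_done": True,  "spill_bytes": 3_200_000},
           {"host": "h5", "map_done": False, "spill_bytes": 0}]
print(aggregate_reports(reports, reducer_host="h9"))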

FlowComb: Architecture
Scheduler

[Diagram: same FlowComb architecture — agents on the Hadoop cluster, and the predictor, scheduler, and controller modules]

Examines each flow that has started
For each flow, what is the ideal rate?
Is the flow currently bottlenecked?
Move it to the next shortest path with available capacity

Figure 1: FlowComb consists of three modules: flow prediction, flow scheduling, and flow control.
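A minimal sketch of that loop under simplifying assumptions (one bottleneck link per path, a deadline-derived "ideal" rate); the path and capacity bookkeeping is hypothetical, not FlowComb's implementation:

def reschedule(flows, paths, capacity):
    # Move bottlenecked flows to the next path that still has spare capacity (greedy sketch).
    used = {p: 0.0 for p in capacity}
    moves = {}
    for f in flows:                                    # examine each flow that has started
        ideal = f["bytes"] / f["deadline_s"]           # ideal rate to finish on time
        current = f["path"]
        if used[current] + ideal > capacity[current]:  # flow would be bottlenecked here
            for p in paths[f["id"]]:                   # next shortest paths, in order
                if used[p] + ideal <= capacity[p]:
                    moves[f["id"]] = p
                    current = p
                    break
        used[current] += ideal
    return moves

flows = [{"id": "f1", "bytes": 8e9, "deadline_s": 10, "path": "p1"},
         {"id": "f2", "bytes": 9e9, "deadline_s": 10, "path": "p1"}]
paths = {"f1": ["p1", "p2"], "f2": ["p1", "p2"]}
capacity = {"p1": 1.25e9, "p2": 1.25e9}               # bytes/s per path
print(reschedule(flows, paths, capacity))             # f2 gets moved to p2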

Open Questions
How about non-map-reduce traffic?
Focusing only on active transfers ignores control messages and background traffic

How about HDFS reads and writes?
FlowComb focuses only on intermediate data

Sub-optimal control loop

Any benefits for small jobs?

CoFlows: Proactive Approach
Modify the applications
Have them directly inform the network of their intent

Application informs the network of a co-flow
Co-flow: a group of flows bound by app-level semantics
Challenges:
End-points not known at the beginning of the transfer
Start times of the different flows not known
File sizes not known, but can be estimated

Interactions between Coflows


Sharing:
Sharing the cluster network among multiple coflows: how to allocate?
Reservation
Max-min fairness
Prioritization:
Using priorities as weights
Per job/application


We Want To
Better schedule the network
Intra-coflow
Inter-coflow

Write the communication layer of a new application
Without reinventing the wheel

Add unsupported coflows to an application, or
Replace an existing coflow implementation
Independent of applications


Coflow APIs

Get+put operations allow you to overcome the limitation of unknown start times. The network determines when to do the transfer.
You can call put without specifying an endpoint. The network determines where to temporarily store the data.
When the receiver calls a get, the network decides when and from where to deliver the content.


[Diagram: MapReduce driver using the Coflow API — create(SHUFFLE) returns a handle; put(handle, id, content); get(handle, id) returns content once the shuffle finishes; terminate(handle) is called when the job finishes]
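A minimal sketch of the call sequence in the figure, with a tiny in-memory stand-in for the Coflow API (the real layer moves data over the network and schedules it against other coflows):

# create/put/get/terminate, mimicking the figure; names mirror the API, not a real library.
_store, _counter = {}, [0]

def create(pattern):                   # create(SHUFFLE) -> handle
    _counter[0] += 1
    handle = "%s-%d" % (pattern, _counter[0])
    _store[handle] = {}
    return handle

def put(handle, data_id, content):     # sender side: receiver need not be known yet
    _store[handle][data_id] = content

def get(handle, data_id):              # receiver side: network decides when to transfer
    return _store[handle][data_id]

def terminate(handle):                 # job finishes: release the coflow's state
    _store.pop(handle, None)

handle = create("SHUFFLE")
put(handle, "map1.part0", b"intermediate bytes")   # called by a mapper
print(get(handle, "map1.part0"))                   # called by a reducer
terminate(handle)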

Summary
Applications know a lot about the transfers
We can reactively learn by using logs
Or modify the application to inform us of these things

Tricky information to obtain includes:


Transfer start time
Transfer end-points

CoFlows: proactive
Controls network path, transfer times, and transfer rate

FlowComb: reactive
Controls network paths based on app knowledge

ToDo
Need more images from the infobox guys
Maybe improvements and why sketchy
Maybe graphs from FlowComb also

Extensive discussion of proactive versus reactive.


Discussion on Orchestra should also include patterns from co-flow
Add IBM talk.
