Transport

Transport Layer
TCP (Transmission Control Protocol) Congestion Control
ECE/CSC 570, Fall 2014

Chapter 6: Transport Layer
Transport layer provides a mechanism for either quick unreliable delivery (UDP) or error recovery and flow control (TCP).
  TCP (Transmission Control Protocol)
  UDP (User Datagram Protocol)
Any application layer protocol on top of TCP or UDP automatically operates across the internet.

Transport services and protocols
Provide logical communication between application processes running on different hosts
Transport protocols run in end systems
Reliable, in-order unicast delivery: TCP
  congestion control
  flow control
  connection setup
Unreliable (best-effort), unordered unicast or multicast delivery: UDP
[Figure: two end hosts with application, transport, network, data link, and physical layers, connected through routers that implement only the network, data link, and physical layers]

TCP Features
Connection-oriented, duplex, reliable byte-stream service with flow control.
TCP uses sliding window
  TCP Reno, newReno: Go-Back-N.
  TCP SACK: Selective Repeat.
Connection oriented in TCP is very different from that in ATM. How?
Sequence number of the first segment of a connection is agreed upon by a three-way handshake:
  Source announces sequence number.
  Destination acks sequence number.
  Source sends first segment with that number.
TCP Features (2)
Connections are released by a two way handshake in each direction.
32-bit ack is cumulative:
  Ack of n indicates all bytes up to n-1 have been received correctly, and n is next expected byte number.
TCP does not use NACK, but duplicate Acks.
How does sender know byte n is lost?
  If seq. # n is acked multiple times, sender may infer that byte n is lost or corrupted.
  Time-out

TCP segment structure: Overview
Header layout (each row is 32 bits wide):
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum | Urg data pnter
  Options (variable length)
  application data (variable length)
Field notes:
  sequence number, acknowledgement number: counting by bytes of data (not segments!)
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)

Flow Control / Congestion Control
Two mechanisms in TCP that control the flow of information:
  Flow Control mechanism (rwnd)
  Congestion Control mechanism (cwnd)
Effective window = min (rwnd, cwnd)
Flow Control mechanism is simple.
  Receiver uses a window size field in the ack to advertise the size of the window rwnd that reflects its buffer capacity.
  An inadequate receiver buffer size may constrain the throughput of the connection regardless of the state of the network.
Congestion Control mechanism is more complex.

TCP Congestion Control (cwnd)
Initially there is a multiplicative increase of the window size (slow start)
Normal operation: AIMD Additive Increase and Multiplicative Decrease of the window size (congestion avoidance phase).
How can end-systems detect congestion?
  Router silently drops packet when congestion occurs (drop-tail)
Assumption:
  When there is a packet loss, congestion occurs somewhere in the network.
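
The effective-window rule and the remark about an inadequate receiver buffer can be made concrete with a small calculation. The sketch below is illustrative only: the function names and the sample numbers (a 64 KiB receiver buffer, a 100 ms RTT) are assumptions, not values from the slides. It relies on the fact that a sender can have at most one window of data in flight per round trip.

```python
def effective_window(rwnd_bytes: int, cwnd_bytes: int) -> int:
    """Bytes the sender may have unacknowledged: min(rwnd, cwnd)."""
    return min(rwnd_bytes, cwnd_bytes)


def max_throughput_bps(window_bytes: int, rtt_s: float) -> float:
    """Upper bound on throughput: at most one window per round trip."""
    return window_bytes * 8 / rtt_s


# Hypothetical numbers: small receiver buffer, large congestion window.
rwnd, cwnd, rtt = 65_536, 1_000_000, 0.1
window = effective_window(rwnd, cwnd)
print(window)                            # limited by rwnd, not cwnd
print(max_throughput_bps(window, rtt))   # roughly 5.2 Mbps, whatever the link speed
```

With these numbers the receiver window, not the network, caps the connection at about 5 Mbps.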
How does TCP (end-host) detect a packet loss?
Retransmission timeout (RTO)
Duplicate acknowledgements
  In practice, triple duplicate acks are considered to be a signal of a packet loss.

Detecting Packet Loss Using Retransmission Timeout (RTO)
At any time, TCP sender sets retransmission timer for one TCP packet (or segment)
If acknowledgement for the segment is not received before timer goes off, the segment is assumed to be lost
RTO dynamically calculated
  Time out period doubles for each timeout event.
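
The doubling rule can be sketched as a short schedule of successive timeout values. This is a minimal illustration, assuming a hypothetical initial RTO of 1 second and a hypothetical upper cap of 60 seconds; the slides state only that the period doubles on each timeout event.

```python
def backoff_schedule(initial_rto_s: float, timeouts: int,
                     max_rto_s: float = 60.0) -> list[float]:
    """RTO used for each successive timeout of the same segment."""
    schedule = []
    rto = initial_rto_s
    for _ in range(timeouts):
        schedule.append(rto)
        # Double after every timeout; the cap is an assumption, not from the slides.
        rto = min(rto * 2, max_rto_s)
    return schedule


print(backoff_schedule(1.0, 5))  # [1.0, 2.0, 4.0, 8.0, 16.0]
```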
Detecting Packet Loss Using Dupacks: Fast Retransmit Mechanism
Duplicate acks (dupacks) may be generated when a packet is lost.
TCP sender assumes that a packet loss has occurred if it receives three dupacks consecutively.
Duplicate acks (dupacks) may also be generated due to out-of-order packet (segment) delivery.
[Figure: segments in flight, arriving in the order 7, 9, 10, 11, 8, 12]
3 dupacks are also generated if a packet is delivered at least 3 places beyond its in-sequence location
Fast retransmit useful only if lower layers deliver packets almost ordered; otherwise, fast retransmit is unnecessary
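
The reordering case can be checked with a small receiver model. The sketch below is a simplification: it numbers whole segments rather than bytes, and it assumes the arrival order 7, 9, 10, 11, 8, 12 (the figure's segment numbers read in arrival order). The function names are hypothetical.

```python
def acks_for_arrivals(next_expected: int, arrivals: list[int]) -> list[int]:
    """Cumulative ACK (next expected segment) sent after each arrival."""
    buffered = set()
    acks = []
    for seg in arrivals:
        if seg >= next_expected:
            buffered.add(seg)
        # Advance past every segment that is now in sequence.
        while next_expected in buffered:
            buffered.remove(next_expected)
            next_expected += 1
        acks.append(next_expected)
    return acks


def max_dupacks(acks: list[int]) -> int:
    """Longest run of repeated ACK values, not counting the first copy."""
    best = run = 0
    for prev, cur in zip(acks, acks[1:]):
        run = run + 1 if cur == prev else 0
        best = max(best, run)
    return best


acks = acks_for_arrivals(7, [7, 9, 10, 11, 8, 12])
print(acks)               # [8, 8, 8, 8, 12, 13]
print(max_dupacks(acks))  # 3
```

Segment 8 is not lost here, only delayed by three places, yet the sender still sees three dupacks for it and would retransmit it needlessly.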
TCP Round Trip Time (RTT) and Timeout
Q: How to set TCP timeout value?
  longer than RTT
    note: RTT will vary
  too short: premature timeout
    unnecessary retransmissions
  too long: slow reaction to segment loss
Q: How to estimate RTT?
  SampleRTT: measured time from segment transmission until ACK receipt
    ignore retransmissions, or cumulatively ACKed segments
  SampleRTT will vary, want estimated RTT smoother
    use several recent measurements, not just current SampleRTT
TCP Round Trip Time and Timeout (2)
EstimatedRTT (new) = (1-x) (EstimatedRTT) + x (SampleRTT)
Exponential Weighted Moving Average (EWMA)
Influence of given sample decreases exponentially fast
typical value of x = 0.125

TCP Round Trip Time and Timeout (3)
Setting the timeout
  EstimatedRTT plus safety margin
  large variation in EstimatedRTT -> larger safety margin
Timeout = EstimatedRTT + 4(Deviation)
Deviation = (1-x)(Deviation) + x |SampleRTT - EstimatedRTT|
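
The two update rules and the timeout formula fit in a few lines of code. The sketch below follows the slide formulas with x = 0.125. Several details are assumptions the slides do not specify: the estimator is seeded with the first sample, the deviation starts at zero, and the deviation is updated using the estimate from before the new sample is folded in. The class name is hypothetical.

```python
class RttEstimator:
    """EWMA round-trip-time estimator following the slide formulas."""

    def __init__(self, first_sample_s: float, x: float = 0.125) -> None:
        self.x = x
        self.estimated_rtt = first_sample_s  # assumption: seed with first sample
        self.deviation = 0.0                 # assumption: start with no deviation

    def update(self, sample_rtt_s: float) -> float:
        """Fold in one SampleRTT and return the new timeout in seconds."""
        x = self.x
        # Assumption: the deviation uses the estimate from before this sample.
        self.deviation = ((1 - x) * self.deviation
                          + x * abs(sample_rtt_s - self.estimated_rtt))
        self.estimated_rtt = (1 - x) * self.estimated_rtt + x * sample_rtt_s
        return self.estimated_rtt + 4 * self.deviation


est = RttEstimator(first_sample_s=0.100)
for sample in (0.200, 0.100, 0.100):
    print(round(est.update(sample), 4))
```

A single slow sample raises the timeout sharply through the deviation term, and the following normal samples bring it back down gradually.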
TCP Congestion Control: Algorithm
Two phases
  slow start
  congestion avoidance
Important variables:
  Cwnd
  ssthresh: defines threshold between slow start phase and congestion avoidance phase
Q: What is the target window size?
probing for usable bandwidth:
  Ideally: transmit as fast as possible (Cwnd as large as possible) without loss
  increase Cwnd until loss (congestion)
  loss: decrease Cwnd, then begin probing (increasing) again

Slow Start
When connection begins, Cwnd = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec
  initial rate = 20 kbps
available bandwidth may be >> MSS/RTT
  desirable to quickly ramp up to respectable rate
Increment window size by 1 MSS on each new ack
Slow start phase ends when window size reaches (or exceeds) the slow-start threshold (= ssthresh)
cwnd grows exponentially with time during slow start
  factor of 2 per RTT if every segment ackd
  factor of 1.5 per RTT if every other segment ackd
  Could be less if sender does not always have data to send
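
The 20 kbps figure and the benefit of doubling can be checked with a short calculation. The sketch below reuses the slide's MSS = 500 bytes and RTT = 200 msec; the 10 Mbps target rate is a hypothetical value chosen for illustration, and the model assumes every segment is acked so that cwnd doubles each round trip.

```python
def rate_bps(cwnd_segments: int, mss_bytes: int, rtt_s: float) -> float:
    """Sending rate when cwnd segments are sent once per round trip."""
    return cwnd_segments * mss_bytes * 8 / rtt_s


def round_trips_to_reach(target_bps: float, mss_bytes: int, rtt_s: float) -> int:
    """Round trips of doubling needed before the rate reaches target_bps."""
    cwnd, rounds = 1, 0
    while rate_bps(cwnd, mss_bytes, rtt_s) < target_bps:
        cwnd *= 2   # one extra MSS per ACK doubles cwnd each round trip
        rounds += 1
    return rounds


print(rate_bps(1, 500, 0.2))                # initial rate, about 20 kbps
print(round_trips_to_reach(10e6, 500, 0.2))  # 9
```

Nine round trips (under 2 seconds here) take the sender from 20 kbps to over 10 Mbps, which is why slow start is "not so slow".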
TCP Slow Start
Slowstart algorithm
  initialize: Cwnd = 1
  for (each segment ACKed)
      Cwnd++
  until (loss event OR Cwnd > ssthresh)
Cwnd unit = MSS here
Exponential increase (per RTT) in window size (not so slow!)
Loss event: timeout or three duplicate ACKs
[Figure: timeline between Host A and Host B showing the number of segments sent doubling every RTT]

Congestion Avoidance
On each new ack, increase cwnd by 1/cwnd segment
cwnd increases linearly with time during congestion avoidance
[Figure: congestion window size (segments) versus time (round trips); slow start up to the slow start threshold, then congestion avoidance. Example assumes that acks are not delayed.]
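
The shape of the window curve (exponential, then linear) can be reproduced with a per-round-trip model. The sketch below is an approximation, not the per-ACK algorithm itself: it assumes no loss, one ACK per segment, a hypothetical ssthresh of 8 segments, and it clamps cwnd at ssthresh when slow start ends.

```python
def cwnd_trace(ssthresh: int, round_trips: int) -> list[int]:
    """cwnd in MSS at the start of each round trip, with no loss."""
    cwnd, trace = 1, []
    for _ in range(round_trips):
        trace.append(cwnd)
        if cwnd < ssthresh:
            # Slow start: +1 MSS per ACK, so cwnd doubles per round trip.
            cwnd = min(2 * cwnd, ssthresh)  # clamp is an assumption
        else:
            # Congestion avoidance: +1/cwnd per ACK, about +1 per round trip.
            cwnd += 1
    return trace


print(cwnd_trace(ssthresh=8, round_trips=8))  # [1, 2, 4, 8, 9, 10, 11, 12]
```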
TCP Congestion Avoidance
Congestion avoidance
  /* slowstart is over */
  /* Cwnd > ssthresh */
  Until (loss event) {
      every cwnd segments ACKed:
          Cwnd++
  }
  /* In case of TCP Tahoe */
  ssthresh = Cwnd/2
  Cwnd = 1
  perform slowstart
Cwnd unit = MSS here
Congestion Avoidance phase ends when there is a loss event.

Refinement: Inferring loss (TCP-Reno)
After 3 dup ACKs:
  Cwnd is cut in half
  window then grows linearly
But after timeout event:
  Cwnd instead set to 1 MSS;
  Enter slow-start
Philosophy:
  3 dup ACKs indicates network capable of delivering some segments
  timeout indicates a more alarming congestion scenario
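
The difference between Tahoe and Reno comes down to how each reacts to the two kinds of loss event. The sketch below encodes only that reaction, in MSS units. The function name and event labels are hypothetical, and the floor of 2 MSS on ssthresh is an assumption taken from the "(at least 2 MSS)" remark elsewhere in the deck.

```python
def on_loss(cwnd: float, event: str, variant: str = "reno") -> tuple[float, float, str]:
    """Return (new_cwnd, new_ssthresh, next_phase) after a loss event."""
    if event not in ("timeout", "3dupacks"):
        raise ValueError(f"unknown loss event: {event}")
    ssthresh = max(cwnd / 2, 2.0)  # assumed floor of 2 MSS
    if event == "timeout" or variant == "tahoe":
        # Tahoe always restarts from 1 MSS; Reno does so only on timeout.
        return 1.0, ssthresh, "slow start"
    # Reno after 3 dup ACKs: halve the window and keep growing linearly.
    return ssthresh, ssthresh, "congestion avoidance"


print(on_loss(16, "3dupacks", "tahoe"))  # (1.0, 8.0, 'slow start')
print(on_loss(16, "3dupacks", "reno"))   # (8.0, 8.0, 'congestion avoidance')
print(on_loss(16, "timeout", "reno"))    # (1.0, 8.0, 'slow start')
```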
Refinement (more)
[Figure: congestion window size (cwnd) over time, with the 3 dupacks event marked]

Congestion Control - Fast retransmit
Fast retransmit occurs when multiple (>= 3) dupacks come back
Fast recovery follows fast retransmit
Different from the case when slow start follows a timeout
  When timeout occurs, no more packets are getting across.
  When fast retransmit occurs, a packet is lost, but later packets get through
  Ack clock is still there (not expired) when fast retransmit occurs
  No need for slow start
Fast Recovery
ssthresh = min (cwnd, receiver's advertised window)/2 (at least 2 MSS)
retransmit the missing segment (fast retransmit)
cwnd = ssthresh + number of dupacks (3)
Fast recovery lasts until a non-duplicate ACK is received.
when a new ack comes: cwnd = ssthresh
Then, enter congestion avoidance
Congestion window cut into half (in TCP-Reno)

AIMD
TCP congestion avoidance:
AIMD: additive increase, multiplicative decrease
  Increase window by 1 per RTT
  Decrease window by factor of 2 on loss event

TCP Fairness
Fairness goal: If N TCP sessions share same bottleneck link, each should get 1/N of link capacity
[Figure: TCP connection 1 and TCP connection 2 passing through a bottleneck router of capacity R]

TCP Fairness
2 sessions share the common link with BW = R
R1 and R2 are throughputs of each session
Throughput increases as Cwnd grows and decreases as Cwnd does.
[Figure: R2 versus R1, showing the full bw. utilization line, the equal bandwidth share line (y=x), and the target operating point (R/2, R/2); the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]

TCP Fairness (more)
Fairness and UDP
  Multimedia apps often do not use TCP
    do not want rate throttled by congestion control
  Instead use UDP:
    pump audio/video at constant rate, tolerate packet loss
  Research area: TCP-friendly
Fairness and parallel TCP connections
  nothing prevents app from opening parallel connections between 2 hosts.
  Web browsers do this
  Example: link of rate R supporting 9 connections;
    new app asks for 1 TCP, gets rate R/10
    new app asks for 11 TCPs, gets > R/2 !
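
The convergence toward the equal-share point (R/2, R/2) can be seen in a toy simulation of two AIMD flows. The sketch below is illustrative only: it assumes both flows have the same RTT, that they see every loss at the same time, and that loss occurs exactly when their combined rate exceeds the capacity. The starting rates are hypothetical.

```python
def aimd_two_flows(r1: float, r2: float, capacity: float,
                   steps: int, increase: float = 1.0) -> tuple[float, float]:
    """Evolve two AIMD flows that share one bottleneck link."""
    for _ in range(steps):
        if r1 + r2 > capacity:
            # Loss: both flows halve, which also halves their difference.
            r1, r2 = r1 / 2, r2 / 2
        else:
            # Congestion avoidance: both add the same amount,
            # so their difference is unchanged.
            r1, r2 = r1 + increase, r2 + increase
    return r1, r2


r1, r2 = aimd_two_flows(10.0, 80.0, capacity=100.0, steps=500)
print(round(r1, 2), round(r2, 2))  # the two rates end up nearly equal
```

Additive increase moves the operating point parallel to the y=x line, while each multiplicative decrease moves it toward the origin and so closer to y=x, which is why the gap between the flows shrinks.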
Efficiency vs. Fairness
Each link provides same bandwidth R
Maximize the total throughput (sum of all throughputs)?

Summary of TCP Cong. Control
(1) Start of Connection: ssthresh is initialized.
(2) Slow Start: cwnd is increased by 1 with each non-duplicate ack.
  cwnd is increased by cwnd over one RTT.
  cwnd doubles each RTT (exponential growth).
  Slow start is not so slow!!
(3) Congestion Avoidance: After cwnd >= ssthresh, cwnd is increased by 1/cwnd with each non-duplicate ack.
  cwnd is increased by 1 each RTT.

Summary (Contd)
(4) Fast Recovery: Upon reception of 3 duplicate acks,
  ssthresh is set to cwnd/2,
  cwnd = ssthresh+3,
  For each duplicate ack received, cwnd is increased by 1 (in every RTT).
  Note: When retransmitted packet is finally acked, cwnd is set to ssthresh, and the algorithm enters Congestion Avoidance phase.
(5) Upon retransmission timeout,
  ssthresh = cwnd/2,
  cwnd = 1,
  Slow Start begins.

Throughput Analysis of TCP- Reno
Assumptions:
  Infinite data, infinite receiver window size
  Packet loss followed by fast recovery (no time-out)
  Single TCP connection, single router
  Steady-state (Congestion-avoidance phase)
  Constant RTT
Throughput Analysis of TCP (Reno) (2)
[Figure: window size in pkts versus time; a sawtooth rising from W/2 to W with slope 1/RTT, one cycle lasting D]
$D = W \cdot RTT / 2$
In D sec, the connection sends $W/2 + (W/2 + 1) + \dots + 2W/2 \approx 3W^2/8$ units (MSS)
Throughput $R = (3 \, MSS \cdot W^2 / 8) / D$
Loss rate $L = 1 / (3W^2/8)$
Solving for W: $W = \sqrt{8/(3L)}$, so $R = \sqrt{3/2} \cdot MSS / (RTT \sqrt{L}) \approx 1.22 \, MSS / (RTT \sqrt{L})$

TCP future: TCP over long, fat pipes
Single TCP connection
  Packet size = 1500 bytes
  RTT = 100ms
  Capacity (throughput) = 10Gbps
Problem with large bandwidth-delay networks
[Figure: window sawtooth with slope 1/RTT; time T between losses, T ≈ 1.5 hrs!]
Packet drop rate 1 out of 5 billion packets!
It takes too long to fill the pipe again after loss: under-utilization!
Or, to fill the pipe, the packet drop rate must be even smaller than the physical fiber error rate!
New versions of TCP for high-speed needed!
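
The long-fat-pipe numbers follow from the throughput analysis. Combining D = W·RTT/2, the 3W²/8 segments sent per cycle, and L = 1/(3W²/8) gives R = sqrt(3/2)·MSS/(RTT·sqrt(L)). The sketch below plugs in the slide's values (1500-byte packets, 100 ms RTT, 10 Gbps) to recover the required drop rate and the time between losses. The function names are hypothetical.

```python
import math


def reno_throughput_bps(mss_bytes: int, rtt_s: float, loss_rate: float) -> float:
    """Steady-state Reno throughput, R = sqrt(3/2) * MSS / (RTT * sqrt(L))."""
    return math.sqrt(1.5) * mss_bytes * 8 / (rtt_s * math.sqrt(loss_rate))


def required_loss_rate(target_bps: float, mss_bytes: int, rtt_s: float) -> float:
    """Invert the throughput formula to get the largest tolerable L."""
    return 1.5 * (mss_bytes * 8 / (rtt_s * target_bps)) ** 2


def cycle_seconds(loss_rate: float, rtt_s: float) -> float:
    """Time between losses, D = W * RTT / 2 with W = sqrt(8 / (3 L))."""
    w = math.sqrt(8 / (3 * loss_rate))
    return w * rtt_s / 2


loss = required_loss_rate(10e9, 1500, 0.1)
print(f"one drop per {1 / loss:.2e} packets")              # close to 5 billion
print(f"{cycle_seconds(loss, 0.1) / 3600:.2f} hours between losses")
```

Both results are consistent with the slide: roughly one drop per 5 billion packets and about 1.5 hours per sawtooth cycle.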
Active Queue Management (AQM) at Routers
Goal: minimize the delay (i.e., the queue size) while giving necessary congestion info. to senders in time
  Drop-tail
  Explicit Congestion Notification (ECN)
  Random Early Detection (RED)
  Random Exponential Mark (REM)
  Adaptive Virtual Queueing (AVQ)
  Hybrid of above, etc.

RED: Random Early Drop
[Figure: for each arriving packet the router flips a coin to drop or accept; the packet drop probability is 0 at low buffer occupancy and rises toward Rmax]
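
The coin flip in RED can be sketched as a drop probability that depends on buffer occupancy. The exact curve is not recoverable from the slide, so the sketch below assumes the classic shape: no drops below Rmin, a linear ramp up to a hypothetical maximum probability p_max at Rmax, and certain drop beyond Rmax. All names and numbers are illustrative.

```python
import random


def red_drop_probability(occupancy: float, r_min: float, r_max: float,
                         p_max: float = 0.1) -> float:
    """Drop probability as a function of buffer occupancy (assumed shape)."""
    if occupancy < r_min:
        return 0.0
    if occupancy >= r_max:
        return 1.0
    # Linear ramp between the two thresholds (assumption).
    return p_max * (occupancy - r_min) / (r_max - r_min)


def red_accepts(occupancy: float, r_min: float, r_max: float,
                p_max: float = 0.1, rng=random.random) -> bool:
    """Flip the coin: True means accept the packet, False means drop it."""
    return rng() >= red_drop_probability(occupancy, r_min, r_max, p_max)


for occupancy in (5, 20, 30):
    print(occupancy, red_drop_probability(occupancy, r_min=10, r_max=30))
```

Because every arriving packet faces the same coin, a connection that sends more packets sees more drops, which matches the remark that RED is more likely to drop packets from faster connections.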
RED: Random Early Drop (2)
[Figure: packet drop probability versus buffer occupancy, with thresholds Rmin and Rmax]
Effect:
  Sources learn about congestion before router is full, and slow down before facing multiple packet losses.
  Alleviate global synchronization of Drop-Tail
  RED is more likely to drop packets from faster connections.
  Faster connections are more likely to slow down.
Marking mechanism that is currently available on Cisco routers (not used though!).

Possible Improvements for TCP
(1) Increase window size at a rate independent of RTT.
  Eliminate bias in favor of connections with short RTTs.
  Very difficult to choose a rate that works well over a wide range of RTTs.
(2) Base window adjustment on delays, not loss.
  Problem: How to estimate the delays?
  Compare RTT with minimum RTT
  Idea: (RTT - min RTT) measures queueing time in routers.
  TCP-Vegas uses this approach.
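
The delay-based idea in (2) needs only a running minimum of the measured RTT. The sketch below is a minimal illustration of the (RTT - min RTT) estimate and not an implementation of TCP-Vegas; the function name and sample values are hypothetical.

```python
def queueing_delay_s(rtt_samples: list[float]) -> float:
    """Estimate queueing time as the latest RTT minus the smallest RTT seen."""
    if not rtt_samples:
        raise ValueError("need at least one RTT sample")
    base_rtt = min(rtt_samples)  # taken as the RTT with empty router queues
    return rtt_samples[-1] - base_rtt


samples = [0.100, 0.120, 0.150]  # seconds
print(round(queueing_delay_s(samples), 3))  # 0.05
```

The estimate is only as good as the minimum: if the queues were never empty while sampling, the base RTT is too high and the queueing delay is underestimated.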
(3) Explicit Congestion Notification:
  Routers mark packets during congestion and/or send signal to source.
  Requires more sophisticated routers.
  Advantage: Sends source a signal to slow down without forced retransmission.
  Disadvantage: May bias in favor of shorter connections.
(4) Base decision to drop packets on # of packets of same connection in router, not simply on total # of packets.
  Keeping track of all connections at router: more complex to implement.
(5) Predict congestion based on current load and mark packets so that receiver can send feedback to the source:
  Routers have to implement marking
(6) Cheat and modify TCP window control.
  Walrand & Varaiya say this is done all the time in commercial products.
(7) Problem with wireless links.
  Not all packet losses are due to congestion.
(8) Problem with large bandwidth-delay links
