
Transport Layer
TCP (Transmission Control Protocol) Congestion Control

ECE/CSC 570, Fall 2014

Chapter 6: Transport Layer

Transport layer provides a mechanism for either quick unreliable delivery (UDP) or error recovery and flow control (TCP).
    TCP (Transmission Control Protocol)
    UDP (User Datagram Protocol)
Any application layer protocol on top of TCP or UDP automatically operates across the internet.

Transport services and protocols

Provide logical communication between application processes running on different hosts
Transport protocols run in end systems
Reliable, in-order unicast delivery: TCP
    connection setup
    flow control
    congestion control
Unreliable (best-effort), unordered unicast or multicast delivery: UDP

[Figure: protocol stacks along an end-to-end path. The two end systems run application, transport, network, data link, physical; intermediate nodes run network, data link, physical only.]

TCP Features

Connection-oriented, duplex, reliable byte-stream service with flow control.
TCP uses sliding window
    TCP Reno, newReno: Go-Back-N.
    TCP SACK: Selective Repeat.
Connection oriented in TCP is very different from that in ATM. How?
Sequence number of the first segment of a connection is agreed upon by a three-way handshake:
    Source announces sequence number.
    Destination acks sequence number.
    Source sends first segment with that number.

TCP Features (2)

Connections are released by a two way handshake in each direction.
32-bit ack is cumulative:
    Ack of n indicates all bytes up to n-1 have been received correctly, and n is next expected byte number.
TCP does not use NACK, but duplicate Acks.
How does sender know byte n is lost?
    If seq. # n is acked multiple times, sender may infer that byte n is lost or corrupted.
    Time-out

TCP segment structure: Overview

Header layout (each row is 32 bits wide):
    source port # | dest port #
    sequence number
    acknowledgement number
    head len | not used | U A P R S F | Receive window
    checksum | Urg data pnter
    Options (variable length)
    application data (variable length)

Field notes:
    sequence number, acknowledgement number: counting by bytes of data (not segments!)
    URG: urgent data (generally not used)
    ACK: ACK # valid
    PSH: push data now (generally not used)
    RST, SYN, FIN: connection estab (setup, teardown commands)
    Receive window: # bytes rcvr willing to accept
    checksum: Internet checksum (as in UDP)

Flow Control / Congestion Control

Two mechanisms in TCP that control the flow of information:
    Flow Control mechanism (rwnd)
    Congestion Control mechanism (cwnd)
Effective window = min (rwnd, cwnd)
Flow Control mechanism is simple.
    Receiver uses a window size field in the ack to advertise the size of the window rwnd that reflects its buffer capacity.
    An inadequate receiver buffer size may constrain the throughput of the connection regardless of the state of the network.
Congestion Control mechanism is more complex.

TCP Congestion Control (cwnd)

Initially there is a multiplicative increase of the window size (slow start)
Normal operation: AIMD Additive Increase and Multiplicative Decrease of the window size (congestion avoidance phase).
How can end-systems detect congestion?
    Router silently drops packet when congestion occurs (drop-tail)
    Assumption:
        When there is a packet loss, congestion occurs somewhere in the network.
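
A minimal sketch of how the two windows combine, assuming the usual approximation that a sender can deliver at most one window of data per round trip. The function names and the sample numbers (16 KiB receive buffer, 64 KiB congestion window, 100 ms RTT) are hypothetical and not taken from the slides.

```python
def effective_window(rwnd_bytes: int, cwnd_bytes: int) -> int:
    """Bytes the sender may have outstanding: min(rwnd, cwnd)."""
    return min(rwnd_bytes, cwnd_bytes)


def max_throughput_bps(window_bytes: int, rtt_s: float) -> float:
    """Rough upper bound: at most one window of data per round trip."""
    return window_bytes * 8 / rtt_s


# Hypothetical numbers: small receive buffer, large congestion window.
rwnd, cwnd, rtt = 16 * 1024, 64 * 1024, 0.100
window = effective_window(rwnd, cwnd)
print(window)                           # limited by rwnd, not by the network
print(max_throughput_bps(window, rtt))  # bits per second
```

With these numbers the receiver buffer, not the network, caps the rate, which is the situation the flow control slide warns about.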


How does TCP (end-host) detect a packet loss?

Retransmission timeout (RTO)
Duplicate acknowledgements
    In practice, triple duplicate acks are considered to be a signal of a packet loss.

Detecting Packet Loss Using Retransmission Timeout (RTO)

At any time, TCP sender sets retransmission timer for one TCP packet (or segment)
If acknowledgement for the segment is not received before timer goes off, the segment is assumed to be lost
RTO dynamically calculated
    Time out period doubles for each timeout event.

Detecting Packet Loss Using Dupacks: Fast Retransmit Mechanism

Duplicate acks (dupacks) may be generated when a packet is lost.
TCP sender assumes that a packet loss has occurred if it receives three dupacks consecutively.
Duplicate acks (dupacks) may also be generated due to out-of-order packet (segment) delivery.
    [Figure: packets in flight, drawn as 12 8 11 10 9 7, with packet 8 behind packets 9, 10 and 11]
    3 dupacks are also generated if a packet is delivered at least 3 places beyond its in-sequence location
Fast retransmit is useful only if lower layers deliver packets almost in order; otherwise it triggers unnecessary retransmissions.
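
The reordering case can be traced with a short sketch. It assumes packets are numbered per packet (real TCP counts bytes) and that the receiver acks every arrival with the next packet number it expects; both helper names are hypothetical.

```python
def cumulative_acks(arrivals, first_expected):
    """Return the cumulative ACK number sent after each arriving packet."""
    received = set()
    expected = first_expected
    acks = []
    for pkt in arrivals:
        received.add(pkt)
        # Advance past every packet that is now available in sequence.
        while expected in received:
            expected += 1
        acks.append(expected)
    return acks


def fast_retransmit_triggers(acks, threshold=3):
    """Return the ACK numbers that were repeated `threshold` times in a row."""
    triggers = []
    last, dup_count = None, 0
    for ack in acks:
        if ack == last:
            dup_count += 1
            if dup_count == threshold:
                triggers.append(ack)
        else:
            last, dup_count = ack, 0
    return triggers


# Packet 8 is overtaken by 9, 10 and 11: reordering, no loss.
acks = cumulative_acks([7, 9, 10, 11, 8, 12], first_expected=7)
print(acks)                            # [8, 8, 8, 8, 12, 13]
print(fast_retransmit_triggers(acks))  # [8]
```

Packet 8 arrives three places late, so the sender sees three dupacks for 8 and would retransmit it even though nothing was lost. If it were only two places late, the threshold would not be reached.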


TCP Round Trip Time (RTT) and Timeout

Q: How to set TCP timeout value?
    longer than RTT
        note: RTT will vary
    too short: premature timeout
        unnecessary retransmissions
    too long: slow reaction to segment loss
Q: How to estimate RTT?
    SampleRTT: measured time from segment transmission until ACK receipt
        ignore retransmissions, or cumulatively ACKed segments
    SampleRTT will vary, want estimated RTT smoother
        use several recent measurements, not just current SampleRTT

TCP Round Trip Time and Timeout (2)

EstimatedRTT (new) = (1-x) (EstimatedRTT) + x (SampleRTT)

Exponential Weighted Moving Average (EWMA)
Influence of given sample decreases exponentially fast
typical value of x = 0.125

TCP Round Trip Time and Timeout (3)

Setting the timeout
    EstimatedRTT plus safety margin
    large variation in EstimatedRTT: larger safety margin

Timeout = EstimatedRTT + 4 (Deviation)
Deviation (new) = (1-x) (Deviation) + x |SampleRTT - EstimatedRTT|
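
The two update rules can be sketched as a small estimator. This is an illustration under assumptions the slides leave open: Deviation starts at 0, EstimatedRTT starts at the first sample, and Deviation is updated with the EstimatedRTT from before the new sample. The class name is hypothetical.

```python
class RttEstimator:
    """EWMA estimate of RTT and its deviation (times in milliseconds)."""

    def __init__(self, first_sample, x=0.125):
        self.x = x
        self.estimated_rtt = first_sample
        self.deviation = 0.0  # assumption: no initial value is given

    def update(self, sample_rtt):
        # Deviation uses the estimate from before this sample (assumed order).
        error = abs(sample_rtt - self.estimated_rtt)
        self.deviation = (1 - self.x) * self.deviation + self.x * error
        self.estimated_rtt = (1 - self.x) * self.estimated_rtt + self.x * sample_rtt

    def timeout(self):
        # EstimatedRTT plus a safety margin of four deviations.
        return self.estimated_rtt + 4 * self.deviation


est = RttEstimator(first_sample=100.0)
est.update(200.0)          # one unusually slow sample
print(est.estimated_rtt)   # 112.5
print(est.deviation)       # 12.5
print(est.timeout())       # 162.5
```

A single slow sample moves the estimate only one eighth of the way toward it, but it widens the safety margin noticeably.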


TCP Congestion Control: Algorithm

Two phases
    slow start
    congestion avoidance
Important variables:
    Cwnd
    ssthresh: defines threshold between slow start phase and congestion avoidance phase
Q: What is the target window size?
probing for usable bandwidth:
    Ideally: transmit as fast as possible (Cwnd as large as possible) without loss
    increase Cwnd until loss (congestion)
    loss: decrease Cwnd, then begin probing (increasing) again

Slow Start

When connection begins, Cwnd = 1 MSS
    Example: MSS = 500 bytes & RTT = 200 msec
    initial rate = 20 kbps
available bandwidth may be >> MSS/RTT
    desirable to quickly ramp up to respectable rate
Increment window size by 1 MSS on each new ack
Slow start phase ends when window size reaches (or exceeds) the slow-start threshold (= ssthresh)
cwnd grows exponentially with time during slow start
    factor of 2 per RTT if every segment ackd
    factor of 1.5 per RTT if every other segment ackd
    Could be less if sender does not always have data to send
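
The example rate and the two growth factors can be checked with a short sketch. It is idealized: the whole window is sent every round trip, nothing is lost, and fractional windows are allowed so the growth factor stays visible. The function names are hypothetical.

```python
def initial_rate_bps(mss_bytes, rtt_s):
    """Sending rate with a window of one MSS: one segment per round trip."""
    return mss_bytes * 8 / rtt_s


def slow_start_cwnd(rounds, segments_per_ack=1):
    """cwnd (in MSS) at the start of each round trip during slow start."""
    cwnd = 1.0
    history = [cwnd]
    for _ in range(rounds):
        acks = cwnd / segments_per_ack  # new acks arriving in this round
        cwnd += acks                    # +1 MSS per new ack
        history.append(cwnd)
    return history


print(initial_rate_bps(500, 0.2))              # the slide's example, about 20 kbps
print(slow_start_cwnd(4))                      # every segment acked: doubles each round
print(slow_start_cwnd(4, segments_per_ack=2))  # every other segment acked: grows 1.5x
```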


TCP Slow Start

Slowstart algorithm
    initialize: Cwnd = 1
    for (each segment ACKed)
        Cwnd++
    until (loss event OR Cwnd > ssthresh)

Cwnd unit = MSS here
Exponential increase (per RTT) in window size (not so slow!)
Loss event: timeout or three duplicate ACKs

[Figure: segment exchange between Host A and Host B over time, one RTT per round]

Congestion Avoidance

On each new ack, increase cwnd by 1/cwnd segment
cwnd increases linearly with time during congestion avoidance

[Figure: congestion window size (segments, 0 to 14) versus time (round trips), showing slow start up to the slow start threshold, then congestion avoidance. Example assumes that acks are not delayed.]


TCP Congestion Avoidance

In case of TCP Tahoe

Congestion avoidance
    /* slowstart is over */
    /* Cwnd > ssthresh */
    Until (loss event) {
        every cwnd segments ACKed:
            Cwnd++
    }
    ssthresh = Cwnd/2
    Cwnd = 1
    perform slowstart

Cwnd unit = MSS here
Congestion Avoidance phase ends when there is a loss event.

Refinement: Inferring loss (TCP-Reno)

After 3 dup ACKs:
    Cwnd is cut in half
    window then grows linearly
But after timeout event:
    Cwnd instead set to 1 MSS;
    Enter slow-start
Philosophy:
    3 dup ACKs indicates network capable of delivering some segments
    timeout indicates a more alarming congestion scenario
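
The difference between the two variants comes down to how they react to a loss event. A minimal sketch, with cwnd in MSS and a hypothetical `on_loss` helper; it leaves out the window inflation used during fast recovery.

```python
def on_loss(cwnd, event, variant="reno"):
    """Return (new_cwnd, new_ssthresh, next_phase) after a loss event.

    event is "timeout" or "3dupacks"; variant is "tahoe" or "reno".
    """
    if event not in ("timeout", "3dupacks"):
        raise ValueError(f"unknown loss event: {event}")
    if variant not in ("tahoe", "reno"):
        raise ValueError(f"unknown variant: {variant}")

    ssthresh = cwnd / 2
    if variant == "reno" and event == "3dupacks":
        # Reno: halve the window and keep growing linearly.
        return ssthresh, ssthresh, "congestion avoidance"
    # Tahoe on any loss, and Reno on timeout: restart from one segment.
    return 1, ssthresh, "slow start"


print(on_loss(16, "3dupacks", "tahoe"))  # (1, 8.0, 'slow start')
print(on_loss(16, "3dupacks", "reno"))   # (8.0, 8.0, 'congestion avoidance')
print(on_loss(16, "timeout", "reno"))    # (1, 8.0, 'slow start')
```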

Refinement (more)

[Figure: congestion window size (cwnd) over time, with a 3 dupacks event marked]

Congestion Control - Fast retransmit

Fast retransmit occurs when multiple (≥ 3) dupacks come back
Fast recovery follows fast retransmit
Different from the case when slow start follows a timeout
    When timeout occurs, no more packets are getting across.
    When fast retransmit occurs, a packet is lost, but later packets get through
    Ack clock is still there (not expired) when fast retransmit occurs
    No need for slow start


Fast Recovery

ssthresh = min (cwnd, receiver's advertised window)/2 (at least 2 MSS)
retransmit the missing segment (fast retransmit)
cwnd = ssthresh + number of dupacks (3)
Fast recovery lasts until a non-duplicate ACK is received.
when a new ack comes: cwnd = ssthresh
Then, enter congestion avoidance
Congestion window cut into half (in TCP-Reno)

AIMD

TCP congestion avoidance:
    AIMD: additive increase, multiplicative decrease
        Increase window by 1 per RTT
        Decrease window by factor of 2 on loss event

TCP Fairness

Fairness goal: If N TCP sessions share same bottleneck link, each should get 1/N of link capacity

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]
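
Why AIMD tends toward the fairness goal can be seen in a toy simulation of two flows. It assumes both flows see a loss in the same round whenever their combined rate exceeds the link capacity, which is a simplification; the function name and the starting rates are hypothetical.

```python
def aimd_two_flows(r1, r2, capacity, rounds, increase=1.0):
    """Simulate two AIMD flows sharing one link; return their final rates."""
    for _ in range(rounds):
        if r1 + r2 > capacity:
            # Loss event (assumed to hit both flows): multiplicative decrease.
            r1, r2 = r1 / 2, r2 / 2
        else:
            # No loss: additive increase.
            r1, r2 = r1 + increase, r2 + increase
    return r1, r2


# Start far from the fair share: flow 2 has eight times the rate of flow 1.
r1, r2 = aimd_two_flows(10.0, 80.0, capacity=100.0, rounds=200)
print(r1, r2, r2 - r1)
```

Additive increase leaves the gap between the two rates unchanged, while every multiplicative decrease halves it, so the rates drift toward an equal split.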

TCP Fairness

2 sessions share the common link with BW = R
R1 and R2 are throughputs of each session
Throughput increases as Cwnd grows and decreases as Cwnd does.

[Figure: throughput plane with R1 on the horizontal axis and R2 on the vertical axis, showing the full bw. utilization line, the equal bandwidth share line (y=x), and the target operating point (R/2, R/2). The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2".]

TCP Fairness (more)

Fairness and UDP
    Multimedia apps often do not use TCP
        do not want rate throttled by congestion control
    Instead use UDP:
        pump audio/video at constant rate, tolerate packet loss
    Research area: TCP-friendly

Fairness and parallel TCP connections
    nothing prevents app from opening parallel connections between 2 hosts.
    Web browsers do this
    Example: link of rate R supporting 9 connections;
        new app asks for 1 TCP, gets rate R/10
        new app asks for 11 TCPs, gets > R/2 !
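
The parallel-connection example is plain arithmetic once every TCP connection is assumed to get an equal share of the link. A small sketch, with a hypothetical `app_share` helper:

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of the link rate R that the new app's connections receive."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


print(app_share(9, 1))   # 1/10
print(app_share(9, 11))  # 11/20
```

With 11 connections the new app holds 11 of the 20 equal shares, which is just over half the link.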

Efficiency vs. Fairness

Each link provides same bandwidth R
Maximize the total throughput (sum of all throughputs)?

Summary of TCP Cong. Control

(1) Start of Connection: ssthresh is initialized.
(2) Slow Start: cwnd is increased by 1 with each non-duplicate ack.
    cwnd is increased by cwnd over one RTT.
    cwnd doubles each RTT (exponential growth).
    Slow start is not so slow!!
(3) Congestion Avoidance: After cwnd ≥ ssthresh, cwnd is increased by 1/cwnd with each non-duplicate ack.
    cwnd is increased by 1 each RTT.

Summary (Contd)

(4) Fast Recovery: Upon reception of 3 duplicate acks,
    ssthresh is set to cwnd/2,
    cwnd = ssthresh+3,
    For each duplicate ack received, cwnd is increased by 1 (in every RTT).
    Note: When retransmitted packet is finally acked, cwnd is set to ssthresh, and the algorithm enters Congestion Avoidance phase.
(5) Upon retransmission timeout,
    ssthresh = cwnd/2,
    cwnd = 1,
    Slow Start begins.

Throughput Analysis of TCP-Reno

Assumptions:
    Infinite data, infinite receiver window size
    Packet loss followed by fast recovery (no time-out)
    Single TCP connection, single router
    Steady-state (Congestion-avoidance phase)
    Constant RTT
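
Rules (1) to (5) of the congestion control summary can be sketched as a small event-driven sender. This is an illustration only: cwnd and ssthresh are in MSS, the initial ssthresh of 64 is a hypothetical value (the summary only says it is initialized), and the class and method names are not from the slides.

```python
class RenoSender:
    """Window bookkeeping for a Reno-style sender (units of MSS)."""

    def __init__(self, ssthresh=64.0):  # rule (1); the value is an assumption
        self.cwnd = 1.0
        self.ssthresh = ssthresh
        self.dupacks = 0
        self.in_fast_recovery = False

    def on_new_ack(self):
        self.dupacks = 0
        if self.in_fast_recovery:
            # Rule (4) note: retransmission acked, deflate and resume.
            self.cwnd = self.ssthresh
            self.in_fast_recovery = False
        elif self.cwnd < self.ssthresh:
            self.cwnd += 1              # rule (2): slow start
        else:
            self.cwnd += 1 / self.cwnd  # rule (3): congestion avoidance

    def on_dup_ack(self):
        self.dupacks += 1
        if self.in_fast_recovery:
            self.cwnd += 1              # rule (4): one more per extra dupack
        elif self.dupacks == 3:
            self.ssthresh = self.cwnd / 2
            self.cwnd = self.ssthresh + 3
            self.in_fast_recovery = True

    def on_timeout(self):
        # Rule (5): back to slow start from one segment.
        self.ssthresh = self.cwnd / 2
        self.cwnd = 1.0
        self.dupacks = 0
        self.in_fast_recovery = False


s = RenoSender(ssthresh=8.0)
for _ in range(7):
    s.on_new_ack()      # slow start: 1 -> 8
s.on_new_ack()          # congestion avoidance: 8 -> 8.125
for _ in range(3):
    s.on_dup_ack()      # third dupack enters fast recovery
print(s.cwnd, s.ssthresh)
```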

Throughput Analysis of TCP (Reno) (2)

[Figure: sawtooth of window size in pkts versus time t. The window climbs from W/2 to W with slope 1/RTT over a period of D seconds, then drops back to W/2.]

$D = W \cdot RTT / 2$
In D sec, the connection sends $W/2 + (W/2 + 1) + \dots + W \approx 3W^2/8$ units (MSS)
Throughput $R = (3 \cdot MSS \cdot W^2 / 8) / D$
Loss rate $L = 1 / (3W^2/8)$
Solving for W: $W = \sqrt{8/(3L)}$, so $R = \frac{3 \cdot MSS \cdot W}{4 \cdot RTT} = \frac{MSS}{RTT} \sqrt{\frac{3}{2L}}$

TCP future: TCP over long, fat pipes

Single TCP connection
    Packet size = 1500 bytes
    RTT = 100ms
    Capacity (throughput) = 10Gbps
Problem with large bandwidth-delay networks
[Figure: window sawtooth between W/2 and W with slope 1/RTT; the climb back to W takes T ≈ 1.5 hrs!]
It takes too long to fill the pipe again after loss: under-utilization!
Packet drop rate 1 out of 5 billion packets!
Or, to fill the pipe, the packet drop rate must be even smaller than the physical fiber error rate!
New versions of TCP for high-speed needed!
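
The long, fat pipe figures can be reproduced from the sawtooth model under one assumption: the connection's average rate, 3W/4 packets per RTT, equals the link capacity. Under that reading the model gives roughly 1.5 hours per cycle and about one loss per 4.6 billion packets, in line with the slide's rounded values. The function names are hypothetical.

```python
import math


def peak_window(rate_bps, rtt_s, mss_bytes):
    """Peak window W (packets) whose sawtooth average 3W/4 fills rate_bps."""
    return 4 * rate_bps * rtt_s / (3 * mss_bytes * 8)


def cycle_time_s(w, rtt_s):
    """D = W * RTT / 2: time to climb from W/2 back to W."""
    return w * rtt_s / 2


def packets_per_cycle(w):
    """About 3W^2/8 packets are sent between two losses."""
    return 3 * w * w / 8


def throughput_bps(loss_rate, rtt_s, mss_bytes):
    """R = 3 * MSS * W / (4 * RTT) with W = sqrt(8 / (3L))."""
    w = math.sqrt(8 / (3 * loss_rate))
    return 3 * mss_bytes * 8 * w / (4 * rtt_s)


w = peak_window(10e9, 0.100, 1500)
print(round(w))                           # peak window in packets
print(cycle_time_s(w, 0.100) / 3600)      # hours between losses
print(packets_per_cycle(w))               # packets between losses
print(throughput_bps(1 / packets_per_cycle(w), 0.100, 1500))  # back to about 10 Gbps
```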

Active Queue Managements (AQM) at Routers

Goal: minimize the delay (i.e., the queue size) while giving necessary congestion info. to senders in time
    Drop-tail
    Explicit Congestion Notification (ECN)
    Random Early Detection (RED)
    Random Exponential Marking (REM)
    Adaptive Virtual Queueing (AVQ)
    Hybrid of above, etc.

RED: Random Early Drop

Marking mechanism that is currently available on Cisco routers (not used though!).

[Figure: an arriving packet triggers a coin flip: drop or accept. Packet drop probability (vertical axis, from 0) versus buffer occupancy (horizontal axis), with thresholds Rmin and Rmax.]

RED: Random Early Drop (2)

Effect:
    Sources learn about congestion before router is full, and slow down before facing multiple packet losses.
    Alleviate global synchronization of Drop-Tail
    RED is more likely to drop packets from faster connection.
    Faster connections are more likely to slow down.
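
A minimal sketch of the RED coin flip, assuming the drop probability is 0 below Rmin, rises linearly between Rmin and Rmax, and is 1 at or above Rmax. The linear ramp and the `p_max` parameter are assumptions, since the shape of the curve is not recoverable from the slide, and the function names are hypothetical.

```python
import random


def red_drop_probability(occupancy, r_min, r_max, p_max=1.0):
    """Drop probability as a function of buffer occupancy."""
    if occupancy < r_min:
        return 0.0
    if occupancy >= r_max:
        return 1.0
    # Assumed linear ramp from 0 at r_min up to p_max at r_max.
    return p_max * (occupancy - r_min) / (r_max - r_min)


def red_accepts(occupancy, r_min, r_max, p_max=1.0, rng=random.random):
    """Flip a biased coin: True means accept the packet, False means drop it."""
    return rng() >= red_drop_probability(occupancy, r_min, r_max, p_max)


for occupancy in (5, 15, 20, 25, 35):
    print(occupancy, red_drop_probability(occupancy, r_min=10, r_max=30, p_max=0.5))
```

Because every arriving packet faces the same coin, a connection that sends more packets is dropped more often, which is why faster connections are the ones more likely to slow down.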

Possible Improvements for TCP

(1) Increase window size at a rate independent of RTT.
    Eliminate bias in favor of connections with short RTTs.
    Very difficult to choose a rate that works well over a wide range of RTTs.
(2) Base window adjustment on delays, not loss.
    Problem: How to estimate the delays?
    Compare RTT with minimum RTT
    Idea: (RTT - min RTT) measures queueing time in routers.
    TCP-Vegas uses this approach.
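
The delay-based idea in (2) can be sketched in a few lines. This only shows the queueing-time estimate, RTT minus the smallest RTT seen so far; it is not an implementation of TCP-Vegas, and the function name and sample values are hypothetical.

```python
def queueing_delay_estimates(rtt_samples):
    """For each RTT sample, return RTT minus the minimum RTT seen so far."""
    min_rtt = float("inf")
    estimates = []
    for rtt in rtt_samples:
        min_rtt = min(min_rtt, rtt)  # best guess at the delay with empty queues
        estimates.append(rtt - min_rtt)
    return estimates


# RTT samples in milliseconds.
print(queueing_delay_estimates([100, 120, 95, 140]))  # [0, 20, 0, 45]
```

A growing estimate suggests queues are building in the routers, which a delay-based sender can treat as an early congestion signal before any packet is lost.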


Possible Improvements for TCP

(3) Explicit Congestion Notification:
    Routers mark packets during congestion and/or send signal to source.
    Advantage: Sends source a signal to slow down without forced retransmission.
    Disadvantage: Routers have to implement marking
(4) Base decision to drop packets on # of packets of same connection in router, not simply on total # of packets.
    Keeping track of all connections at router: more complex to implement.

Possible Improvements for TCP

(5) Predict congestion based on current load and mark packets so that receiver can send feedback to the source:
    Requires more sophisticated routers.
    May bias in favor of shorter connections.
(6) Cheat and modify TCP window control.
    Walrand & Varaiya say this is done all the time in commercial products.
(7) Problem with wireless links.
    Not all packet losses are due to congestion.
(8) Problem with large bandwidth-delay links
