
CS2307-COMPUTER NETWORKS LAB MANUAL

SEM/YEAR:VI/III STAFF CODE:CS66


STAFF NAME:D.SORNA SHANTHI
EX NO: 1 PROGRAM USING TCP SOCKETS
EX NO: 1.i DATE AND TIME SERVER
AIM:
To implement a date and time display from the local host to the server using TCP.
ALGORITHM: CLIENT
1. Start the program.
2. Create a socket in the client to connect to the server.
3. Connect to the server; once the connection is accepted, read the system date and time sent in the server's reply.
4. Stop the program.
ALGORITHM: SERVER
1. Start the program.
2. Create a socket in the server to accept client connections.
3. Send the current date and time to the client.
4. Stop the program.
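The manual gives no listing for this exercise, so the following is only a minimal sketch of the server's third step: turning a clock value into the text line that would be written to the client socket. The function name `format_timestamp` and the "YYYY-MM-DD HH:MM:SS" layout are assumptions made for illustration, not part of the original exercise.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Format `t` (seconds since the epoch) as a UTC "YYYY-MM-DD HH:MM:SS" string.
 * Returns the number of characters written, or 0 if `out` is too small.
 * Hypothetical helper: a server would write the result to the client socket. */
size_t format_timestamp(time_t t, char *out, size_t out_size)
{
    struct tm *tm_utc;

    assert(out != NULL);
    tm_utc = gmtime(&t);            /* broken-down UTC time */
    if (tm_utc == NULL)
        return 0;
    return strftime(out, out_size, "%Y-%m-%d %H:%M:%S", tm_utc);
}
```

A server would typically call `format_timestamp(time(NULL), buf, sizeof buf)` after accepting a connection and send `buf` to the client.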
EX NO: 1.ii CLIENT-SERVER APPLICATION FOR CHAT
AIM:
To write a client-server application for chat using TCP.
ALGORITHM: CLIENT
1. Start the program.
2. Create a socket in the client to connect to the server.
3. The client establishes a connection to the server.
4. Once the connection is accepted, the client sends data to the server and receives data from it in return.
5. The client notifies the server when it reaches the end of the message.
6. Stop the program.
ALGORITHM: SERVER
1. Start the program.
2. Create a socket in the server to accept client connections.
3. The server accepts a connection from the client.
4. Once the connection is accepted, the server sends data to the client and receives data from it in return.
5. The server notifies the client when it reaches the end of the message.
6. Stop the program.
EX NO: 1.iii IMPLEMENTATION OF TCP/IP ECHO
AIM:
To implement an echo client-server using TCP/IP.
ALGORITHM:
1. Start the program.
2. Create a socket in the client to connect to the server.
3. The client establishes a connection to the server.
4. Once the connection is accepted, the client sends data to the server, and the server replies by echoing the message back to the client.
5. The client notifies the server when it reaches the end of the message.
6. Stop the program.
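The echo step can be sketched independently of how the connection was made. The helper below is an illustration only: `echo_once` is a hypothetical name, and it assumes the server already holds a connected stream descriptor (for example, the value returned by accept()).

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Read one chunk of data from the connected descriptor `fd` and write the
 * same bytes back. Returns the number of bytes echoed, 0 when the peer has
 * closed the connection, or -1 on error. */
ssize_t echo_once(int fd)
{
    char buf[256];
    ssize_t n;
    ssize_t sent = 0;

    assert(fd >= 0);
    n = read(fd, buf, sizeof buf);
    if (n <= 0)
        return n;
    while (sent < n) {              /* write() may accept fewer bytes than asked */
        ssize_t w = write(fd, buf + sent, (size_t)(n - sent));
        if (w < 0)
            return -1;
        sent += w;
    }
    return n;
}
```

A server loop would call `echo_once` repeatedly until it returns 0, which corresponds to the client signalling the end of the message by closing its side.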
EX NO: 2 PROGRAM USING SIMPLE UDP
EX NO: 2.i DOMAIN NAME SYSTEM
AIM:
To write a C program to develop a DNS client-server that resolves the given hostname.
ALGORITHM:
1. Create a new file. Enter the domain name and address in that file.
2. Establish the connection between the client and the server.
3. Compile and execute the program.
4. Enter the domain name as input.
5. The IP address corresponding to the domain name is displayed on the screen.
6. Enter the IP address on the screen.
7. The domain name corresponding to the IP address is displayed on the screen.
8. Stop the program.
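PROGRAM :
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <netdb.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>   /* inet_ntoa() */
int main(int argc, char *argv[])
{
    struct hostent *hen;
    /* Exactly one argument is expected: the hostname to resolve. */
    if (argc != 2)
    {
        fprintf(stderr, "Enter the hostname \n");
        exit(1);
    }
    hen = gethostbyname(argv[1]);
    if (hen == NULL)
    {
        /* Stop here: hen must not be dereferenced when the lookup fails. */
        fprintf(stderr, "Host not found \n");
        exit(1);
    }
    printf("Hostname is %s \n", hen->h_name);
    /* h_addr is the first address in the list, in network byte order. */
    printf("IP address is %s \n", inet_ntoa(*((struct in_addr *)hen->h_addr)));
    return 0;
}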
RESULT:
Thus the above UDP program using the domain name server was executed successfully.
EX NO: 2.ii PROGRAM USING UDP SOCKET
AIM:
To write a client-server application for chat using UDP.
ALGORITHM: CLIENT
1. Include the necessary packages in Java.
2. The client opens a datagram socket to reach the server. UDP is connectionless, so no connection handshake takes place.
3. The client sends data to the server and receives data from it in return.
4. The client notifies the server when it reaches the end of the message.
5. Stop the program.
ALGORITHM: SERVER
1. Include the necessary packages in Java.
2. The server opens a datagram socket on a known port and waits for datagrams from the client.
3. The server sends data to the client and receives data from it in return.
4. The server notifies the client when it reaches the end of the message.
5. Stop the program.
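The algorithm above names Java, but the manual's listings are in C, so the datagram exchange is sketched here in C as well. This is a minimal illustration under stated assumptions: the helper names `udp_bind_loopback`, `udp_send_line` and `udp_recv_line` are hypothetical, both ends run on the loopback address, and a two-second receive timeout is set so that a lost datagram cannot block forever.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Create a UDP socket bound to 127.0.0.1 on a port chosen by the OS.
 * The bound address is stored in *addr. Returns the socket, or -1 on error. */
int udp_bind_loopback(struct sockaddr_in *addr)
{
    socklen_t len = sizeof *addr;
    struct timeval tv = { 2, 0 };   /* receive timeout: 2 seconds */
    int fd;

    assert(addr != NULL);
    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port = 0;             /* 0 lets the OS pick a free port */
    if (bind(fd, (struct sockaddr *)addr, sizeof *addr) < 0 ||
        getsockname(fd, (struct sockaddr *)addr, &len) < 0 ||
        setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Send one chat line as a single datagram to `peer`. */
ssize_t udp_send_line(int fd, const struct sockaddr_in *peer, const char *line)
{
    return sendto(fd, line, strlen(line), 0,
                  (const struct sockaddr *)peer, sizeof *peer);
}

/* Receive one datagram into `buf` (NUL-terminated) and record the sender. */
ssize_t udp_recv_line(int fd, char *buf, size_t size, struct sockaddr_in *from)
{
    socklen_t len = sizeof *from;
    ssize_t n = recvfrom(fd, buf, size - 1, 0, (struct sockaddr *)from, &len);

    if (n >= 0)
        buf[n] = '\0';
    return n;
}
```

Because no connection exists, the server learns where to reply only from the sender address returned by recvfrom(); that is the main structural difference from the TCP chat in exercise 1.ii.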
EX NO 3 :
PROGRAMS USING RAW SOCKETS (LIKE PACKET CAPTURING AND FILTERING)
AIM :
To implement programs using raw sockets (like packet capturing and filtering).
ALGORITHM :
1. Start the program and include the necessary header files.
2. Define the packet length.
3. Declare the IP header structure and the TCP header structure.
4. Use a simple checksum function to compute the header checksum.
5. Use the TCP/IP communication protocol to execute the program.
6. Enter the source IP address and port number and the target IP address and port number.
7. Create the raw socket with socket() and send the packet with sendto(); the TCP header carries the SYN and ACK flag settings.
8. Stop the program.
//---cat rawtcp.c---
// Run as root or SUID 0, just datagram no data/payload
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>      // exit(), atoi()
#include <string.h>      // memset()
#include <sys/socket.h>
#include <netinet/ip.h>
#include <netinet/tcp.h>
#include <arpa/inet.h>   // inet_addr(), htons(), htonl()
// Packet length
#define PCKT_LEN 8192
// May create separate header file (.h) for all
// headers' structures
// IP header's structure (20 bytes, no options)
struct ipheader {
    unsigned char iph_ihl:4,          /* Little-endian: header length in 32-bit words */
                  iph_ver:4;
    unsigned char iph_tos;
    unsigned short int iph_len;
    unsigned short int iph_ident;
    unsigned short int iph_offset;    /* 3 flag bits + 13-bit fragment offset */
    unsigned char iph_ttl;
    unsigned char iph_protocol;
    unsigned short int iph_chksum;
    unsigned int iph_sourceip;
    unsigned int iph_destip;
};
/* Structure of a TCP header (20 bytes, no options) */
struct tcpheader {
    unsigned short int tcph_srcport;
    unsigned short int tcph_destport;
    unsigned int tcph_seqnum;
    unsigned int tcph_acknum;
    unsigned char tcph_reserved:4,    /*little-endian*/
                  tcph_offset:4;      /*length of tcp header in 32-bit words*/
    unsigned char tcph_fin:1,         /*Finish flag "fin"*/
                  tcph_syn:1,         /*Synchronize sequence numbers to start a connection*/
                  tcph_rst:1,         /*Reset flag */
                  tcph_psh:1,         /*Push, sends data to the application*/
                  tcph_ack:1,         /*acknowledge*/
                  tcph_urg:1,         /*urgent pointer*/
                  tcph_res2:2;
    unsigned short int tcph_win;
    unsigned short int tcph_chksum;
    unsigned short int tcph_urgptr;
};
// Simple checksum function, may use others such as Cyclic Redundancy Check, CRC
// len is the number of 16-bit words in buf
unsigned short csum(unsigned short *buf, int len)
{
    unsigned long sum;
    for(sum=0; len>0; len--)
        sum += *buf++;
    sum = (sum >> 16) + (sum &0xffff);
    sum += (sum >> 16);
    return (unsigned short)(~sum);
}
int main(int argc, char *argv[])
{
    int sd;
    // No data, just datagram
    char buffer[PCKT_LEN];
    // The headers are laid out back to back at the start of the buffer
    struct ipheader *ip = (struct ipheader *) buffer;
    struct tcpheader *tcp = (struct tcpheader *) (buffer + sizeof(struct ipheader));
    struct sockaddr_in sin, din;
    int one = 1;
    const int *val = &one;
    unsigned int count;
    memset(buffer, 0, PCKT_LEN);
    if(argc != 5)
    {
        printf("- Invalid parameters!!!\n");
        printf("- Usage: %s <source hostname/IP> <source port> <target hostname/IP> <target port>\n", argv[0]);
        exit(-1);
    }
    sd = socket(PF_INET, SOCK_RAW, IPPROTO_TCP);
    if(sd < 0)
    {
        perror("socket() error");
        exit(-1);
    }
    else
        printf("socket()-SOCK_RAW and tcp protocol is OK.\n");
    // The source is redundant, may be used later if needed
    memset(&sin, 0, sizeof(sin));
    memset(&din, 0, sizeof(din));
    // Address family
    sin.sin_family = AF_INET;
    din.sin_family = AF_INET;
    // Source and destination ports, modify as needed
    sin.sin_port = htons(atoi(argv[2]));
    din.sin_port = htons(atoi(argv[4]));
    // Source and destination IP, modify as needed
    sin.sin_addr.s_addr = inet_addr(argv[1]);
    din.sin_addr.s_addr = inet_addr(argv[3]);
    // IP structure
    ip->iph_ihl = 5;
    ip->iph_ver = 4;
    ip->iph_tos = 16;
    ip->iph_len = sizeof(struct ipheader) + sizeof(struct tcpheader);
    ip->iph_ident = htons(54321);
    ip->iph_offset = 0;
    ip->iph_ttl = 64;
    ip->iph_protocol = 6; // TCP
    ip->iph_chksum = 0; // Must be zero while the checksum is computed
    // Source IP, modify as needed, spoofed, we accept through command line argument
    ip->iph_sourceip = inet_addr(argv[1]);
    // Destination IP, modify as needed, but here we accept through command line argument
    ip->iph_destip = inet_addr(argv[3]);
    // The TCP structure. The source port, spoofed, we accept through the command line
    tcp->tcph_srcport = htons(atoi(argv[2]));
    // The destination port, we accept through command line
    tcp->tcph_destport = htons(atoi(argv[4]));
    tcp->tcph_seqnum = htonl(1);
    tcp->tcph_acknum = 0;
    tcp->tcph_offset = 5;
    tcp->tcph_syn = 1;
    tcp->tcph_ack = 0;
    tcp->tcph_win = htons(32767);
    tcp->tcph_chksum = 0; // Not computed in this example
    tcp->tcph_urgptr = 0;
    // IP checksum calculation: covers the IP header only, counted in 16-bit words
    ip->iph_chksum = csum((unsigned short *) buffer, sizeof(struct ipheader) / 2);
    // Inform the kernel do not fill up the headers' structure, we fabricated our own
    if(setsockopt(sd, IPPROTO_IP, IP_HDRINCL, val, sizeof(one)) < 0)
    {
        perror("setsockopt() error");
        exit(-1);
    }
    else
        printf("setsockopt() is OK\n");
    printf("Using:::::Source IP: %s port: %d, Target IP: %s port: %d.\n", argv[1], atoi(argv[2]), argv[3], atoi(argv[4]));
    // sendto() loop, send every 2 seconds for 20 counts
    for(count = 0; count < 20; count++)
    {
        // The packet is addressed to the target (din), not to the spoofed source
        if(sendto(sd, buffer, ip->iph_len, 0, (struct sockaddr *)&din, sizeof(din)) < 0)
        {
            perror("sendto() error");
            exit(-1);
        }
        else
            printf("Count #%u - sendto() is OK\n", count);
        sleep(2);
    }
    close(sd);
    return 0;
}
RESULT :
Thus the above program using raw sockets over TCP/IP (like packet capturing and filtering) was executed successfully.
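The checksum step of this exercise can be checked in isolation. The sketch below is a byte-oriented variant of the one's-complement Internet checksum described in RFC 1071; it is offered as an illustration rather than as the manual's own routine. The name `inet_checksum` is hypothetical, and unlike the listing's `csum`, its length argument counts bytes, not 16-bit words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One's-complement Internet checksum (RFC 1071) over `len` bytes.
 * The data is read as big-endian 16-bit words, so the result does not depend
 * on the byte order of the host. Returns the checksum as a host-order value. */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    assert(data != NULL || len == 0);
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (len & 1)                        /* odd trailing byte is padded with zero */
        sum += (uint32_t)data[len - 1] << 8;
    while (sum >> 16)                   /* fold carries back into the low 16 bits */
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)~sum;
}
```

A useful property for testing: once the computed value is stored in the header's checksum field, running the function over the same header again yields 0.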
EX NO: 4 PROGRAMS USING RPC / RMI
AIM:
To implement the program using RMI.
ALGORITHM:
1. Start the program and include the necessary packages.
2. Use the Add client to get the two values.
3. Use the Add server to implement and call the Add server impl.
4. Use a public interface so that the program can be called remotely.
5. Finally, compile all the sub-programs.
6. Start the RMI registry and execute the programs.
7. Stop the program.
RESULT:
Thus the above RMI program was executed successfully.
EX NO: 5 SIMULATION OF SLIDING WINDOW PROTOCOL
AIM:
To write a C program to perform sliding window.
ALGORITHM:
1. Start the program.
2. Get the frame size from the user.
3. Create the frames based on the user request.
4. Send the frames to the server from the client side.
5. If the frames reach the server, it sends an ACK signal to the client; otherwise it sends a NACK signal to the client.
6. Stop the program.
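The window bookkeeping behind steps 4 and 5 can be sketched without any inter-process communication. The code below is an assumption-laden illustration, not the manual's method: the type `send_window` and its functions are hypothetical names, acknowledgements are treated as cumulative, a NACK triggers Go-Back-N style retransmission, and sequence numbers are not wrapped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Sender-side state for a sliding window with cumulative ACKs. */
struct send_window {
    unsigned base;      /* oldest frame not yet acknowledged */
    unsigned next_seq;  /* number of the next frame to transmit */
    unsigned size;      /* maximum number of unacknowledged frames */
};

/* True while the window still has room for another frame. */
bool window_can_send(const struct send_window *w)
{
    assert(w != NULL);
    return w->next_seq - w->base < w->size;
}

/* Transmit the next frame if allowed; returns its number, or -1 if the
 * window is full and the sender must wait for an ACK. */
int window_send(struct send_window *w)
{
    if (!window_can_send(w))
        return -1;
    return (int)w->next_seq++;
}

/* Cumulative ACK for frame `acked`: slide the window past it.
 * ACKs for frames that are not outstanding are ignored. */
void window_ack(struct send_window *w, unsigned acked)
{
    assert(w != NULL);
    if (acked >= w->base && acked < w->next_seq)
        w->base = acked + 1;
}

/* NACK (or timeout): go back and resend from the oldest unacknowledged frame. */
void window_nack(struct send_window *w)
{
    assert(w != NULL);
    w->next_seq = w->base;
}
```

With a window of size 3, the sender can transmit frames 0, 1 and 2 and must then wait; an ACK for frame 0 frees one slot, and a NACK rewinds transmission to the first unacknowledged frame.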
PROGRAM :
// SLIDING WINDOW PROTOCOL
Client :
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
struct mymsgbuf
{
    long mtype;
    char mtext[25];
};
FILE *fp;
int main()
{
    struct mymsgbuf buf;
    int msgid;
    int i=0,s;
    int count=0,frmsz;
    int a[100];
    int d;                              /* int, so that EOF can be detected */
    if((msgid=msgget(89,IPC_CREAT|0666))==-1)
    {
        printf("\n ERROR IN MSGGET");
        exit(1);
    }
    printf("\n Enter the frame size:");
    scanf("%d",&frmsz);
    if((fp=fopen("check","r"))==NULL)
    {
        printf("\n FILE NOT OPENED");
        exit(1);
    }
    printf("\n FILE OPENED");
    /* Read the expected frames; stop at end of file or when the array is full. */
    while(i<100 && (d=getc(fp))!=EOF)
    {
        a[i]=d;
        i++;
    }
    s=i;
    fclose(fp);
    /* The frame size must fit both the file contents and mtext. */
    if(frmsz<1 || frmsz>25 || frmsz>s)
    {
        printf("\n INVALID FRAME SIZE");
        exit(1);
    }
    for(i=0;i<frmsz;i++) //print from the check file
        printf("\t %c",a[i]);
    for(i=0;i<frmsz;i++)
    {
        /* Each message carries the whole frame array; the size excludes mtype. */
        if((msgrcv(msgid,&buf,sizeof(buf.mtext),0,0))==-1)
        {
            printf("\n ERROR IN MSGRCV");
            exit(1);
        }
        printf("\n RECEIVED FRAMES ARE:%c",buf.mtext[i]);
    }
    for(i=0;i<frmsz;i++)
    {
        if(a[i]==buf.mtext[i])
            count++;
    }
    if(count==frmsz)
        printf("\n FRAMES WERE RECEIVED IN CORRECT SET");
    else
        printf("\n FRAMES WERE NOT RECEIVED IN CORRECT SET");
    return 0;
}
// SLIDING WINDOW PROTOCOL
Server :
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
struct mymsgbuf
{
    long mtype;
    char mtext[25];
};
FILE *fp;
int main()
{
    struct mymsgbuf buf;
    int si,ei,sz;
    int msgid;
    int i=0,s;
    int a[100];
    int d;                              /* int, so that EOF can be detected */
    if((fp=fopen("send","r"))==NULL)
    {
        printf("\n FILE NOT OPENED");
        exit(1);
    }
    printf("\n FILE OPENED");
    printf("\n Enter starting and ending index of frame array:");
    scanf("%d%d",&si,&ei);
    sz=ei-si;
    if((msgid=msgget(89,IPC_CREAT|0666))==-1)
    {
        printf("\n ERROR IN MSGGET");
        exit(1);
    }
    while(i<100 && (d=getc(fp))!=EOF)
    {
        a[i]=d;
        i++;
    }
    s=i;
    fclose(fp);
    /* Both indexes must lie inside the data read and inside mtext. */
    if(si<0 || ei<si || ei>=25 || ei>=s)
    {
        printf("\n INVALID INDEX");
        exit(1);
    }
    memset(buf.mtext,0,sizeof(buf.mtext));
    buf.mtype=1;
    for(i=si;i<=ei;i++)
    {
        buf.mtext[i]=a[i];
    }
    for(i=si;i<=ei;i++) //the frames to be sent
        printf("\t %c",buf.mtext[i]);
    /* One message per frame in the window; the size excludes mtype. */
    for(i=0;i<=sz;i++)
    {
        if((msgsnd(msgid,&buf,sizeof(buf.mtext),0))==-1)
        {
            printf("\n ERROR IN MSGSND");
            exit(1);
        }
    }
    printf("\n FRAMES SENT");
    return 0;
}
RESULT:
Thus the above sliding window protocol program was executed successfully.
EX NO: 6 ADDRESS RESOLUTION PROTOCOL
AIM:
To get the MAC or physical address of the system using the Address Resolution Protocol.
ALGORITHM:
1. Include the necessary header files.
2. Initialize the arpreq structure to zero.
3. Get the IP address of the system as a command line argument.
4. Check whether the given IP address is valid.
5. Copy the IP address from the sockaddr_in structure to the arpreq structure using the memcpy() library function.
6. Create a socket of type SOCK_DGRAM.
7. Obtain the MAC address for the given IP address using the ioctl() system call.
8. Display the IP address and MAC address on the standard output.
PROGRAM:
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <net/if_arp.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>      /* memcpy(), strcpy() */
#include <netdb.h>
#include <sys/ioctl.h>
#include <arpa/inet.h>
int main(int argc,char *argv[])
{
    int sd;
    unsigned char *ptr;
    struct arpreq myarp={{0}};
    struct sockaddr_in sin={0};
    if(argc!=2)
    {
        printf("Usage: %s <IP address>\n",argv[0]);
        exit(1);
    }
    sin.sin_family=AF_INET;
    if(inet_aton(argv[1],&sin.sin_addr)==0)
    {
        printf("IP address entered %s is not valid\n",argv[1]);
        exit(1);
    }
    memcpy(&myarp.arp_pa,&sin,sizeof(myarp.arp_pa));
    strcpy(myarp.arp_dev,"eth0");   /* interface name; adjust for the local system */
    sd=socket(AF_INET,SOCK_DGRAM,0);
    if(sd==-1)
    {
        perror("socket");
        exit(1);
    }
    /* ioctl() returns -1 on failure, e.g. when the address is not in the ARP cache. */
    if(ioctl(sd,SIOCGARP,&myarp)==-1)
    {
        printf("No entry in ARP cache for %s\n",argv[1]);
        exit(1);
    }
    ptr=(unsigned char *)&myarp.arp_ha.sa_data[0];
    printf("MAC address for %s: ",argv[1]);
    printf("%02x:%02x:%02x:%02x:%02x:%02x\n",*ptr,*(ptr+1),*(ptr+2),*(ptr+3),*(ptr+4),*(ptr+5));
    close(sd);
    return(0);
}
RESULT:
Thus the MAC address was obtained for the given IP address using the ARP protocol.
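Displaying the hardware address is easy to get subtly wrong: unpadded `%x` output is ambiguous, because the byte pairs 0x01 0x23 and 0x12 0x03 both print as "123". The sketch below shows one way to format the six bytes with fixed width and separators; `format_mac` is a hypothetical helper introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Write the 6-byte hardware address as "aa:bb:cc:dd:ee:ff" into `out`.
 * `out` needs room for 17 characters plus the terminating NUL.
 * Returns 0 on success, or -1 if the buffer is too small. */
int format_mac(const unsigned char mac[6], char *out, size_t out_size)
{
    int n;

    assert(mac != NULL && out != NULL);
    n = snprintf(out, out_size, "%02x:%02x:%02x:%02x:%02x:%02x",
                 mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
    return (n > 0 && (size_t)n < out_size) ? 0 : -1;
}
```

In the ARP program, the six bytes would come from `arp_ha.sa_data` after a successful `ioctl()` call.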
EX NO: 7 IMPLEMENTING ROUTING PROTOCOLS
AIM:
To simulate a routing protocol, using the Border Gateway Protocol (BGP).
ALGORITHM:
1. Read the number of nodes n.
2. Read the cost matrix for the path from each node to every other node.
3. Initialize SOURCE to 1 and include node 1.
4. Compute D of each node, which is the distance from the source to that node.
5. Repeat step 6 to step 8 for n-1 nodes.
6. Choose the node that has not been included and whose distance is minimum, and include that node.
7. For every other node not yet included, compare the distance directly from the source with the distance to reach the node through the newly included node.
8. Take the minimum value as the new distance.
9. Print all the nodes with the shortest path cost from the source node.
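The nine steps above describe a single-source shortest-path search of the kind usually attributed to Dijkstra. A minimal sketch of those steps follows, under stated assumptions: the names `shortest_from`, `MAX_NODES` and `NO_LINK` are hypothetical, nodes are numbered from 0 rather than 1, and a missing direct path is marked with `INT_MAX`.

```c
#include <assert.h>
#include <limits.h>

#define MAX_NODES 10
#define NO_LINK   INT_MAX   /* marks a missing direct path (assumption of this sketch) */

/* Single-source shortest paths following the steps listed above.
 * cost[i][j] is the direct cost from i to j, or NO_LINK.
 * dist[i] receives the shortest cost from `source`, or NO_LINK if unreachable. */
void shortest_from(int n, int cost[MAX_NODES][MAX_NODES], int source,
                   int dist[MAX_NODES])
{
    int included[MAX_NODES] = {0};
    int i, round;

    assert(n > 0 && n <= MAX_NODES && source >= 0 && source < n);
    for (i = 0; i < n; i++)             /* step 4: initial distances */
        dist[i] = cost[source][i];
    dist[source] = 0;
    included[source] = 1;               /* step 3: include the source */

    for (round = 1; round < n; round++) {   /* step 5: n-1 repetitions */
        int best = -1;
        for (i = 0; i < n; i++)         /* step 6: closest node not yet included */
            if (!included[i] && dist[i] != NO_LINK &&
                (best < 0 || dist[i] < dist[best]))
                best = i;
        if (best < 0)
            break;                      /* remaining nodes are unreachable */
        included[best] = 1;
        for (i = 0; i < n; i++)         /* steps 7-8: keep the smaller distance */
            if (!included[i] && cost[best][i] != NO_LINK &&
                dist[best] + cost[best][i] < dist[i])
                dist[i] = dist[best] + cost[best][i];
    }
}
```

Step 9 then amounts to printing `dist[i]` for every node i.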
PROGRAM :
#include <stdio.h>
int main()
{
    int n;
    int i,j,k;
    int a[10][10],b[10][10];
    printf("\n Enter the number of nodes:");
    scanf("%d",&n);
    /* The matrices hold at most 10 x 10 entries. */
    if(n<1 || n>10)
    {
        printf("\n The number of nodes must be between 1 and 10\n");
        return 1;
    }
    for(i=0;i<n;i++)
    {
        for(j=0;j<n;j++)
        {
            printf("\n Enter the distance between the host %d - %d:",i+1,j+1);
            scanf("%d",&a[i][j]);
        }
    }
    /* Print the cost matrix as entered. */
    for(i=0;i<n;i++)
    {
        for(j=0;j<n;j++)
        {
            printf("%d\t",a[i][j]);
        }
        printf("\n");
    }
    /* Relax every pair (i,j) through every intermediate node k. */
    for(k=0;k<n;k++)
    {
        for(i=0;i<n;i++)
        {
            for(j=0;j<n;j++)
            {
                if(a[i][j]>a[i][k]+a[k][j])
                {
                    a[i][j]=a[i][k]+a[k][j];
                }
            }
        }
    }
    /* Copy the result; the distance from a node to itself is 0. */
    for(i=0;i<n;i++)
    {
        for(j=0;j<n;j++)
        {
            b[i][j]=a[i][j];
            if(i==j)
            {
                b[i][j]=0;
            }
        }
    }
    printf("\n The output matrix:\n");
    for(i=0;i<n;i++)
    {
        for(j=0;j<n;j++)
        {
            printf("%d\t",b[i][j]);
        }
        printf("\n");
    }
    return 0;
}
RESULT:
Thus the above program to simulate routing protocols using the Border Gateway Protocol was executed successfully.
EX NO: 8 OPEN SHORTEST PATH FIRST ROUTING PROTOCOL
AIM:
To simulate the OPEN SHORTEST PATH FIRST routing protocol based on the cost assigned to the path.
ALGORITHM:
1. Read the number of nodes n.
2. Read the cost matrix for the path from each node to every other node.
3. Initialize SOURCE to 1 and include node 1.
4. Compute D of each node, which is the distance from the source to that node.
5. Repeat step 6 to step 8 for n-1 nodes.
6. Choose the node that has not been included and whose distance is minimum, and include that node.
7. For every other node not yet included, compare the distance directly from the source with the distance to reach the node through the newly included node.
8. Take the minimum value as the new distance.
9. Print all the nodes with the shortest path cost from the source node.
PROGRAM:
#include <stdio.h>
#include <stdlib.h>
int a[5][5],n,i,j;
void getdata();
void shortest();
void display();
int main()
{
    printf("\n\n PROGRAM TO FIND SHORTEST PATH BETWEEN TWO NODES\n");
    getdata();
    shortest();
    display();
    return 0;
}
void getdata()
{
    printf("\n\nENTER THE NUMBER OF HOST IN THE GRAPH\n");
    scanf("%d",&n);
    /* The matrix holds at most 5 x 5 entries. */
    if(n<1 || n>5)
    {
        printf("\n\nTHE NUMBER OF HOST MUST BE BETWEEN 1 AND 5\n");
        exit(1);
    }
    printf("\n\nIF THERE IS NO DIRECT PATH \n");
    printf(" \n\nASSIGN THE HIGHEST DISTANCE VALUE 1000 \n");
    for(i=0;i<n;i++)
    {
        a[i][i]=0;      /* distance from a host to itself */
        for(j=0;j<n;j++)
        {
            if(i!=j)
            {
                printf("\n\nENTER THE DISTANCE BETWEEN (%d, %d): ",i+1,j+1);
                scanf("%d",&a[i][j]);
                if(a[i][j]==0)
                    a[i][j]=1000;
            }
        }
    }
}
void shortest()
{
    int i,j,k;
    /* Relax every pair (i,j) through every intermediate host k. */
    for(k=0;k<n;k++)
        for(i=0;i<n;i++)
            for(j=0;j<n;j++)
            {
                if(a[i][k]+a[k][j]<a[i][j])
                    a[i][j]=a[i][k]+a[k][j];
            }
}
void display()
{
    int i,j;
    for(i=0;i<n;i++)
        for(j=0;j<n;j++)
            if(i!=j)
            {
                printf("\n SHORTEST PATH IS : (%d,%d)--%d\n",i+1,j+1,a[i][j]);
            }
}
RESULT:
Thus the above program to simulate routing protocols using Open Shortest Path First (OSPF) was executed successfully.
EX NO : 9
Study of UDP performance
Introduction
Most network games use the User Datagram Protocol (UDP) as the underlying transport protocol. The Transmission Control Protocol (TCP), which most Internet traffic relies on, is a reliable connection-oriented protocol that allows data streams coming from a machine connected to the Internet to be received without error by any other machine on the Internet. UDP, however, is an unreliable connectionless protocol that does not guarantee accurate or unduplicated delivery of data.
Why do games use UDP?
TCP has proved too complex and too slow to sustain real-time game play. UDP allows gaming application programs to send messages to other programs with the minimum of protocol mechanism. Games do not rely upon ordered reliable delivery of data streams. What is more important to gaming applications is the prompt delivery of data. UDP allows applications to send IP datagrams to other applications without having to establish a connection and then release it later, which increases the speed of communication. UDP is described in RFC 768.
The UDP segment consists of an 8-byte header followed by the data octets.
Fields
The source and destination ports identify the end points within the source and destination machines.
The source port indicates the port of the sending process and, unless otherwise stated, it is the port to which a reply should be sent. A zero is inserted into it if it is not used.
The UDP Length field shows the length of the datagram in octets. It includes the 8-byte header and the data to be sent.
The UDP checksum field covers the UDP header, the UDP data and the pseudo-header. The pseudo-header contains the 32-bit IP addresses of the source and destination machines, the UDP protocol number and the byte count for the UDP segment. The pseudo-header helps to find undelivered packets or packets that arrive at the wrong address. However, the pseudo-header violates the protocol hierarchy because the IP addresses used in it belong to the IP layer and not to the UDP layer.
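The four fields described above can be laid out as a small structure and serialized in network byte order. This is a sketch under stated assumptions: the names `udp_header`, `udp_header_for` and `udp_header_pack` are hypothetical, and the checksum is left at zero rather than computed over the pseudo-header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { UDP_HEADER_LEN = 8 };    /* the header is always 8 octets */

struct udp_header {
    uint16_t src_port;  /* 0 if the source port is not used */
    uint16_t dst_port;
    uint16_t length;    /* header plus data, in octets */
    uint16_t checksum;  /* left at 0 here: not computed in this sketch */
};

/* Build the header for a datagram carrying `data_len` octets of data. */
struct udp_header udp_header_for(uint16_t src_port, uint16_t dst_port,
                                 uint16_t data_len)
{
    struct udp_header h;

    h.src_port = src_port;
    h.dst_port = dst_port;
    h.length = (uint16_t)(UDP_HEADER_LEN + data_len);
    h.checksum = 0;
    return h;
}

/* Serialize the header into 8 bytes, most significant byte of each field first. */
void udp_header_pack(const struct udp_header *h, uint8_t out[UDP_HEADER_LEN])
{
    uint16_t fields[4];
    size_t i;

    assert(h != NULL && out != NULL);
    fields[0] = h->src_port;
    fields[1] = h->dst_port;
    fields[2] = h->length;
    fields[3] = h->checksum;
    for (i = 0; i < 4; i++) {
        out[2 * i] = (uint8_t)(fields[i] >> 8);
        out[2 * i + 1] = (uint8_t)(fields[i] & 0xFF);
    }
}
```

For a 4-octet payload sent to port 53 with an unused source port, the Length field is 12 (8 header octets plus 4 data octets).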
UDP Latency
While TCP implements a form of flow control to stop the network from flooding, there is no such concept in UDP. This is because UDP does not rely on acknowledgements to signal successful delivery of data. Packets are simply transmitted one after another with complete disregard for the possibility of the receiver being flooded.
The effects of UDP
As mentioned before, the majority of the traffic on the Internet relies on TCP. With the explosive increase in the amount of gaming taking place on the Internet, and with most of these games using UDP, there are concerns about the effects that UDP will have on TCP traffic.
A study carried out at the University of Waikato in New Zealand suggests that UDP traffic has a negative effect on TCP throughput. UDP is now seen as being aggressive to 'network friendly applications deploying adaptive congestion control'.
UDP affects TCP throughput in much the same way as digitized speech over IP does. The study shows that UDP behaves in much the same way regardless of what application is running it.
UDP Broadcast Flooding
A broadcast is a data packet that is destined for multiple hosts. Broadcasts can occur at the data link layer and the network layer. Data-link broadcasts are sent to all hosts attached to a particular physical network. Network layer broadcasts are sent to all hosts attached to a particular logical network. The Transmission Control Protocol/Internet Protocol (TCP/IP) supports the following types of broadcast packets:
- All ones: By setting the broadcast address to all ones (255.255.255.255), all hosts on the network receive the broadcast.
- Network: By setting the broadcast address to a specific network number in the network portion of the IP address and setting all ones in the host portion of the broadcast address, all hosts on the specified network receive the broadcast. For example, when a broadcast packet is sent with the broadcast address of 131.108.255.255, all hosts on network number 131.108 receive the broadcast.
- Subnet: By setting the broadcast address to a specific network number and a specific subnet number, all hosts on the specified subnet receive the broadcast. For example, when a broadcast packet is sent with the broadcast address of 131.108.4.255, all hosts on subnet 4 of network 131.108 receive the broadcast.
Because broadcasts are recognized by all hosts, a significant goal of router configuration is to control unnecessary proliferation of broadcast packets. Cisco routers support two kinds of broadcasts: directed and flooded. A directed broadcast is a packet sent to a specific network or series of networks, whereas a flooded broadcast is a packet sent to every network. In IP internetworks, most broadcasts take the form of User Datagram Protocol (UDP) broadcasts.
Although current IP implementations use a broadcast address of all ones, the first IP implementations used a broadcast address of all zeros. Many of the early implementations do not recognize broadcast addresses of all ones and fail to respond to the broadcast correctly. Other early implementations forward broadcasts of all ones, which causes a serious network overload known as a broadcast storm. Implementations that exhibit these problems include systems based on versions of BSD UNIX prior to Version 4.3.
In the brokerage community, applications use UDP broadcasts to transport market data to the desktops of traders on the trading floor. This case study gives examples of how brokerages have implemented both directed and flooding broadcast schemes in an environment that consists of Cisco routers and Sun workstations. Note that the addresses in this network use a 10-bit netmask of 255.255.255.192.
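The network and subnet broadcast addresses described above follow one rule: keep the network (and subnet) bits and set every host bit to one. A minimal sketch of that arithmetic follows; the function names `ipv4` and `broadcast_address` are hypothetical, and addresses are held as host-order 32-bit values for readability.

```c
#include <assert.h>
#include <stdint.h>

/* Pack four dotted-quad octets into one host-order 32-bit value. */
uint32_t ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | d;
}

/* Directed broadcast address: network/subnet bits kept, host bits all ones. */
uint32_t broadcast_address(uint32_t addr, uint32_t netmask)
{
    uint32_t host_bits = ~netmask;

    /* A valid netmask has contiguous ones, so its complement is 2^k - 1. */
    assert((host_bits & (host_bits + 1)) == 0);
    return (addr & netmask) | host_bits;
}
```

With the 255.255.255.192 netmask mentioned above, a host such as 164.53.7.61 sits in a subnet whose broadcast address is 164.53.7.63; with a mask of all zeros the result is the all-ones address 255.255.255.255.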
Internetworking Case Studies
UDP broadcasts must be forwarded from a source segment (Feed network) to many destination segments that are connected redundantly. Financial market data, provided, for example, by Reuters, enters the network through the Sun workstations connected to the Feed network and is disseminated to the TIC servers. The TIC servers are Sun workstations running Teknekron Information Cluster software. The Sun workstations on the trader networks subscribe to the TIC servers for the delivery of certain market data, which the TIC servers deliver by means of UDP broadcasts. The two routers in this network provide redundancy so that if one router becomes unavailable, the other router can assume the load of the failed router without intervention from an operator. The connection between each router and the Feed network is for network administration purposes only and does not carry user traffic.
Two different approaches can be used to configure Cisco routers for forwarding UDP broadcast traffic: IP helper addressing and UDP flooding. This case study analyzes the advantages and disadvantages of each approach.
[Figure: network topology for the case study. Router A and Router B each attach to the Feed network (200.200.200.0), the TIC server network (164.53.7.0), Trader Net 1 (164.53.8.0), Trader Net 2 (164.53.9.0) and Trader Net 3 (164.53.10.0) through interfaces E0 to E4. The router interfaces use host addresses .61 and .62 on each network, and four hosts labelled TIC appear in the figure.]
Implementing IP Helper Addressing
Note: Regardless of whether you implement IP helper addressing or UDP flooding, you must use the ip forward-protocol udp global configuration command to enable the UDP forwarding. By default, the ip forward-protocol udp command enables forwarding for ports associated with the following protocols: Trivial File Transfer Protocol, Domain Name System, Time service, NetBIOS Name Server, NetBIOS Datagram Server, Boot Protocol, and Terminal Access Controller Access Control System. To enable forwarding for other ports, you must specify them as arguments to the ip forward-protocol udp command.
IP helper addressing is a form of static addressing that uses directed broadcasts to forward local and all-nets broadcasts to desired destinations within the internetwork.
To configure helper addressing, you must specify the ip helper-address command on every interface on every router that receives a broadcast that needs to be forwarded. On Router A and Router B, IP helper addresses can be configured to move data from the TIC server network to the trader networks. IP helper addressing is not the optimal solution for this type of topology because each router receives unnecessary broadcasts from the other router.
In this case, Router A receives each broadcast sent by Router B three times, one for each segment, and Router B receives each broadcast sent by Router A three times, one for each segment. When each broadcast is received, the router must analyze it and determine that the broadcast does not need to be forwarded. As more segments are added to the network, the routers become overloaded with unnecessary traffic, which must be analyzed and discarded.
When IP helper addressing is used in this type of topology, no more than one router can be configured to forward UDP broadcasts (unless the receiving applications can handle duplicate broadcasts). This is because duplicate packets arrive on the trader network. This restriction limits redundancy in the design and can be undesirable in some implementations.
To send UDP broadcasts bidirectionally in this type of topology, a second ip helper-address command must be applied to every router interface that receives UDP broadcasts. As more segments and devices are added to the network, more ip helper-address commands are required to reach them, so the administration of these routers becomes more complex over time. Note, too, that bidirectional traffic in this topology significantly impacts router performance.
[Figure: the same topology, with the flow of UDP packets marked.]
Implementing UDP Flooding
Although IP helper addressing is well suited to nonredundant, nonparallel topologies that do not require a mechanism for controlling broadcast loops, in view of these drawbacks, IP helper addressing does not work well in this topology. To improve performance, network designers considered several other alternatives:
- Setting the broadcast address on the TIC servers to all ones (255.255.255.255): This alternative was dismissed because the TIC servers have more than one interface, causing TIC broadcasts to be sent back onto the Feed network. In addition, some workstation implementations do not allow all ones broadcasts when multiple interfaces are present.
- Setting the broadcast address of the TIC servers to the major net broadcast (164.53.0.0): This alternative was dismissed because the Sun TCP/IP implementation does not allow the use of major net broadcast addresses when the network is subnetted.
- Eliminating the subnets and letting the workstations use Address Resolution Protocol (ARP) to learn addresses: This alternative was dismissed because the TIC servers cannot quickly learn an alternative route in the event of a primary router failure.
With alternatives eliminated, the network designers turned to a simpler implementation that supports redundancy without duplicating packets and that ensures fast convergence and minimal loss of data when a router fails: UDP flooding.
UDP flooding uses the spanning tree algorithm to forward packets in a controlled manner. Bridging is enabled on each router interface for the sole purpose of building the spanning tree.
The spanning tree prevents loops by stopping a broadcast from being forwarded out an interface on which the broadcast was received. The spanning tree also prevents packet duplication by placing certain interfaces in the blocked state (so that no packets are forwarded) and other interfaces in the forwarding state (so that packets that need to be forwarded are forwarded).
To enable UDP flooding, the router must be running software that supports transparent bridging, and bridging must be configured on each interface that is to participate in the flooding. If bridging is not configured for an interface, the interface will receive broadcasts, but the router will not forward those broadcasts and will not use that interface as a destination for sending broadcasts received on a different interface.
Note: Releases prior to Cisco Internetwork Operating System (Cisco IOS) Software Release 10.2 do not support flooding subnet broadcasts.
When configured for UDP flooding, the router uses the destination address specified by the ip broadcast-address command on the output interface to assign a destination address to a flooded UDP datagram. Thus, the destination address might change as the datagram propagates through the network. The source address, however, does not change.
With UDP flooding, both routers use a spanning tree to control the network topology for the purpose of forwarding broadcasts. The key commands for enabling UDP flooding are as follows:
bridge group protocol protocol
ip forward-protocol spanning tree
bridge-group group input-type-list access-list-number
Te >$-.%3 :$!1!2!, command can specify eiter te .32 'ey*ord 9for te DE$
spannin#+tree protocol> or te -333 'ey*ord 9for te IEEE Eternet protocol>. All ro"ters
in te net*or' m"st ena,le te same spannin# tree protocol. Te -: 9!$0&$.-:$!1!2!,
<:&//-/% 1$33 command "ses te data,ase created ,y te >$-.%3 :$!1!2!, command.
Only one ,roadcast pac'et arri!es at eac se#ment: and FD% ,roadcasts can tra!erse te
net*or' in ,ot directions.
N!13
Ueca"se ,rid#in# is ena,led only to ,"ild te spannin# tree data,ase: "se access lists to
pre!ent te spannin# tree from for*ardin# non+FD% traffic. Te confi#"ration e3amples
later in tis capter confi#"re an access list tat ,loc's all ,rid#ed pac'ets.
To determine which interface forwards or blocks packets, the router configuration
specifies a path cost for each interface. The default path cost for Ethernet is 100. Setting
the path cost for each interface on Router B to 50 causes the spanning tree algorithm to
place the interfaces in Router B in forwarding state. Given the higher path cost (100) for
the interfaces in Router A, the interfaces in Router A are in the blocked state and do not
forward the broadcasts. With these interface states, broadcast traffic flows through Router
B. If Router B fails, the spanning tree algorithm will place the interfaces in Router A in
the forwarding state, and Router A will forward broadcast traffic.
With one router forwarding broadcast traffic from the TIC server network to the trader
networks, it is desirable to have the other forward unicast traffic. For that reason, each
router enables the ICMP Router Discovery Protocol (IRDP), and each workstation on the
trader networks runs the irdp daemon. On Router A, the preference keyword sets a higher
IRDP preference than does the configuration for Router B, which causes each irdp daemon
to use Router A as its preferred default gateway for unicast traffic forwarding. Users of
those workstations can use netstat -rn to see how the routers are being used. On the
routers, the holdtime, maxadvertinterval, and minadvertinterval keywords reduce the
advertising interval from the default so that the irdp daemons running on the hosts expect
to see advertisements more frequently. With the advertising interval reduced, the
workstations will adopt Router B more quickly if Router A becomes unavailable. With this
configuration, when a router becomes unavailable, IRDP offers a convergence time of
less than one minute. IRDP is preferred over the Routing Information Protocol (RIP) and
default gateways for the following reasons:
• RIP takes longer to converge, typically from one to two minutes.
• Configuration of Router A as the default gateway on each Sun workstation on the trader
networks would allow those Sun workstations to send unicast traffic to Router A, but
would not provide an alternative route if Router A becomes unavailable.
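The gateway choice described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the irdp daemon's actual logic: the Advertisement class, the pick_default_gateway function, and the timing values are hypothetical names and numbers chosen for the example. The preferences (100 and 90) and the hold time (60 seconds) follow the router configurations in this section.

```python
from dataclasses import dataclass


@dataclass
class Advertisement:
    router: str         # name of the advertising router
    preference: int     # higher value is preferred
    holdtime: int       # seconds the advertisement stays valid
    received_at: float  # time the host last heard this advertisement


def pick_default_gateway(ads, now):
    """Return the live router with the highest preference, or None."""
    live = [ad for ad in ads if now - ad.received_at <= ad.holdtime]
    if not live:
        return None
    return max(live, key=lambda ad: ad.preference).router


ads = [
    Advertisement("Router A", preference=100, holdtime=60, received_at=0.0),
    Advertisement("Router B", preference=90, holdtime=60, received_at=0.0),
]
print(pick_default_gateway(ads, now=10.0))  # both routers alive

# Assume Router A falls silent after t=0 while Router B is heard again at t=45.
ads[1].received_at = 45.0
print(pick_default_gateway(ads, now=70.0))  # Router A's advertisement has expired
```

With a hold time of 60 seconds, a host in this sketch stops using a silent router within about a minute, which matches the convergence time quoted above.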
Note
Some workstation vendors include an irdp daemon with their operating systems. Source
code for an irdp daemon is available by anonymous FTP at ftp.cisco.com.
Figure 6-3 shows how data flows when the network is configured for UDP flooding.
Note This topology is broadcast intensive; broadcasts sometimes consume 20 percent of
the Ethernet bandwidth. However, this is a favorable percentage when compared to the
configuration of IP helper addressing, which, in the same network, causes broadcasts to
consume up to 50 percent of the Ethernet bandwidth.
If the hosts on the trader networks do not support IRDP, the Hot Standby Routing
Protocol (HSRP) can be used to select which router will handle unicast traffic. HSRP
allows the standby router to take over quickly if the primary router becomes unavailable.
For information about configuring HSRP, ...
... the following command:
ip forward-protocol turbo-flood
[Figure 6-3: data flow with UDP flooding. Router A and Router B both attach to the Feed
Network (200.200.200.0), the TIC server network (164.53.7.0), and Trader Net 1, Trader
Net 2, and Trader Net 3 (164.53.8.0, 164.53.9.0, 164.53.10.0) through interfaces E0 to
E4. Router A uses the .61 host addresses and Router B the .62 host addresses. Legend:
unicast packets, UDP packets.]
Note Turbo flooding increases the amount of processing that is done at interrupt level,
which increases the CPU load on the router. Turbo flooding may not be appropriate on
routers that are already under high CPU load or that must also perform other
CPU-intensive activities.
The following commands configure UDP flooding on Router A. Because this
configuration does not specify a lower path cost than the default and because the
configuration of Router B specifies a lower cost than the default with regard to UDP
flooding, Router A acts as a backup to Router B. Because this configuration specifies an
IRDP preference of 100 and because Router B specifies an IRDP preference of 90 (ip irdp
preference 90), Router A forwards unicast traffic from the trader networks, and Router B
is the backup for unicast traffic forwarding.
!Router A:
ip forward-protocol spanning-tree
ip forward-protocol udp 111
ip forward-protocol udp 3001
ip forward-protocol udp 3002
ip forward-protocol udp 3003
ip forward-protocol udp 3004
ip forward-protocol udp 3005
ip forward-protocol udp 3006
ip forward-protocol udp 5020
ip forward-protocol udp 5021
ip forward-protocol udp 5030
ip forward-protocol udp 5002
ip forward-protocol udp 1027
ip forward-protocol udp 657
!
interface ethernet 0
ip address 200.200.200.61 255.255.255.0
ip broadcast-address 200.200.200.255
no mop enabled
!
interface ethernet 1
ip address 164.53.7.61 255.255.255.192
ip broadcast-address 164.53.7.63
ip irdp
ip irdp maxadvertinterval 60
ip irdp minadvertinterval 45
ip irdp holdtime 60
ip irdp preference 100
bridge-group 1
bridge-group 1 input-type-list 201
no mop enabled
!
interface ethernet 2
ip address 164.53.8.61 255.255.255.192
ip broadcast-address 164.53.8.63
ip irdp
ip irdp maxadvertinterval 60
ip irdp minadvertinterval 45
ip irdp holdtime 60
ip irdp preference 100
bridge-group 1
bridge-group 1 input-type-list 201
no mop enabled
!
interface ethernet 3
ip address 164.53.9.61 255.255.255.192
ip broadcast-address 164.53.9.63
ip irdp
ip irdp maxadvertinterval 60
ip irdp minadvertinterval 45
ip irdp holdtime 60
ip irdp preference 100
bridge-group 1
bridge-group 1 input-type-list 201
no mop enabled
!
interface ethernet 4
ip address 164.53.10.61 255.255.255.192
ip broadcast-address 164.53.10.63
ip irdp
ip irdp maxadvertinterval 60
ip irdp minadvertinterval 45
ip irdp holdtime 60
ip irdp preference 100
bridge-group 1
bridge-group 1 input-type-list 201
no mop enabled
!
router igrp 1
network 164.53.0.0
!
ip name-server 255.255.255.255
snmp-server community public RW
snmp-server host 164.53.7.15 public
bridge 1 protocol dec
bridge 1 priority 255
access-list 201 deny 0xFFFF 0x0000
The following commands configure UDP flooding on Router B. Because this
configuration specifies a lower path cost than the default (bridge-group 1 path-cost 50)
and because the configuration of Router A accepts the default, Router B forwards UDP
packets. Because this configuration specifies an IRDP preference of 90 (ip irdp
preference 90) and because Router A specifies an IRDP preference of 100, Router B acts
as the backup for Router A for forwarding unicast traffic from the trader networks.
!Router B
ip forward-protocol spanning-tree
ip forward-protocol udp 111
ip forward-protocol udp 3001
ip forward-protocol udp 3002
ip forward-protocol udp 3003
ip forward-protocol udp 3004
ip forward-protocol udp 3005
ip forward-protocol udp 3006
ip forward-protocol udp 5020
ip forward-protocol udp 5021
ip forward-protocol udp 5030
ip forward-protocol udp 5002
ip forward-protocol udp 1027
ip forward-protocol udp 657
!
interface ethernet 0
ip address 200.200.200.62 255.255.255.0
ip broadcast-address 200.200.200.255
no mop enabled
!
interface ethernet 1
ip address 164.53.7.62 255.255.255.192
ip broadcast-address 164.53.7.63
ip irdp
ip irdp maxadvertinterval 60
ip irdp minadvertinterval 45
ip irdp holdtime 60
ip irdp preference 90
bridge-group 1
bridge-group 1 path-cost 50
bridge-group 1 input-type-list 201
no mop enabled
!
interface ethernet 2
ip address 164.53.8.62 255.255.255.192
ip broadcast-address 164.53.8.63
ip irdp
ip irdp maxadvertinterval 60
ip irdp minadvertinterval 45
ip irdp holdtime 60
ip irdp preference 90
bridge-group 1
bridge-group 1 path-cost 50
bridge-group 1 input-type-list 201
no mop enabled
!
interface ethernet 3
ip address 164.53.9.62 255.255.255.192
ip broadcast-address 164.53.9.63
ip irdp
ip irdp maxadvertinterval 60
ip irdp minadvertinterval 45
ip irdp holdtime 60
ip irdp preference 90
bridge-group 1
bridge-group 1 path-cost 50
bridge-group 1 input-type-list 201
no mop enabled
!
interface ethernet 4
ip address 164.53.10.62 255.255.255.192
ip broadcast-address 164.53.10.63
ip irdp
ip irdp maxadvertinterval 60
ip irdp minadvertinterval 45
ip irdp holdtime 60
ip irdp preference 90
bridge-group 1
bridge-group 1 path-cost 50
bridge-group 1 input-type-list 201
no mop enabled
!
router igrp 1
network 164.53.0.0
!
ip name-server 255.255.255.255
snmp-server community public RW
snmp-server host 164.53.7.15 public
bridge 1 protocol dec
bridge 1 priority 255
access-list 201 deny 0xFFFF 0x0000
EX NO: 10 STUDY OF TCP PERFORMANCE
INTRODUCTION:
The Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP) are
both IP transport-layer protocols. UDP is a lightweight protocol that allows applications
to make direct use of the unreliable datagram service provided by the underlying IP
service. UDP is commonly used to support applications that use simple query/response
transactions, or applications that support real-time communications. TCP provides a
reliable data-transfer service, and is used for both bulk data transfer and interactive data
applications. TCP is the major transport protocol in use in most IP networks, and supports
the transfer of over 90 percent of all traffic across the public Internet today. Given this
major role for TCP, the performance of this protocol forms a significant part of the total
picture of service performance for IP networks. In this article we examine TCP in further
detail, looking at what makes a TCP session perform reliably and well. This article draws
on material published in the Internet Performance Survival Guide [1].
Overview of TCP
TCP is the embodiment of reliable end-to-end transmission functionality in the overall
Internet architecture. All the functionality required to take a simple base of IP datagram
delivery and build upon this a control model that implements reliability, sequencing, flow
control, and data streaming is embedded within TCP [2].
TCP provides a communication channel between processes on each host system. The
channel is reliable, full-duplex, and streaming. To achieve this functionality, the TCP
drivers break up the session data stream into discrete segments, and attach a TCP header
to each segment. An IP header is attached to this TCP packet, and the composite packet is
then passed to the network for delivery. This TCP header has numerous fields that are
used to support the intended TCP functionality. TCP has the following functional
characteristics:
Unicast protocol: TCP is based on a unicast network model, and supports data
exchange between precisely two parties. It does not support broadcast or multicast
network models.
Connection state: Rather than impose a state within the network to support the
connection, TCP uses synchronized state between the two endpoints. This
synchronized state is set up as part of an initial connection process, so TCP can be
regarded as a connection-oriented protocol. Much of the protocol design is
intended to ensure that each local state transition is communicated to, and
acknowledged by, the remote party.
Reliable: Reliability implies that the stream of octets passed to the TCP driver at
one end of the connection will be transmitted across the network so that the
stream is presented to the remote process as the same sequence of octets, in the
same order as that generated by the sender.
This implies that the protocol detects when segments of the data stream have been
discarded by the network, reordered, duplicated, or corrupted. Where necessary,
the sender will retransmit damaged segments so as to allow the receiver to
reconstruct the original data stream. This implies that a TCP sender must maintain
a local copy of all transmitted data until it receives an indication that the receiver
has completed an accurate transfer of the data.
Full duplex: TCP is a full-duplex protocol; it allows both parties to send and
receive data within the context of the single TCP connection.
Streaming: Although TCP uses a packet structure for network transmission, TCP
is a true streaming protocol, and application-level network operations are not
transparent. Some protocols explicitly encapsulate each application transaction;
for every write, there must be a matching read. In this manner, the
application-derived segmentation of the data stream into a logical record structure is
preserved across the network. TCP does not preserve such an implicit structure
imposed on the data stream, so that there is no pairing between write and read
operations within the network protocol. For example, a TCP application may
write three data blocks in sequence into the network connection, which may be
collected by the remote reader in a single read operation. The size of the data
blocks (segments) used in a TCP session is negotiated at the start of the session.
The sender attempts to use the largest segment size it can for the data transfer,
within the constraints of the maximum segment size of the receiver, the maximum
segment size of the configured sender, and the maximum supportable
nonfragmented packet size of the network path (path Maximum Transmission Unit
[MTU]). The path MTU is refreshed periodically to adjust to any changes that
may occur within the network while the TCP connection is active.
Rate adaptation: TCP is also a rate-adaptive protocol, in that the rate of data
transfer is intended to adapt to the prevailing load conditions within the network
and adapt to the processing capacity of the receiver. There is no predetermined
TCP data-transfer rate; if the network and the receiver both have additional
available capacity, a TCP sender will attempt to inject more data into the network
to take up this available space. Conversely, if there is congestion, a TCP sender
will reduce its sending rate to allow the network to recover. This adaptation
function attempts to achieve the highest possible data-transfer rate without
triggering consistent data loss.
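The segment-size constraint mentioned under Streaming can be sketched as a simple minimum. This is an illustration only: the function name, the 40-byte header allowance (a 20-byte IP header plus a 20-byte TCP header, no options), and the sample values are assumptions made for the example.

```python
def effective_segment_size(receiver_mss, sender_mss, path_mtu, header_bytes=40):
    """Largest TCP payload that fits all three limits without fragmentation.

    header_bytes is an assumed allowance for IP and TCP headers without options.
    """
    return min(receiver_mss, sender_mss, path_mtu - header_bytes)


# Ethernet-sized path: the MSS values and the path MTU agree.
print(effective_segment_size(1460, 1460, 1500))
# A smaller path MTU becomes the binding limit.
print(effective_segment_size(1460, 1460, 576))
```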
The TCP Protocol Header
The TCP header structure, shown in Figure 1, uses a pair of 16-bit source and destination
Port addresses. The next field is a 32-bit sequence number, which identifies the sequence
number of the first data octet in this packet. The sequence number does not start at an
initial value of 1 for each new TCP connection; the selection of an initial value is critical,
because the initial value is intended to prevent delayed data from an old connection from
being incorrectly interpreted as being valid within a current connection. The sequence
number is necessary to ensure that arriving packets can be ordered in the sender's
original order. This field is also used within the flow-control structure to allow the
association of a data packet with its corresponding acknowledgment, allowing a sender
to estimate the current round-trip time across the network.
Figure 1: The TCP/IP Datagram
The acknowledgment sequence number is used to inform the remote end of the data that
has been successfully received. The acknowledgment sequence number is actually one
greater than that of the last octet correctly received at the local end of the connection. The
data offset field indicates the number of four-octet words within the TCP header. Six
single bit flags are used to indicate various conditions. URG is used to indicate whether
the urgent pointer is valid. ACK is used to indicate whether the acknowledgment field is
valid. PSH is set when the sender wants the remote TCP to push this data to the
remote application. RST is used to reset the connection. SYN (for synchronize) is used
within the connection startup phase, and FIN (for finish) is used to close the connection
in an orderly fashion. The window field is a 16-bit count of available buffer space. It is
added to the acknowledgment sequence number to indicate the highest sequence number
the receiver can accept. The TCP checksum is applied to a synthesized header that
includes the source and destination addresses from the outer IP datagram. The final field
in the TCP header is the urgent pointer, which, when added to the sequence number,
indicates the sequence number of the final octet of urgent data if the urgent flag is set.
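As a rough illustration of the header fields just described, the sketch below packs and unpacks the fixed 20-byte part of a TCP header with Python's struct module. The field names in the dictionary, the parse_tcp_header helper, and the sample port and sequence values are assumptions for the example; options and checksum verification are deliberately left out.

```python
import struct

# Fixed part of the TCP header: ports, sequence number, acknowledgment number,
# data offset + flags, window, checksum, urgent pointer (network byte order).
TCP_HEADER = struct.Struct("!HHIIHHHH")

FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_header(data):
    """Decode the first 20 bytes of a TCP segment into a dictionary."""
    src, dst, seq, ack, off_flags, window, checksum, urgent = \
        TCP_HEADER.unpack(data[:TCP_HEADER.size])
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "header_len": (off_flags >> 12) * 4,  # data offset counts 4-octet words
        "flags": sorted(name for name, bit in FLAG_BITS.items() if off_flags & bit),
        "window": window,
        "checksum": checksum,
        "urgent": urgent,
    }


# A hand-built SYN header: data offset 5 words, SYN flag set, window 65535.
sample = TCP_HEADER.pack(1025, 80, 1000, 0, (5 << 12) | FLAG_BITS["SYN"], 65535, 0, 0)
header = parse_tcp_header(sample)
print(header["header_len"], header["flags"], header["window"])
```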
Many options can be carried in a TCP header. Those relevant to TCP performance
include:
Maximum-receive-segment-size option: This option is used when the connection
is being opened. It is intended to inform the remote end of the maximum segment
size, measured in octets, that the sender is willing to receive on the TCP
connection. This option is used only in the initial SYN packet (the initial packet
exchange that opens a TCP connection). It sets both the maximum receive
segment size and the maximum size of the advertised TCP window, passed to the
remote end of the connection. In a robust implementation of TCP, this option
should be used with path MTU discovery to establish a segment size that can be
passed across the connection without fragmentation, an essential attribute of a
high-performance data flow.
Window-scale option: This option is intended to address the issue of the
maximum window size in the face of paths that exhibit a high delay-bandwidth
product. This option allows the window size advertisement to be left-shifted by
the amount specified (in binary arithmetic, a left-shift by one bit corresponds to a
multiplication by 2). Without this option, the maximum window size that can be
advertised is 65,535 bytes (the maximum value obtainable in a 16-bit field). The
limit of TCP transfer speed is effectively one window size in transit between the
sender and the receiver. For high-speed, long-delay networks, this performance
limitation is a significant factor, because it limits the transfer rate to at most
65,535 bytes per round-trip interval, regardless of available network capacity. Use
of the window-scale option allows the TCP sender to effectively adapt to
high-bandwidth, high-delay network paths, by allowing more data to be held in flight.
The maximum window size with this option
is 2^30 bytes. This option is negotiated at the start of the TCP connection, and can
be sent in a packet only with the SYN flag. Note that while an MTU discovery
process allows optimal setting of the maximum-receive-segment-size option, no
corresponding bandwidth delay product discovery allows the reliable automated
setting of the window-scale option [3].
SACK-permitted option and SACK option: This option alters the
acknowledgment behavior of TCP. SACK is an acronym for selective
acknowledgment. The SACK-permitted option is offered to the remote end
during TCP setup as an option to an opening SYN packet. The SACK option
permits selective acknowledgment of permitted data. The default TCP
acknowledgment behavior is to acknowledge the highest sequence number of
in-order bytes. This default behavior is prone to cause unnecessary retransmission of
data, which can exacerbate a congestion condition that may have been the cause
of the original packet loss. The SACK option allows the receiver to modify the
acknowledgment field to describe noncontinuous blocks of received data, so that
the sender can retransmit only what is missing at the receiver's end [4].
Any robust high-performance implementation of TCP should negotiate these parameters
at the start of the TCP session, ensuring the following: that the session is using the largest
possible IP packet size that can be carried without fragmentation, that the window sizes
used in the transfer are adequate for the bandwidth-delay product of the network path,
and that selective acknowledgment can be used for rapid recovery from line-error
conditions or from short periods of marginally degraded network performance.
TCP Operation
The first phase of a TCP session is establishment of the connection. This requires a
three-way handshake, ensuring that both sides of the connection have an unambiguous
understanding of the sequence number space of the remote side for this session. The
operation of the connection is as follows:
1. The local system sends an initial sequence number to the remote port, using a SYN
packet.
2. The remote system responds with an ACK of the initial sequence number and the
initial sequence number of the remote end in a response SYN packet.
3. The local end responds with an ACK of this remote sequence number.
4. The connection is opened.
The operation of this algorithm is shown in Figure 2. The performance implication of this
protocol exchange is that it takes one and a half round-trip times (RTTs) for the two
systems to synchronize state before any data can be sent.
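The handshake steps can be traced with a small sketch. The function name, the tuple layout, and the sample initial sequence numbers are hypothetical; the only protocol facts relied on are that the acknowledgment number is one greater than the sequence number being acknowledged and that sequence numbers are 32-bit values.

```python
SEQ_MOD = 2 ** 32  # sequence numbers wrap around at 32 bits


def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments as (sender, flags, seq, ack) tuples."""
    syn = ("client", "SYN", client_isn, None)
    # The server acknowledges the client's SYN and offers its own sequence number.
    syn_ack = ("server", "SYN+ACK", server_isn, (client_isn + 1) % SEQ_MOD)
    # The client acknowledges the server's SYN; the connection is then open.
    ack = ("client", "ACK", (client_isn + 1) % SEQ_MOD, (server_isn + 1) % SEQ_MOD)
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```

Three segments cross the network before data can flow, which is where the one and a half round-trip times come from.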
Figure 2: TCP Connection Handshake
After the connection has been established, the TCP protocol manages the reliable
exchange of data between the two systems. The algorithms that determine the various
retransmission timers have been redefined numerous times. TCP is a sliding-window
protocol, and the general principle of flow control is based on the management of the
advertised window size and the management of retransmission timeouts, attempting to
optimize protocol performance within the observed delay and loss parameters of the
connection. Tuning a TCP protocol stack for optimal performance over a very low-delay,
high-bandwidth LAN requires different settings from those needed for optimal
performance over a dialup Internet connection, which in turn differ from the
requirements of a high-speed wide-area network. Although TCP attempts to discover the
delay-bandwidth product of the connection, and attempts to automatically optimize its
flow rates within the estimated parameters of the network path, some estimates will not
be accurate, and the corresponding efforts by TCP to optimize behavior may not be
completely successful.
Another critical aspect is that TCP is an adaptive flow-control protocol. TCP uses a basic
flow-control algorithm of increasing the data-flow rate until the network signals that
some form of saturation level has been reached (normally indicated by data loss). When
the sender receives an indication of data loss, the TCP flow rate is reduced; when reliable
transmission is reestablished, the flow rate slowly increases again.
If no reliable flow is reestablished, the flow rate backs further off to an initial probe of a
single packet, and the entire adaptive flow-control process starts again.
This process has numerous results relevant to service quality. First, TCP behaves
adaptively, rather than predictively. The flow-control algorithms are intended to increase
the data-flow rate to fill all available network path capacity, but they are also intended to
quickly back off if the available capacity changes because of interaction with other
traffic, or if a dynamic change occurs in the end-to-end network path. For example, a
single TCP flow across an otherwise idle network attempts to fill the network path with
data, optimizing the flow rate within the available network capacity. If a second TCP
flow opens up across the same path, the two flow-control algorithms will interact so that
both flows will stabilize to use approximately half of the available capacity per flow. The
objective of the TCP algorithms is to adapt so that the network is fully used whenever
one or more data flows are present. In design, tension always exists between the
efficiency of network use and the enforcement of predictable session performance. With
TCP, you give up predictable throughput but gain a highly utilized, efficient network.
Interactive TCP
Interactive protocols are typically directed at supporting single character interactions,
where each character is carried in a single packet, as is its echo. The protocol interaction
to support this is indicated in Figure 3.
Figure 3: Interactive Exchange
These 2 bytes of data generate four TCP/IP packets, or 160 bytes of protocol overhead.
TCP makes some small improvement in this exchange through the use of piggybacking,
where an ACK is carried in the same packet as the data, and delayed acknowledgment,
where an ACK is delayed up to 200 ms before sending, to give the server application the
opportunity to generate data that the ACK can piggyback. The resultant protocol
exchange is indicated in Figure 4.
Figure 4: Interactive Exchange with Delayed ACK
For short-delay LANs, this protocol exchange offers acceptable performance. This
protocol exchange for a single data character and its echo occurs within about 16 ms on
an Ethernet LAN, corresponding to an interactive rate of 60 characters per second. When
the network delay is increased in a WAN, these small packets can be a source of
congestion load. The TCP mechanism to address this small-packet congestion was
described by John Nagle in RFC 896 [5]. Commonly referred to as the Nagle Algorithm,
this mechanism inhibits a sender from transmitting any additional small segments while
the TCP connection has outstanding unacknowledged small segments. On a LAN, this
modification to the algorithm has a negligible effect; in contrast, on a WAN, it has a
dramatic effect in reducing the number of small packets in direct correlation to the
network path congestion level (as shown in Figures 5 and 6). The cost is an increase in
session jitter by up to a round-trip time interval. Applications that are jitter-sensitive
typically disable this control algorithm.
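The Nagle rule can be written as a short decision function. This is a simplified sketch, not the wording of RFC 896 or any real stack's code: the function name and parameters are hypothetical, and it treats any outstanding unacknowledged data as the condition that holds back a small segment.

```python
def nagle_may_send(segment_len, mss, unacked_bytes):
    """Return True if a segment may be sent now under a simplified Nagle rule."""
    if segment_len >= mss:
        return True            # full-sized segments are never held back
    return unacked_bytes == 0  # a small segment waits until earlier data is ACKed


# A single typed character on an idle connection goes out immediately.
print(nagle_may_send(segment_len=1, mss=536, unacked_bytes=0))
# The next character waits while the first one is still unacknowledged.
print(nagle_may_send(segment_len=1, mss=536, unacked_bytes=1))
# A full-sized segment is sent regardless of outstanding data.
print(nagle_may_send(segment_len=536, mss=536, unacked_bytes=1))
```

Applications that disable the algorithm usually do so per socket; in Python's socket module the relevant option is socket.TCP_NODELAY.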
Figure 5: WAN Interactive Exchange
Figure 6: WAN Interactive Exchange with Nagle Algorithm
TCP is not a highly efficient protocol for the transmission of interactive traffic. The
typical carriage efficiency of the protocol across a LAN is 2 bytes of payload and 120
bytes of protocol overhead. Across a WAN, the Nagle algorithm may improve this
carriage efficiency slightly by increasing the number of bytes of payload for each payload
transaction, although it will do so at the expense of increased session jitter.
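The overhead figures quoted for interactive traffic follow from simple arithmetic, sketched below. The 40-byte per-packet header (20 bytes of IP plus 20 bytes of TCP, no options) and the reading of the 120-byte figure as three packets are assumptions made for the example; the function name is hypothetical.

```python
HEADER_BYTES = 40  # assumed: 20-byte IP header + 20-byte TCP header, no options


def carriage_efficiency(payload_bytes, packets):
    """Return (overhead_bytes, payload fraction of all bytes carried)."""
    overhead = packets * HEADER_BYTES
    return overhead, payload_bytes / (payload_bytes + overhead)


# Character, its ACK, the echo, and the echo's ACK: four packets.
print(carriage_efficiency(payload_bytes=2, packets=4))
# With piggybacking and delayed ACKs, assumed here to save one packet.
print(carriage_efficiency(payload_bytes=2, packets=3))
```

In both cases the payload is under 2 percent of the bytes on the wire.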
TCP Volume Transfer
The objective for this application is to maximize the efficiency of the data transfer,
implying that TCP should endeavor to locate the point of dynamic equilibrium of
maximum network efficiency, where the sending data rate is maximized just prior to the
onset of sustained packet loss.
Further increasing the sending rate from such a point will run the risk of generating a
congestion condition within the network, with rapidly increasing packet-loss levels. This,
in turn, will force the TCP protocol to retransmit the lost data, resulting in reduced
data-transfer efficiency. On the other hand, attempting to completely eliminate packet-loss
rates implies that the sender must reduce the sending rate of data into the network so as
not to create transient congestion conditions along the path to the receiver. Such an action
will, in all probability, leave the network with idle capacity, resulting in inefficient use of
available network resources.
The notion of a point of equilibrium is an important one. The objective of TCP is to
coordinate the actions of the sender, the network, and the receiver so that the network
path has sufficient data such that the network is not idle, but it is not so overloaded that a
congestion backlog builds up and data loss occurs. Maintaining this point of equilibrium
requires the sender and receiver to be synchronized so that the sender passes a packet into
the network at precisely the same time as the receiver removes a packet from the
network. If the sender attempts to exceed this equilibrium rate, network congestion will
occur. If the sender attempts to reduce its rate, the efficiency of the network will drop.
TCP uses a sliding-window protocol to support bulk data transfer (Figure 7).
Figure 7: TCP Sliding Window
The receiver advertises to the sender the available buffer space at the receiver. The sender
can transmit up to this amount of data before having to await a further buffer update from
the receiver. The sender should have no more than this amount of data in transit in the
network. The sender must also buffer sent data until it has been ACKed by the receiver.
The send window is the minimum of the sender's buffer size and the advertised receiver
window. Each time an ACK is received, the trailing edge of the send window is
advanced. The minimum of the sender's buffer and the advertised receiver's window is
used to calculate a new leading edge. If this send window encompasses unsent data, this
data can be sent immediately.
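The send-window bookkeeping can be sketched as follows. The function and parameter names are hypothetical, and byte counters stand in for real sequence numbers (no wraparound handling), so treat this as an illustration of the rule rather than an implementation.

```python
def usable_window(sender_buffer, advertised_window, last_byte_sent, last_byte_acked):
    """Bytes the sender may still transmit before it must wait for an ACK."""
    send_window = min(sender_buffer, advertised_window)
    in_flight = last_byte_sent - last_byte_acked  # sent but not yet ACKed
    return max(0, send_window - in_flight)


# 2000 bytes in flight against a 4096-byte window leaves room for 2096 more.
print(usable_window(8192, 4096, last_byte_sent=3000, last_byte_acked=1000))
# An ACK for 1000 more bytes advances the trailing edge and frees that much space.
print(usable_window(8192, 4096, last_byte_sent=3000, last_byte_acked=2000))
```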
The size of TCP buffers in each host is a critical limitation to performance in WANs. The
protocol is capable of transferring one send window of data per round-trip interval. For
example, with a send window of 4096 bytes and a transmission path with an RTT of 600
ms, a TCP session is capable of sustaining a maximum transfer rate of roughly 55 Kbps
(4096 bytes × 8 bits ÷ 0.6 s), regardless of the bandwidth of the network path. Maximum
efficiency of the transfer is obtained only if the sender is capable of completely filling the
network path with data. Because the sender will have an amount of data in forward
transit and an equivalent amount of data awaiting reception of an ACK signal, both the
sender's buffer and the receiver's advertised window should be no smaller than the
Delay-Bandwidth Product of the network path. That is:
Window size ≥ Bandwidth (bytes/sec) × Round-trip time (sec)
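The relationship between window size, round-trip time, and achievable rate can be checked numerically. The two helper functions below are hypothetical names introduced for this sketch; the 4096-byte window and 600 ms RTT are the figures from the example above, and the 10 Mbps path is an assumed value.

```python
def max_throughput_bps(window_bytes, rtt_seconds):
    """Upper bound on throughput: one window of data per round trip, in bits/sec."""
    return window_bytes * 8 / rtt_seconds


def bandwidth_delay_product(bandwidth_bps, rtt_seconds):
    """Bytes needed in flight to keep a path of the given bandwidth full."""
    return bandwidth_bps / 8 * rtt_seconds


# 4096-byte window over a 600 ms path: about 54.6 kbit/s whatever the link speed.
print(round(max_throughput_bps(4096, 0.6)))
# The largest unscaled window (65,535 bytes) over the same path.
print(round(max_throughput_bps(65535, 0.6)))
# Window needed to fill an assumed 10 Mbps path with a 600 ms RTT.
print(bandwidth_delay_product(10_000_000, 0.6))
```

On the assumed 10 Mbps path, the required window of about 750,000 bytes is far larger than the 65,535 bytes that an unscaled window field can advertise.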
The 16-bit field within the TCP header can contain values up to 65,535, imposing an
upper limit on the available window size of 65,535 bytes. This imposes an upper limit on
TCP performance of some 64 KB per RTT, even when both end systems have arbitrarily
large send and receive buffers. This limit can be modified by the use of a window-scale
option, described in RFC 1323, effectively increasing the size of the window to a 30-bit
field, but transmitting only the most significant 16 bits of the value. This allows the
sender and receiver to use buffer sizes that can operate efficiently at speeds that
encompass most of the current very-high-speed network transmission technologies across
distances of the scale of the terrestrial intercontinental cable systems.
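Window scaling amounts to a left shift of the 16-bit window field. The sketch below assumes the limit of 14 on the shift count from RFC 1323; the function name is hypothetical.

```python
MAX_SHIFT = 14  # largest shift count permitted by RFC 1323


def effective_window(window_field, shift):
    """Window in bytes given the 16-bit header field and the negotiated shift."""
    if not 0 <= window_field <= 0xFFFF:
        raise ValueError("window field must fit in 16 bits")
    if not 0 <= shift <= MAX_SHIFT:
        raise ValueError("shift count must be between 0 and 14")
    return window_field << shift  # each shift step doubles the window


print(effective_window(65535, 0))   # unscaled limit
print(effective_window(65535, 14))  # largest scaled window, just under 2**30
```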
Although the maximum window size and the RTT together determine the maximum
achievable data-transfer rate, there is an additional element of flow control required for
TCP. If a TCP session commenced by injecting a full window of data into the network,
then there is a strong probability that much of the initial burst of data would be lost
because of transient congestion, particularly if a large window is being used. Instead,
TCP adopts a more conservative approach by starting with a modest amount of data that
has a high probability of successful transmission, and then probing the network with
increasing amounts of data for as long as the network does not show signs of congestion.
When congestion is experienced, the sending rate is dropped and the probing for
additional capacity is resumed.
The dynamic operation of the window is a critical component of TCP performance for
volume transfer. The mechanics of the protocol involve an additional overriding modifier
of the sender's window, the congestion window, referred to as cwnd. The objective of
the window-management algorithm is to start transmitting at a rate that has a very low
probability of packet loss, then to increase the rate (by increasing the cwnd size) until the
sender receives an indication, through the detection of packet loss, that the rate has
exceeded the available capacity of the network. The sender then immediately halves its
sending rate by reducing the value of cwnd, and resumes a gradual increase of the
sending rate. The goal is to continually modify the sending rate such that it oscillates
around the true value of available network capacity. This oscillation enables a dynamic
adjustment that automatically senses any increase or decrease in available capacity
through the lifetime of the data flow.
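The oscillation around the available capacity can be illustrated with a toy model. Everything here is a simplifying assumption made for the sketch: cwnd is counted in whole segments, the network is reduced to a single fixed capacity, loss is detected exactly when cwnd exceeds that capacity, and growth is one segment per round trip.

```python
def aimd_trace(capacity_segments, rounds, cwnd=1):
    """Return the cwnd value (in segments) used in each round trip."""
    trace = []
    for _ in range(rounds):
        trace.append(cwnd)
        if cwnd > capacity_segments:
            cwnd = max(1, cwnd // 2)  # loss detected: halve the sending rate
        else:
            cwnd += 1                 # no loss: gradual increase
    return trace


print(aimd_trace(capacity_segments=8, rounds=12, cwnd=6))
# [6, 7, 8, 9, 4, 5, 6, 7, 8, 9, 4, 5]
```

The trace climbs past the capacity of 8 segments, drops to half, and climbs again, which is the sawtooth the paragraph describes.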
The intended outcome is that of a dynamically adjusting cooperative data flow, where a
combination of such flows behaves fairly, in that each flow obtains essentially a fair
share of the network, and so that close to maximal use of available network resources is
made. This flow-control functionality is achieved through a combination of cwnd value
management and packet-loss and retransmission algorithms. TCP flow control has three
major parts: the flow-control modes of Slow Start and Congestion Avoidance, and the
response to packet loss that determines how TCP switches between these two modes of
operation.
TCP Slow Start
The starting value of the cwnd window (the Initial Window, or IW) is set to that of the
Sender Maximum Segment Size (SMSS) value. This SMSS value is based on the
receiver's maximum segment size, obtained during the SYN handshake, the discovered
path MTU (if used), the MTU of the sending interface, or, in the absence of other
information, 536 bytes. The sender then enters a flow-control mode termed Slow Start.
The sender sends a single data segment, and because the window is now full, it then
awaits the corresponding ACK. When the ACK is received, the sender increases its
window by increasing the value of cwnd by the value of SMSS. This then allows the
sender to transmit two segments; at that point, the congestion window is again full, and
the sender must await the corresponding ACKs for these segments. This algorithm
continues by increasing the value of cwnd (and, correspondingly, opening the size of the
congestion window) by one SMSS for every ACK received that acknowledges new data.
If the receiver is sending an ACK for every packet, the effect of this algorithm is that the
data rate of the sender doubles every round-trip time interval. If the receiver supports
delayed ACKs, the rate of increase will be slightly lower, but nevertheless the rate will
increase by a minimum of one SMSS each round-trip time. Obviously, this cannot be
sustained indefinitely. Either the value of cwnd will exceed the advertised receive
window or the sender's window, or the capacity of the network will be exceeded, in
which case packets will be lost.
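The doubling behavior can be traced with a short sketch. It assumes one ACK per segment (no delayed ACKs), no packet loss, and a single limit standing in for the advertised receive window; the function name and the choice of 536 and 65,535 bytes as sample values are for illustration only.

```python
def slow_start_rounds(smss, limit_bytes):
    """Return the cwnd value (in bytes) at the start of each round trip."""
    cwnd = smss
    history = [cwnd]
    while cwnd < limit_bytes:
        acks = cwnd // smss  # one ACK per segment sent in this round trip
        cwnd += acks * smss  # cwnd grows by one SMSS per ACK received
        history.append(min(cwnd, limit_bytes))
    return history


print(slow_start_rounds(smss=536, limit_bytes=65535))
# [536, 1072, 2144, 4288, 8576, 17152, 34304, 65535]
```

Under these assumptions the window reaches the 65,535-byte limit after seven round trips.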
There is another limit to the slow-start rate increase, maintained in a variable termed
ssthresh, or Slow-Start Threshold. If the value of cwnd increases past the value of
ssthresh, the TCP flow-control mode is changed from Slow Start to congestion avoidance.
Initially the value of ssthresh is set to the receiver's maximum window size. However,
when congestion is noted, ssthresh is set to half the current window size, providing TCP
with a memory of the point where the onset of network congestion may be anticipated in
future.
One aspect to highlight concerns the interaction of the slow-start algorithm with
high-capacity long-delay networks, the so-called Long Fat Networks (or LFNs, pronounced
"elephants"). The behavior of the slow-start algorithm is to send a single packet, await an
ACK, then send two packets, and await the corresponding ACKs, and so on. The TCP
activity on LFNs tends to cluster at each epoch of the round-trip time, with a quiet period
that follows after the available window of data has been transmitted. The received ACKs
arrive back at the sender with an inter-ACK spacing that is equivalent to the data rate of
the bottleneck point on the network path. During Slow Start, the sender transmits at a
rate equal to twice this bottleneck rate. The rate adaptation function that must occur
within the network takes place in the router at the entrance to the bottleneck point. The
sender's packets arrive at this router at twice the rate of egress from the router, and the
router stores the overflow within its internal buffer. When this buffer overflows, packets
will be dropped, and the slow-start phase is over. The important conclusion is that the
sender will stop increasing its data rate when there is buffer exhaustion, a condition that
may not be the same as reaching the true available data rate. If the router has a buffer
capacity considerably less than the delay-bandwidth product of the egress circuit, the two
values are certainly not the same.
In this case, the TCP slow-start algorithm will finish with a sending rate that is well
below the actual available capacity. The efficient operation of TCP, particularly in LFNs,
is critically reliant on adequately large buffers within the network routers.
Another aspect of Slow Start is the choice of a single segment as the initial sending
window. Experimentation indicates that an initial value of up to four segments can allow
for a more efficient session startup, particularly for those short-duration TCP sessions so
prevalent with Web fetches [6]. Observation of Web traffic indicates an average Web
data transfer of 17 segments. A slow start from one segment will take five RTT intervals
to transfer this data, while using an initial value of four will reduce the transfer time to
three RTT intervals. However, four segments may be too many when using low-speed
links with limited buffers, so a more robust approach is to use an initial value of no more
than two segments to commence Slow Start [7].
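The RTT counts quoted above can be checked with a few lines of arithmetic. The sketch below assumes loss-free slow start with one ACK per segment, so the window doubles every round trip; the function name is hypothetical.

```python
def rtts_to_transfer(total_segments, initial_window):
    """Count the RTT intervals needed to send total_segments in loss-free slow start."""
    sent, window, rtts = 0, initial_window, 0
    while sent < total_segments:
        sent += window    # send one full window during this RTT
        window *= 2       # one ACK per segment doubles the window
        rtts += 1
    return rtts


print(rtts_to_transfer(17, 1))  # 5
print(rtts_to_transfer(17, 4))  # 3
```

Under the same assumptions, an initial window of two segments needs four RTT intervals for the 17-segment transfer.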
Packet Loss
Slow Start attempts to start a TCP session at a rate the network can support and then
continually increase the rate. How does TCP know when to stop this increase? This
slow-start rate increase stops when the congestion window exceeds the receiver's advertised
window, when the rate exceeds the remembered value of the onset of congestion as
recorded in ssthresh, or when the rate is greater than the network can sustain. Addressing
the last condition, how does a TCP sender know that it is sending at a rate greater than
the network can sustain? The answer is that this is shown by data packets being dropped
by the network. In this case, TCP has to undertake many functions:
The packet loss has to be detected by the sender.
The missing data has to be retransmitted.
The sending data rate should be adjusted to reduce the probability of further
packet loss.
TCP can detect packet loss in two ways. First, if a single packet is lost within a sequence
of packets, the successful delivery of the packets following the lost packet will cause the
receiver to generate a duplicate ACK for each successive packet. The reception of these
duplicate ACKs is a signal of such packet loss. Second, if a packet is lost at the end of a
sequence of sent packets, there are no following packets to generate duplicate ACKs. In
this case, there are no corresponding ACKs for this packet, and the sender's retransmit
timer will expire and the sender will assume packet loss.
A single duplicate ACK is not a reliable signal of packet loss. When a TCP receiver gets
a data packet with an out-of-order TCP sequence value, the receiver must generate an
immediate ACK of the highest in-order data byte received. This will be a duplicate of an
earlier transmitted ACK. Where a single packet is lost from a sequence of packets, all
subsequent packets will generate a duplicate ACK packet.
On the other hand, where a packet is rerouted with an additional incremental delay, the
reordering of the packet stream at the receiver's end will generate a small number of
duplicate ACKs, followed by an ACK of the entire data sequence, after the errant packet
is received. The sender distinguishes between these cases by using three duplicate ACK
packets as a signal of packet loss.
The third duplicate ACK triggers the sender to immediately send the segment referenced
by the duplicate ACK value (fast retransmit) and commence a sequence termed Fast
Recovery. In fast recovery, the value of ssthresh is set to half the current send window
size (the send window is the amount of unacknowledged data outstanding). The
congestion window, cwnd, is set three segments greater than ssthresh to allow for three
segments already buffered at the receiver. If this allows additional data to be sent, then
this is done. Each additional duplicate ACK inflates cwnd by a further segment size,
allowing more data to be sent. When an ACK arrives that encompasses new data, the
value of cwnd is set back to ssthresh, and TCP enters congestion-avoidance mode. Fast
Recovery is intended to rapidly repair single packet loss, allowing the sender to continue
to maintain the ACK-clocked data rate for new data while the packet loss repair is being
undertaken. This is because there is still a sequence of ACKs arriving at the sender, so
that the network is continuing to pass timing signals to the sender indicating the rate at
which packets are arriving at the receiver. Only when the repair has been completed does
the sender drop its window to the ssthresh value as part of the transition to
congestion-avoidance mode [8].
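The fast retransmit and fast recovery steps above can be sketched as a small state machine. This is an illustrative toy, not a real TCP stack: the class and method names are hypothetical, values are in bytes, and the lower bound of two segments on ssthresh is an added assumption taken from RFC 5681 rather than from the text.

```python
class RenoSender:
    """Toy sender that tracks cwnd and ssthresh (bytes) through fast recovery."""

    def __init__(self, smss, cwnd, ssthresh):
        self.smss = smss
        self.cwnd = cwnd
        self.ssthresh = ssthresh
        self.dup_acks = 0
        self.in_fast_recovery = False
        self.retransmits = 0

    def on_duplicate_ack(self, flight_size):
        """flight_size is the amount of unacknowledged data outstanding."""
        self.dup_acks += 1
        if self.in_fast_recovery:
            # Each further duplicate ACK inflates cwnd by one segment.
            self.cwnd += self.smss
        elif self.dup_acks == 3:
            # Third duplicate: fast retransmit, then enter fast recovery.
            self.retransmits += 1
            # Assumption: floor of two segments, as in RFC 5681.
            self.ssthresh = max(flight_size // 2, 2 * self.smss)
            self.cwnd = self.ssthresh + 3 * self.smss
            self.in_fast_recovery = True

    def on_new_ack(self):
        """An ACK covering new data ends fast recovery and deflates cwnd."""
        if self.in_fast_recovery:
            self.cwnd = self.ssthresh
            self.in_fast_recovery = False
        self.dup_acks = 0
```

For example, a sender with 16,000 bytes outstanding and 1,000-byte segments that sees three duplicate ACKs would set ssthresh to 8,000 and cwnd to 11,000, then fall back to a cwnd of 8,000 when the repair is acknowledged.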
The other signal of packet loss is a complete cessation of any ACK packets arriving to the
sender. The sender cannot wait indefinitely for a delayed ACK, but must make the
assumption at some point in time that the next unacknowledged data segment must be
retransmitted. This is managed by the sender maintaining a Retransmission Timer. The
maintenance of this timer has performance and efficiency implications. If the timer
triggers too early, the sender will push duplicate data into the network unnecessarily. If
the timer triggers too slowly, the sender will remain idle for too long, unnecessarily
slowing down the flow of data. The TCP sender uses a timer to measure the elapsed time
between sending a data segment and receiving the corresponding acknowledgment.
Individual measurements of this time interval will exhibit significant variance, and
implementations of TCP use a smoothing function when updating the retransmission
timer of the flow with each measurement. The commonly used algorithm was originally
described by Van Jacobson [9], modified so that the retransmission timer is set to the
smoothed round-trip-time value, plus four times a smoothed mean deviation factor [10].
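A minimal sketch of such a smoothing function follows. The text gives only the final rule (smoothed RTT plus four times the smoothed mean deviation); the gains of 1/8 and 1/4, the initialization from the first sample, and the one-second floor are assumptions taken from RFC 6298, and the class name is hypothetical.

```python
class RtoEstimator:
    """Smoothed RTT estimator: RTO = SRTT + 4 * RTTVAR (times in seconds)."""

    def __init__(self, alpha=1 / 8, beta=1 / 4, min_rto=1.0):
        # alpha, beta and min_rto are assumed defaults (RFC 6298), not from the text.
        self.alpha = alpha
        self.beta = beta
        self.min_rto = min_rto
        self.srtt = None
        self.rttvar = None

    def update(self, sample):
        """Fold one RTT measurement into the estimate and return the new RTO."""
        if self.srtt is None:
            # First measurement seeds both the mean and the deviation.
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            # Update the deviation first, using the previous smoothed RTT.
            self.rttvar = (1 - self.beta) * self.rttvar + self.beta * abs(self.srtt - sample)
            self.srtt = (1 - self.alpha) * self.srtt + self.alpha * sample
        return self.rto()

    def rto(self):
        return max(self.min_rto, self.srtt + 4 * self.rttvar)
```

The deviation term is what keeps the timer from firing too early on paths with variable delay: a burst of uneven samples widens the margin above the smoothed RTT.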
When the retransmission timer expires, the actions are similar to that of duplicate ACK
packets, in that the sender must reduce its sending rate in response to congestion. The
threshold value, ssthresh, is set to half of the current value of outstanding
unacknowledged data, as in the duplicate ACK case. However, the sender cannot make
any valid assumptions about the current state of the network, given that no useful
information has been provided to the sender for more than one RTT interval. In this case,
the sender closes the congestion window back to one segment, and restarts the flow in
slow-start mode by sending a single segment. The difference from the initial slow start is
that, in this case, the ssthresh value is set so that the sender will probe the congestion area
more slowly using a linear sending rate increase when the congestion window reaches the
remembered ssthresh value.
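The restart after a timeout can be sketched as a per-RTT trace of the congestion window. This is a simplified model in whole segments, assuming no further loss; capping the slow-start doubling at ssthresh and the two-segment floor on ssthresh are simplifying assumptions, and the function name is hypothetical.

```python
def cwnd_after_timeout(flight_segments, rtts):
    """Per-RTT cwnd (in segments) following a retransmission timeout, loss-free."""
    ssthresh = max(flight_segments // 2, 2)   # half the outstanding data
    cwnd = 1                                  # window collapses to one segment
    trace = []
    for _ in range(rtts):
        trace.append(cwnd)
        if cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)    # slow start: exponential growth
        else:
            cwnd += 1                         # beyond ssthresh: one segment per RTT
    return trace


print(cwnd_after_timeout(16, 7))  # [1, 2, 4, 8, 9, 10, 11]
```

The trace shows the change of slope at the remembered ssthresh value: doubling up to 8 segments, then a linear increase.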
Congestion Avoidance
Compared to Slow Start, congestion avoidance is a more tentative probing of the
network to discover the point of threshold of packet loss. Where Slow Start uses an
exponential increase in the sending rate to find a first-level approximation of the loss
threshold, congestion avoidance uses a linear growth function.
When the value of cwnd is greater than ssthresh, the sender increments the value of
cwnd by the value SMSS * SMSS / cwnd, in response to each received nonduplicate ACK
[7], ensuring that the congestion window opens by one segment within each RTT time
interval.
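A short sketch shows why the per-ACK increment adds up to roughly one segment per RTT. It assumes one ACK for each full segment in flight and no loss; the function name is illustrative.

```python
def congestion_avoidance_rtt(cwnd, smss):
    """Apply the per-ACK increment for one RTT of ACKs; return the new cwnd in bytes."""
    acks = int(cwnd // smss)            # one ACK per full segment in flight
    for _ in range(acks):
        cwnd += smss * smss / cwnd      # SMSS * SMSS / cwnd per nonduplicate ACK
    return cwnd


new_cwnd = congestion_avoidance_rtt(10_000, 1_000)
print(round(new_cwnd - 10_000))         # growth over one RTT, just under one SMSS
```

Because cwnd grows slightly with each ACK, the ten increments here are each a little under 100 bytes, so the total growth is just under one 1,000-byte segment.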
The congestion window continues to open in this fashion until packet loss occurs. If the
packet loss is isolated to a single packet within a packet sequence, the resultant duplicate
ACKs will trigger the sender to halve the sending rate and continue a linear growth of the
congestion window from this new point, as described above in fast recovery.
The behavior of cwnd in an idealized configuration is shown in Figure 8, along with the
corresponding data-flow rates. The overall characteristics of the TCP algorithm are an
initial relatively fast scan of the network capacity to establish the approximate bounds of
maximal efficiency, followed by a cyclic mode of adaptive behavior that reacts quickly to
congestion, and then slowly increases the sending rate across the area of maximal transfer
efficiency.
Figure 8: Simulation of Single TCP Transfer
Packet loss, as signaled by the triggering of the retransmission timer, causes the sender to
recommence slow-start mode, following a timeout interval. The corresponding data-flow
rates are indicated in Figure 9.
Figure 9: Simulation of TCP Transfer with Tail Drop Queue
The inefficiency of this mode of performance is caused by the complete cessation of any
form of flow signaling from the receiver to the sender. In the absence of any information,
the sender can only assume that the network is heavily congested, and so must restart its
probing of the network capacity with an initial congestion window of a single segment.
This leads to the performance observation that any form of packet-drop management that
tends to discard the trailing end of a sequence of data packets may cause significant TCP
performance degradation, because such drop behavior forces the TCP session to
continually time out and restart the flow from a single segment again.
Assisting TCP Performance in the Network: RED and ECN
Although TCP is an end-to-end protocol, it is possible for the network to assist TCP in
optimizing performance. One approach is to alter the queue behavior of the network
through the use of Random Early Detection (RED). RED permits a network router to
discard a packet even when there is additional space in the queue. Although this may
sound inefficient, the interaction between this early packet-drop behavior and TCP is
very effective.
RED uses the weighted average queue length as the probability factor for packet drop.
As the average queue length increases, the probability of a packet being dropped, rather
than being queued, increases. As the queue length decreases, so does the packet-drop
probability. (See Figure 10.) Small packet bursts can pass through a RED filter relatively
intact, while larger packet bursts will experience increasingly higher packet-discard rates.
Sustained load will further increase the packet-discard rates. This implies that the TCP
sessions with the largest open windows will have a higher probability of experiencing
packet drop, causing a back-off in the window size.
Figure 10: RED Behavior
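The relationship between average queue length and drop probability can be sketched as follows. The text does not give thresholds or weights, so the minimum and maximum thresholds, the maximum probability, and the averaging weight below are assumed illustrative values; the class is a hypothetical toy that omits refinements found in real RED implementations.

```python
import random


class RedQueue:
    """Toy RED gate: drop probability rises with the weighted average queue length."""

    def __init__(self, min_th=5, max_th=15, max_p=0.1, weight=0.002, rng=None):
        # All four parameters are assumed example values, in packets.
        self.min_th = min_th
        self.max_th = max_th
        self.max_p = max_p
        self.weight = weight
        self.avg = 0.0
        self.rng = rng or random.Random()

    def drop_probability(self):
        if self.avg < self.min_th:
            return 0.0                  # short average queue: never drop
        if self.avg >= self.max_th:
            return 1.0                  # long average queue: always drop
        # Linear ramp between the two thresholds.
        return self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)

    def on_arrival(self, queue_len):
        """Update the weighted average with the current length; return True to drop."""
        self.avg = (1 - self.weight) * self.avg + self.weight * queue_len
        return self.rng.random() < self.drop_probability()
```

Because the decision is driven by the slowly moving average rather than the instantaneous length, a short burst barely moves the drop probability, while sustained load pushes it up the ramp.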
A major goal of RED is to avoid a situation in which all TCP flows experience
congestion at the same time, all then back off and resume at the same rate, and tend to
synchronize their behavior [11,12]. With RED, the larger bursting flows experience a
higher probability of packet drop, while flows with smaller burst rates can continue
without undue impact. RED is also intended to reduce the incidence of complete loss of
ACK signals, leading to timeout and session restart in slow-start mode. The intent is to
signal the heaviest bursting TCP sessions the likelihood of pending queue saturation and
tail drop before the onset of such a tail-drop congestion condition, allowing the TCP
session to undertake a fast retransmit recovery under conditions of congestion avoidance.
Another objective of RED is to allow the queue to operate efficiently, with the queue
depth ranging across the entire queue size within a timescale of queue depth oscillation
the same order as the average RTT of the traffic flows.
Behind RED is the observation that TCP sets very few assumptions about the networks
over which it must operate, and that it cannot count on any consistent performance
feedback signal being generated by the network. As a minimal approach, TCP uses
packet loss as its performance signal, interpreting small-scale packet-loss events as peak
load congestion events and extended packet-loss events as a sign of more critical
congestion load. RED attempts to increase the number of small-scale congestion signals,
and in so doing avoid long-period sustained congestion conditions.
It is not necessary for RED to discard the randomly selected packet. The intent of RED is
to signal the sender that there is the potential for queue exhaustion, and that the sender
should adapt to this condition. An alternative mechanism is for the router experiencing
the load to mark packets with an explicit Congestion Experienced (CE) bit flag, on the
assumption that the sender will see and react to this flag setting in a manner comparable
to its response to single packet drop [13] [14]. This mechanism, Explicit Congestion
Notification (ECN), uses a 2-bit scheme, claiming bits 6 and 7 of the IP Version 4
Type-of-Service (ToS) field (or the two Currently Unused [CU] bits of the IP Differentiated
Services field). Bit 6 is set by the sender to indicate that it is an ECN-capable transport
system (the ECT bit). Bit 7 is the CE bit, and is set by a router when the average queue
length exceeds configured threshold levels. The ECN algorithm is that an active router
will perform RED, as described. After a packet has been selected, the router may mark
the CE bit of the packet if the ECT bit is set; otherwise, it will discard the selected packet.
(See Figure 11.)
Figure 11: Operation of Explicit Congestion Notification
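The router-side decision can be sketched in a few lines. The sketch follows the 2-bit scheme as described in the text, and assumes the usual IP header convention in which bit 0 is the most significant bit of the octet, so bits 6 and 7 are the two low-order bits. The function name and the action strings are hypothetical.

```python
ECT = 0x02  # bit 6 of the ToS octet (bit 0 taken as the most significant bit)
CE = 0x01   # bit 7 of the ToS octet


def red_ecn_action(tos, selected_by_red):
    """Return (action, tos) for one packet; action is 'forward', 'mark' or 'drop'."""
    if not selected_by_red:
        return "forward", tos        # RED did not pick this packet
    if tos & ECT:
        return "mark", tos | CE      # ECN-capable transport: set CE instead of dropping
    return "drop", tos               # not ECN-capable: fall back to RED discard


print(red_ecn_action(0x02, True))    # ('mark', 3)
print(red_ecn_action(0x00, True))    # ('drop', 0)
```

The point of the design is visible in the two outcomes: the same RED selection produces either a congestion signal or a loss, depending only on whether the sender declared itself ECN-capable.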
The TCP interaction is slightly more involved. The initial TCP SYN handshake includes
the addition of ECN-echo capability and Congestion Window Reduced (CWR) capability
flags to allow each system to negotiate with its peer as to whether it will properly handle
packets with the CE bit set during the data transfer. The sender sets the ECT bit in all
packets sent. If the sender receives a TCP packet with the ECN-echo flag set in the TCP
header, the sender will adjust its congestion window as if it had undergone fast recovery
from a single lost packet.
The next sent packet will set the TCP CWR flag, to indicate to the receiver that it has
reacted to the congestion. The additional caveat is that the sender will react in this way at
most once every RTT interval. Further, TCP packets with the ECN-echo flag set will
have no further effect on the sender within the same RTT interval. The receiver will set
the ECN-echo flag in all packets when it receives a packet with the CE bit set. This will
continue until it receives a packet with the CWR bit set, indicating that the sender has
reacted to the congestion. The ECT flag is set only in packets that contain a data payload.
TCP ACK packets that contain no data payload should be sent with the ECT bit clear.
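The end-system behavior described in the last two paragraphs can be sketched as two small classes. This is an illustrative model only: the class and attribute names are hypothetical, the window is counted in whole segments, and "adjust as if a single packet had been lost" is modeled simply as halving cwnd.

```python
class EcnReceiver:
    """Echoes congestion back to the sender until the sender confirms with CWR."""

    def __init__(self):
        self.echoing = False

    def on_data(self, ce, cwr):
        """Process one arriving data packet; return the ECN-echo flag for its ACK."""
        if cwr:
            self.echoing = False     # sender has reacted: stop echoing
        if ce:
            self.echoing = True      # congestion experienced: echo until CWR arrives
        return self.echoing


class EcnSender:
    """Reacts to ECN-echo at most once per RTT (times in seconds)."""

    def __init__(self, cwnd, rtt):
        self.cwnd = cwnd
        self.rtt = rtt
        self.last_reaction = None
        self.send_cwr = False

    def on_ack(self, ecn_echo, now):
        if not ecn_echo:
            return
        if self.last_reaction is not None and now - self.last_reaction < self.rtt:
            return                   # already reacted within this RTT interval
        self.cwnd = max(self.cwnd // 2, 1)
        self.last_reaction = now
        self.send_cwr = True         # the next data packet carries the CWR flag
```

The once-per-RTT guard matters because the receiver keeps echoing on every ACK until it sees CWR; without the guard, one congestion mark would halve the window several times.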
The connection does not have to await the reception of three duplicate ACKs to detect the
congestion condition. Instead, the receiver is notified of the incipient congestion
condition through the explicit setting of a notification bit, which is in turn echoed back to
the sender in the corresponding ACK. Simulations of ECN using a RED marking
function indicate slightly superior throughput in comparison to configuring RED as a
packet-discard function.
However, widespread deployment of ECN is not considered likely in the near future, at
least in the context of Version 4 of IP. At this stage, there has been no explicit
standardization of the field within the IPv4 header to carry this information, and the
deployment base of IP is now so wide that any modifications to the semantics of fields in
the IPv4 header would need to be very carefully considered to ensure that the changed
field interpretation did not exercise some malformed behavior in older versions of the
TCP stack or in older router software implementations.
ECN provides some level of performance improvement over a packet-drop RED scheme.
With large bulk data transfers, the improvement is moderate, based on the difference
between the packet retransmission and congestion-window adjustment of RED and the
congestion-window adjustment of ECN. The most notable improvements indicated in
ECN simulation experiments occur with short TCP transactions (commonly seen in Web
transactions), where a RED packet drop of the initial data packet may cause a six-second
retransmit delay. Comparatively, the ECN approach allows the transfer to proceed
without this lengthy delay.
The major issue with ECN is the need to change the operation of both the routers and the
TCP software stacks to accommodate the operation of ECN. While the ECN proposal is
carefully constructed to allow an essentially uncoordinated introduction into the Internet
without negative side effects, the effectiveness of ECN in improving overall network
throughput will be apparent only after this approach has been widely adopted. As the
Internet grows, its inertial mass generates a natural resistance to further technological
change; therefore, it may be some years before ECN is widely adopted in both host
software and Internet routing systems. RED, on the other hand, has had a more rapid
introduction to the Internet, because it requires only a local modification to router
behavior, and relies on existing TCP behavior to react to the packet drop.
Tuning TCP
How can the host optimize its TCP stack for optimum performance? Many
recommendations can be considered. The following suggestions are a combination of
those measures that have been well studied and are known to improve TCP performance,
and those that appear to be highly productive areas of further research and investigation [1].
Use a good TCP protocol stack: Many of the performance pathologies that exist
in the network today are not necessarily the byproduct of oversubscribed
networks and consequent congestion. Many of these performance pathologies
exist because of poor implementations of TCP flow-control algorithms;
inadequate buffers within the receiver; poor (or no) use of path-MTU discovery;
no support for fast-retransmit flow recovery; no use of window scaling and
SACK; imprecise use of protocol-required timers; and very coarse-grained timers.
It is unclear whether network ingress-imposed Quality-of-Service (QoS)
structures will adequately compensate for such implementation deficiencies. The
conclusion is that attempting to address the symptoms is not the same as curing
the disease. A good protocol stack can produce even better results in the right
environment.
Implement a TCP Selective Acknowledgment (SACK) mechanism: SACK,
combined with a selective repeat-transmission policy, can help overcome the
limitation that traditional TCP experiences when a sender can learn only about a
single lost packet per RTT.
Implement larger buffers with TCP window-scaling options: The TCP flow
algorithm attempts to work at a data rate that is the minimum of the
delay-bandwidth product of the end-to-end network path and the available buffer space
of the sender. Larger buffers at the sender and the receiver assist the sender in
adapting more efficiently to a wider diversity of network paths by permitting a
larger volume of traffic to be placed in flight across the end-to-end path.
Support TCP ECN negotiation: ECN enables the host to be explicitly informed of
conditions relating to the onset of congestion without having to infer such a
condition from the reverse stream of ACK packets from the receiver. The host can
react to such a condition promptly and effectively with a data flow-control
response without having to invoke packet retransmission.
Use a higher initial TCP slow-start rate than the current 1 MSS (Maximum
Segment Size) per RTT. A size that seems feasible is an initial burst of 2 MSS
segments. The assumption is that there will be adequate queuing capability to
manage this initial packet burst; the provision to back off the send window to 1
MSS segment should remain intact to allow stable operation if the initial choice
was too large for the path. A robust initial choice is two segments, although
simulations have indicated that four initial segments is also highly effective in
many situations.
Use a host platform that has sufficient processor and memory capacity to drive
the network. The highest-quality service network and optimally provisioned
access circuits cannot compensate for a host system that does not have sufficient
capacity to drive the service load. This is a condition that can be observed in large
or very popular public Web servers, where the peak application load on the server
drives the platform into a state of memory and processor exhaustion, even though
the network itself has adequate resources to manage the traffic load.
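The suggestion above about larger buffers and window scaling can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the link speed and RTT in the example are assumed figures. The 65,535-byte value is the largest window a 16-bit TCP window field can advertise without the window-scaling option.

```python
def bdp_bytes(bandwidth_bps, rtt_seconds):
    """Delay-bandwidth product: bytes that must be in flight to fill the path."""
    return bandwidth_bps * rtt_seconds / 8


def max_throughput_bps(window_bytes, rtt_seconds):
    """Upper bound on throughput when at most window_bytes can be sent per RTT."""
    return window_bytes * 8 / rtt_seconds


# Assumed example path: 100 Mbps with a 100 ms round-trip time.
needed = bdp_bytes(100e6, 0.1)               # about 1.25 million bytes
limit = max_throughput_bps(65_535, 0.1)      # about 5.2 Mbps
print(f"window needed: {needed:.0f} bytes; unscaled window limit: {limit / 1e6:.1f} Mbps")
```

On such a path an unscaled window would cap a single connection at roughly one twentieth of the link rate, which is why buffer size and window scaling appear in the list of tuning measures.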
All these actions have one thing in common: They can be deployed incrementally at the
edge of the network and can be deployed individually. This allows end systems to obtain
superior performance even in the absence of the network provider tuning the network's
service response with various internal QoS mechanisms.
Conclusion
TCP is not a predictive protocol. It is an adaptive protocol that attempts to operate the
network at the point of greatest efficiency. Tuning TCP is not a case of making TCP pass
more packets into the network. Tuning TCP involves recognizing how TCP senses
current network load conditions, working through the inevitable compromise between
making TCP highly sensitive to transient network conditions, and making TCP resilient
to what can be regarded as noise signals.
If the performance of end-to-end TCP is the perceived problem, the most effective
answer is not necessarily to add QoS service differentiation into the network. Often, the
greatest performance improvement can be made by upgrading the way that hosts and the
network interact through the appropriate configuration of the host TCP stacks.
In the next article on this topic, we will examine how TCP is facing new challenges with
increasing use of wireless, short-lived connections, and bandwidth-limited mobile
devices, as well as the continuing effort for improved TCP performance. We'll look at a
number of proposals to change the standard actions of TCP to meet these various
requirements and how they would interact with the existing TCP protocol.