
Review of Internet Protocol Suite

TECHNISCHE UNIVERSITÄT ILMENAU
Integrated Hard- and Software Systems
http://www.tu-ilmenau.de/ihs

Internet Protocol Suite
• Link Layer: Ethernet, PPP, ARP, MAC Addressing
• Network Layer: IP, ICMP, Routing
• Transport Layer: TCP, UDP, Port Numbers, Sockets
• Application Layer: FTP, Telnet & Rlogin, HTTP, RTP

TCP
• Basic Properties
• TCP Segment Format
• Connection Setup and Release
• MTU and MSS
• Cumulative, Delayed and Duplicate Acknowledgements
• Sliding Window Mechanism
• Flow and Error Control

Internet Protocol Suite

TCP/IP = the “Internet protocol suite” = a family of protocols for the “Internet”
Internet guesstimates 2003:
• 800 million users (doubling every two years), 200 million permanent hosts

Standardisation:
• ISOC: Internet Society
• IAB: Internet Architecture Board
  ◦ IETF: Internet Engineering Task Force: http://www.ietf.org
    Standards and other information are published as RFCs (Requests for Comments)
  ◦ IRTF: Internet Research Task Force

Implementations:
• De-facto standard: BSD 4.x implementations (Berkeley Software Distribution)
• Subsequent versions come with new TCP features, e.g.
  4.3 BSD Tahoe (1988): slow start, congestion avoidance, fast retransmit
  4.3 BSD Reno (1990): fast recovery
• Other TCP/IP stacks are derived from BSD
• Implemented mechanisms, default parameter settings, and bugs differ between operating systems (e.g. versions of MS Windows)!
Wireless Internet Andreas Mitschele-Thiel 6-Apr-06 2
TCP/IP Layer Overview

TCP/IP Layer (OSI layer*) | Tasks                                                     | Protocol Examples
Application (7)           | Application specific                                      | Telnet, rlogin, FTP, SMTP, SNMP, ...
Transport (4)             | End-to-end flow of data between application processes     | TCP, UDP
Network (3)               | Routing of packets between hosts                          | IP, ICMP
Link (2)                  | Hardware interface; packet transfer between network nodes | PPP, Ethernet, IEEE 802.x, ARP

* Mapping between TCP/IP and OSI layers is not always exact.



TCP/IP Encapsulation

Example: application data transfer using TCP (header and trailer sizes in bytes)

Application:     [user data]
                 [appl. header | user data]
TCP:             [TCP header (20) | application data]                       = TCP segment
IP:              [IP header (20) | TCP header (20) | application data]      = IP datagram (20...65535 bytes)
Ethernet driver: [eth header (14) | IP header (20) | TCP header (20) | application data | eth trailer (4)] = Ethernet frame

Ethernet payload: 46...1500 bytes
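
The header sizes above can be turned into a small calculation. The following is only a sketch using the sizes shown on this slide (no IP or TCP options); the function name is made up for illustration, and padding short payloads up to 46 bytes is assumed to be how the Ethernet minimum is met.

```python
# Sizes in bytes, taken from the encapsulation example (no IP/TCP options).
ETH_HEADER, ETH_TRAILER = 14, 4
IP_HEADER, TCP_HEADER = 20, 20
ETH_MIN_PAYLOAD, ETH_MAX_PAYLOAD = 46, 1500


def ethernet_frame_size(app_bytes: int) -> int:
    """Size of the Ethernet frame carrying app_bytes of application data."""
    ip_datagram = IP_HEADER + TCP_HEADER + app_bytes
    if ip_datagram > ETH_MAX_PAYLOAD:
        raise ValueError("IP datagram does not fit into one Ethernet frame")
    # Short datagrams are assumed to be padded up to the minimum payload.
    payload = max(ip_datagram, ETH_MIN_PAYLOAD)
    return ETH_HEADER + payload + ETH_TRAILER


if __name__ == "__main__":
    for n in (1, 512, 1460):
        print(n, "application bytes ->", ethernet_frame_size(n), "byte frame")
```

With these numbers, a single byte of application data still costs a 64-byte frame, while 1460 bytes fill the largest frame of 1518 bytes.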



TCP/IP Basics: Link Layer

User Process | User Process | User Process | User Process    Application Layer
TCP | UDP                                                    Transport Layer
ICMP | IP | ...                                              Network Layer
ARP | Hardware Interface | ...                               Link Layer



Link Layer Protocols
Examples:
• Ethernet (encapsulation of higher layer packets is defined in RFC 894)
• PPP: Point-to-Point Protocol for serial lines (RFCs 1332, 1548)

MTU: Maximum Transfer Unit (or Max. Transmission Unit)
• Maximum IP packet size in bytes (e.g. for Ethernet: 1500, X.25 Frame Relay: 576)
Path MTU:
• Smallest MTU of any data link in the path between two hosts
• Used to avoid IP fragmentation
• TCP option: path MTU discovery (RFC 1191)
• Example: modem link (MTU=576) followed by an Ethernet link (MTU=1500) => path MTU=576
Loopback Interface:
• A client application can connect to the corresponding server application on the same host by using the loopback IP address “localhost” = 127.0.0.1
• Implemented at the link layer, i.e. full processing of transport and IP layers
ARP: Address Resolution Protocol (RFC 826)
• Address resolution from 32-bit IP addresses to hardware addresses (e.g. 48-bit)



TCP/IP Basics: Network Layer



IP: Internet Protocol

IP provides routing (forwarding) between hosts:
• Based on 32-bit IP addresses *
• Hop-by-hop using routing tables
Unreliable, connectionless datagram delivery service:
• packet loss, out-of-order delivery, duplication
IP fragmentation: used on any link with MTU < original datagram length:
• Copies the IP header into each fragment and sets flags for re-assembly
• Re-assembly at the receiving host only, never in the network
RFC 791

* Applications use the Domain Name System (DNS) to convert hostnames (e.g. “www.lucent.com”) into IP addresses (135.112.22.95) and vice-versa.
IPv6 uses 128-bit addresses
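
The fragmentation rule can be illustrated with a short calculation. This is a sketch under simplifying assumptions: a fixed 20-byte IP header without options, and a hypothetical helper name. It relies on the fact that the fragment offset field counts units of 8 bytes, so every fragment except the last carries a multiple of 8 data bytes.

```python
def fragment(data_len: int, mtu: int, ip_header: int = 20):
    """Split data_len payload bytes into fragments that fit the given MTU.

    Returns a list of (offset_in_8_byte_units, data_bytes, more_fragments).
    """
    # Largest payload per fragment, rounded down to a multiple of 8 bytes.
    max_data = (mtu - ip_header) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments, offset = [], 0
    while offset < data_len:
        length = min(max_data, data_len - offset)
        more_fragments = offset + length < data_len
        fragments.append((offset // 8, length, more_fragments))
        offset += length
    return fragments


if __name__ == "__main__":
    # A 1500-byte datagram (1480 data bytes) crossing a link with MTU 576.
    for frag in fragment(1480, 576):
        print(frag)
```

For the 1500-byte datagram in the usage example this yields three fragments carrying 552, 552 and 376 data bytes.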



IP Datagram Format

Header layout (32 bits per row, 20 bytes without options):
  4-bit version | 4-bit header length | 8-bit type of service | 16-bit total length (in bytes)
  16-bit identification | 3-bit flags | 13-bit fragment offset
  8-bit time to live | 8-bit protocol | 16-bit IP header checksum
  32-bit source IP address
  32-bit destination IP address
  options (if any)
  data

Field notes:
• Version: IPv4
• Header length: number of 32-bit words
• Type of service: QoS requirements; rarely used and supported
• Total length: IP datagram length in bytes (limit = 65535)
• Identification: unique identifier (counter)
• Flags: (reserved), don’t fragment, more fragments
• Fragment offset: “real” fragment offset / 8
• Time to live: limit on the number of routers (countdown)
• Protocol: higher layer identifier, e.g. ICMP=1, TCP=6, UDP=17
• Header checksum: 16-bit one’s complement sum of the IP header only
  checksum error => discard datagram silently (no ICMP message is generated)
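
The header checksum can be sketched as follows. The function implements the usual 16-bit one's complement sum (the complement of the folded sum is what goes into the header); the sample header, its addresses and field values are made up purely for illustration.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement checksum as used for the IP header."""
    if len(data) % 2:
        data += b"\x00"  # pad to a multiple of 16 bits
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


if __name__ == "__main__":
    # Illustrative 20-byte IPv4 header with the checksum field set to 0:
    # version/IHL, TOS, total length, id, flags/offset, TTL, protocol (UDP=17),
    # checksum, source 192.168.0.1, destination 192.168.0.199.
    header = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 115, 0, 0x4000, 64, 17, 0,
                         bytes([192, 168, 0, 1]), bytes([192, 168, 0, 199]))
    checksum = internet_checksum(header)
    print(hex(checksum))
    # A receiver sums the header including the checksum and expects 0.
    filled = header[:10] + struct.pack("!H", checksum) + header[12:]
    print(internet_checksum(filled))
```

The same routine applies to the UDP and TCP checksums if the pseudo-header, transport header and data are concatenated first.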
ICMP: Internet Control Message Protocol

ICMP packet consists of IP header + ICMP message
Used for queries and to communicate error messages back to the sender, e.g.:
• “IP header bad”
• “echo request” (or reply)
• “host unreachable”
• Mobile IP messages
Messages are used by higher layers, e.g.:
• ping, traceroute, TCP, ... HTTP
RFC 792



TCP/IP Basics: Transport Layer



UDP vs. TCP

UDP: User Datagram Protocol (RFC 768)
• Simple, unreliable, datagram-oriented transport of application data blocks
TCP: Transmission Control Protocol (RFC 793 + others)
• Connection-oriented, reliable byte stream service
• Details: see section on TCP
Port numbers are used for application multiplexing:
• Unique address = IP address + port number = “socket”
• Concept of well-known ports, e.g. TCP port 21 for FTP (RFC 1340)

Popular API for TCP and UDP connections: Socket API
• “Stream sockets” use TCP
• “Datagram sockets” use UDP



UDP Datagram Format

Header layout (32 bits per row, 8 bytes):
  16-bit source port number | 16-bit destination port number
  16-bit UDP length | 16-bit UDP checksum
  data (if any)

Field notes:
• Source and destination port numbers: used for application multiplexing
• UDP length: UDP datagram length in bytes (redundant)
• UDP checksum: optional 16-bit one’s complement sum of UDP pseudo-header (12 bytes of the IP header) + UDP header + data (padded to 16-bit multiple)
  checksum error => discard datagram silently



TCP/IP Basics: Selected Applications



FTP: File Transfer Protocol

File transfer based on TCP
TCP control connection:
• To well-known server port 21
• ASCII commands
TCP data connection
QoS requirements:
• High throughput (optimise TCP bulk data flow)
RFC 959



Telnet and Rlogin

Used for remote login based on TCP
• Rlogin (RFC 1282):
  ◦ Simple protocol designed for UNIX hosts
• Telnet (RFC 854):
  ◦ Any OS
  ◦ Option negotiation
  ◦ More flexible and better performance

Client operation principle:
• Send each keystroke to the server
• Option: TCP’s Nagle algorithm groups multiple bytes into one segment
• Display every response from the server
QoS requirements:
• Low-RTT transport of small packets (optimise TCP interactive data flow)

RTT = round-trip time (sender -> receiver -> sender)


HTTP: Hypertext Transfer Protocol
Transfer of webpages based on TCP:
• Webpage typically consists of an HTML (Hyper Text Markup Language) document + various embedded objects, e.g. pictures
HTTP/1.0:
• Objects are requested and received serially
• For each object, a new TCP connection is established, used and released
• Multiple connections: several TCP connections can be used in parallel
HTTP/1.1: performance improvements by:
• Persistent Connections:
  ◦ TCP connections are not released after each object, but used for the next one
    – avoids TCP connection establishment and termination
    – avoids slow start for each new connection
• Pipelining:
  ◦ Multiple objects can be requested in one packet
  ◦ Requested objects are sent sequentially over one TCP connection

Together with multiple connections (HTTP/1.0 feature), these options result in significant performance improvements
RTP: Real-time Transport Protocol

Transfer of real-time data based on UDP
RTP:
• for media with real-time characteristics (audio/video)
• services: payload type specification, sequence numbering, timestamping, source identification & synchronization, delivery monitoring
• no guaranteed quality of service (QoS)

RTCP (Real-time Transport Control Protocol):
• QoS monitoring & periodic feedback:
  ◦ Sender report (synchronisation, expected rates, distance)
  ◦ Receiver report (loss ratios, jitter)

Network independent: on top of unreliable, low-delay transport service
RFC 1889
ITU-T H.225.0 Annex A => H.323 => e.g. MS Netmeeting, VoIP



Summary: Internet Protocol Suite
The TCP/IP protocol suite is a heterogeneous family of protocols for the global Internet
At the center and always used: IP
• Routing between hosts
Application data transport by
• UDP: unreliable datagram service
• TCP: reliable byte-stream service

TCP/IP stack is part of each operating system:
• Numerous different implementations and bugs exist

TCP performance is extremely important!
• TCP carries 62% of the flows, 85% of the packets, and 96% of the bytes of Internet traffic
  (http://www.cs.columbia.edu/~hgs/internet/traffic.html)
• TCP’s complex error control mechanisms are designed for wired networks
  => special problems for wireless transport
TCP (Transmission Control Protocol)
Properties
Connection-oriented, reliable byte-stream service:
• Reliability by ARQ (Automatic Repeat reQuest):
  ◦ TCP receiver sends acknowledgements (acks) back to TCP sender to confirm delivery of received data
  ◦ Cumulative, positive acks for all contiguously received data
  ◦ Timeout-based retransmission of segments
• TCP transfers a byte stream:
  ◦ Segmentation into TCP segments, based on MTU
  ◦ Header contains byte sequence numbers

Congestion avoidance + flow control mechanism

In the following examples:
• Packet sequence numbers (instead of byte sequence numbers)
• ack i acknowledges receipt of packets through packet i (instead of bytes)



TCP Segment Format
Header layout (32 bits per row, 20 bytes without options):
  16-bit source port number | 16-bit destination port number
  32-bit sequence number
  32-bit acknowledgment number
  4-bit header length | 6 bits reserved | 6-bit flags | 16-bit window size
  16-bit TCP checksum | 16-bit urgent pointer
  options (if any)
  data (if any)

Field notes:
• Sequence number: identifies the number of the first data byte in this segment within the byte stream
• Acknowledgment number: ack for the reverse link: next sequence number that is expected to be received
• Header length: number of 32-bit words
• Flags: URG, ACK, PSH, RST, SYN, FIN
• Window size: advertised window size: number of bytes the receiver is willing to accept
• TCP checksum: 16-bit one’s complement sum of TCP pseudo-header (12 bytes of the IP header) + TCP header + data (padded to 16-bit multiple)
  checksum error => discard segment silently!
  => using an erroneous header is dangerous; loss will be detected by other mechanisms

TCP is full duplex: each segment contains an ack for the reverse link
A “pure” ack is a segment with empty data
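
To make the field layout concrete, here is a minimal sketch that unpacks the fixed 20-byte TCP header with Python's `struct` module. The class and function names are invented for this example, options are ignored, and the sample segment is fabricated.

```python
import struct
from typing import List, NamedTuple

# Flag names ordered from the least significant bit upwards.
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]


class TcpHeader(NamedTuple):
    src_port: int
    dst_port: int
    seq: int
    ack: int
    header_len_bytes: int
    flags: List[str]
    window: int
    checksum: int
    urgent: int


def parse_tcp_header(segment: bytes) -> TcpHeader:
    """Decode the fixed part of a TCP header (options are not parsed)."""
    (src, dst, seq, ack, len_flags, window, checksum,
     urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (len_flags >> 12) * 4  # 4-bit length counts 32-bit words
    flags = [name for bit, name in enumerate(FLAG_NAMES)
             if len_flags & (1 << bit)]
    return TcpHeader(src, dst, seq, ack, header_len, flags,
                     window, checksum, urgent)


if __name__ == "__main__":
    # Fabricated SYN segment from port 49152 to the FTP control port 21.
    syn = struct.pack("!HHIIHHHH", 49152, 21, 1000, 0, (5 << 12) | 0x02,
                      4096, 0, 0)
    print(parse_tcp_header(syn))
```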



TCP Connection Establishment and Termination

Three-way handshake (RFC 793); client: active open, server: passive open
  Segment 1 (client -> server): SYN + ISN* + options, e.g. MSS
  Segment 2 (server -> client): SYN, ACK + ISN + options, e.g. MSS
  Segment 3 (client -> server): ACK

*ISN: initial sequence number

Termination; client: active close, server: passive close
  Application close => Segment 1 (client -> server): FIN
  Segment 2 (server -> client): ACK; passive close => send EOF to application (half-close #1); application can still send data
  Application close => Segment 3 (server -> client): FIN (half-close #2)
  Segment 4 (client -> server): ACK

=> Connection establishment & termination take at least 1 RTT



MTU and MSS: Maximum Segment Size

Client                                    Server
Application: request to connect to server
TCP: find network interface
     ---- SYN, MSS=536 ---->
     <--- SYN, ACK, MSS=1460 ----          (TCP connection establishment)

TCP:        MSS = 536                      MSS = 1460
            - Fixed TCP header = 20        - Fixed TCP header = 20
IP:         - Fixed IP header = 20         - Fixed IP header = 20
Link Layer: MTU = 576 (e.g. modem)         MTU = 1500 (e.g. Ethernet)

MSS is optionally announced (not negotiated) by each host at TCP connection establishment. The smaller value is used by both ends, i.e. 536 in the above example.
Note that “real” TCP payload is smaller if TCP options are used.
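
The MSS arithmetic of this example can be written down directly. This is only a sketch assuming fixed 20-byte IP and TCP headers without options; the function names are illustrative.

```python
def mss_from_mtu(mtu: int, ip_header: int = 20, tcp_header: int = 20) -> int:
    """MSS a host would announce for a link with the given MTU."""
    return mtu - ip_header - tcp_header


def mss_in_use(client_mtu: int, server_mtu: int) -> int:
    """Both ends announce their own MSS; the smaller value is used."""
    return min(mss_from_mtu(client_mtu), mss_from_mtu(server_mtu))


if __name__ == "__main__":
    print(mss_from_mtu(576))       # client behind a modem link
    print(mss_from_mtu(1500))      # server on Ethernet
    print(mss_in_use(576, 1500))   # value used by both ends
```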



Cumulative Acknowledgements

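A new cumulative ack is generated only on receipt of a new in-sequence segment

[Figure: TCP sender -> router -> TCP receiver; data segments travel towards the receiver, acks travel back to the sender.
 Timestep 1: data 37...40 in flight; received: ..., 35, 36; acks 33...36 travelling back.
 Timestep 2: data 38...41 in flight; received: ..., 35, 36, 37; acks 34...37 travelling back.]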



Delayed Acknowledgements

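Delaying acks reduces ack traffic
An ack is delayed until
• another segment is received, or
• delayed ack timer expires (200 ms typical)

[Figure: same setup as on the previous slide.
 Timestep 1: data 37...40 in flight; received: ..., 35, 36; acks 33 and 35 travelling back.
 Timestep 2: data 38...41 in flight; received: ..., 35, 36, 37; acks 35 and 37 travelling back.
 New ack not produced on receipt of segment 36, but on receipt of 37.]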



Duplicate Acknowledgements 1

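A dupack is generated whenever an out-of-order segment arrives at the receiver (here: packet 37 gets lost)

[Figure: Timestep 1: data 37...40 in flight, packet 37 is lost; received: ..., 36; acks 34 and 36 travelling back.
 2 timesteps later: data 39...42 in flight; received: ..., 36, x, 38; acks 36 and 36 travelling back (dupack on receipt of 38).]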



Duplicate Acknowledgements 2

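Dupacks are not delayed
Dupacks may be generated when
• a segment is lost (see previous slide), or
• a segment is delivered out-of-order:

[Figure: Timestep 1: data in flight in arrival order 38, 37, 39, 40 (37 and 38 swapped); received: ..., 36; acks 34 and 36 travelling back.
 1 timestep later: data in flight in arrival order 37, 39, 40, 41; received: ..., 36, x, 38; acks 36 and 36 travelling back (dupack on receipt of 38).]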
Duplicate Acknowledgements 3

Number of dupacks depends on how much out-of-order a packet is
A series of dupacks allows the sender to guess that a single packet has been lost

[Figure: segment 37 is delivered two places late (arrival order 38, 39, 37, 40), shown over four timesteps.
 On receipt of 38: dupack 36. On receipt of 39: second dupack 36. On receipt of 37: new ack 39.
 Acks travelling back at the end: 36, 36 (dupack), 36 (dupack), 39 (new ack).]
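
The receiver behaviour shown on the acknowledgement slides can be reproduced with a few lines. This is a simplified model, not a TCP implementation: it uses packet numbers as in the figures, acks every arriving packet immediately (no delayed acks), and the function name is made up.

```python
def receiver_acks(arrivals, last_in_order):
    """Return the cumulative ack sent in response to each arriving packet.

    last_in_order is the highest packet received contiguously so far.
    """
    received = set()
    acks = []
    for packet in arrivals:
        received.add(packet)
        # Advance over every packet that is now contiguously available.
        while last_in_order + 1 in received:
            last_in_order += 1
        acks.append(last_in_order)  # repeats of the same value are dupacks
    return acks


if __name__ == "__main__":
    print(receiver_acks([37, 38, 39], 36))      # in order: only new acks
    print(receiver_acks([38, 39, 40], 36))      # 37 lost: dupacks for 36
    print(receiver_acks([38, 39, 37, 40], 36))  # 37 two places late
```

In the last call the receiver answers 36, 36, 39, 40: two dupacks followed by a new ack that covers the late packet.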
39



Window Based Flow Control 1
Sliding window protocol

[Figure: segments 1 ... 13 with the sender’s window covering part of the sequence; segments before the window: acks received; segments after the window: not transmitted]

Window size W is minimum of
• receiver’s advertised window - determined by available buffer space at the receiver and signalled with each ack
• congestion window - determined by the sender, based on received acks
TCP’s window based flow control is “self-clocking”:
• New segments are sent when outstanding segments are ack’d



Window Based Flow Control 2
Optimum window size:
• W = data rate * RTT = “bandwidth-delay product” (optimum use of link capacity: “pipe is full”)

[Figure: TCP sender -> router -> TCP receiver; data segments 37...40 travelling to the receiver, acks 33...36 travelling back; W = 8 segments (33...40). Legend, packet dimensions: rate, size, transmit time.]

What if window size is too large?
• Queuing at intermediate routers (e.g. at wireless access point)
  => increased RTT due to queuing delays
  => potential of packet loss
What if window size is too small?
• Inefficiency: unused link capacity
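
As a worked example of the bandwidth-delay product, the sketch below converts a data rate and an RTT into a window size in segments. The link parameters in the usage lines and the default segment size of 1460 bytes are assumptions chosen for illustration.

```python
import math


def optimum_window_segments(rate_bit_per_s: int, rtt_ms: int,
                            mss_bytes: int = 1460) -> int:
    """Window (in segments) needed to keep the pipe full: W = rate * RTT."""
    bdp_bytes = rate_bit_per_s * rtt_ms / 8000  # bits*ms -> bytes
    return math.ceil(bdp_bytes / mss_bytes)


if __name__ == "__main__":
    # Assumed example links, not taken from the slides.
    print(optimum_window_segments(2_000_000, 100))    # 2 Mbit/s, 100 ms RTT
    print(optimum_window_segments(64_000, 500, 536))  # 64 kbit/s, 500 ms RTT
```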



Packet Loss Detection Based on Timeout
TCP sender starts a timer for a segment (only one segment at a time)
If ack for the timed segment is not received before timer expires, outstanding data are assumed to be lost and retransmitted
=> go-back-N ARQ
Retransmission timeout (RTO) is calculated dynamically based on measured RTT:
• RTO = mean RTT + 4 * mean deviation of RTT
  ◦ Mean deviation δ = average of |sample – mean| is easier to calculate than standard deviation (and larger, i.e. more conservative)
• Large variations in the RTT increase the deviation, leading to larger RTO
• RTT is measured as a discrete variable, in multiples of a “tick”:
  ◦ 1 tick = 500 ms in many implementations
  ◦ smaller tick sizes in more recent implementations (e.g. Solaris)
• RTO is at least 2 clock ticks
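
The RTO formula can be sketched as a small estimator. The smoothing gains (1/8 for the mean, 1/4 for the deviation) and the initial deviation of half the first sample are assumptions borrowed from common practice, not values given on this slide; the class name is illustrative, and the quantisation of RTT samples to ticks is left out.

```python
class RtoEstimator:
    """RTO = mean RTT + 4 * mean deviation, with a floor of 2 clock ticks."""

    def __init__(self, first_rtt: float, tick: float = 0.5,
                 gain_mean: float = 1 / 8, gain_dev: float = 1 / 4):
        self.mean = first_rtt
        self.dev = first_rtt / 2  # assumed initial deviation
        self.tick = tick
        self.gain_mean = gain_mean
        self.gain_dev = gain_dev

    def update(self, sample: float) -> None:
        """Fold a new RTT measurement into the running estimates."""
        error = sample - self.mean
        self.mean += self.gain_mean * error
        self.dev += self.gain_dev * (abs(error) - self.dev)

    def rto(self) -> float:
        return max(self.mean + 4 * self.dev, 2 * self.tick)


if __name__ == "__main__":
    est = RtoEstimator(first_rtt=1.0)
    for sample in (1.0, 1.2, 3.0, 1.1):  # seconds, made-up measurements
        est.update(sample)
        print(f"sample={sample:.1f}s  mean={est.mean:.3f}s  rto={est.rto():.3f}s")
```

The single outlier of 3.0 s raises the deviation term much more than the mean, which is what makes the timeout conservative on paths with variable delay.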
Exponential Backoff

Double RTO on successive timeouts:

[Figure: timeline. Segment transmitted; timeout T1 = RTO occurs before ack received, segment retransmitted; timeout interval doubled: T2 = 2 * T1]

Total time until TCP gives up is up to 9 min

Rationale: Allow an intermediate, congested router to recover
Problem: If ack is lost, TCP just waits for the next timeout
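
The doubling rule can be sketched as a schedule of timeout intervals. The initial RTO of 1.5 s, the upper limit of 64 s and the count of 13 timeouts are assumptions chosen for illustration; real stacks differ. With these values the total comes out at roughly nine minutes, in line with the figure quoted above.

```python
def backoff_schedule(initial_rto: float = 1.5, cap: float = 64.0,
                     timeouts: int = 13):
    """Timeout intervals in seconds, doubling after each timeout up to cap."""
    schedule, rto = [], initial_rto
    for _ in range(timeouts):
        schedule.append(rto)
        rto = min(2 * rto, cap)
    return schedule


if __name__ == "__main__":
    intervals = backoff_schedule()
    print(intervals)
    print(f"total: {sum(intervals):.1f} s = {sum(intervals) / 60:.1f} min")
```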



Packet Loss Detection Based on Dupacks:
Fast Retransmit Mechanism

TCP sender considers timeout as a strong indication that there is a severe link problem
On the other hand, continuous reception of dupacks indicates that following segments are delivered, and the link is ok
=> TCP sender assumes that a (single) packet loss has occurred if it receives three dupacks consecutively
=> Only the (single) missing segment is retransmitted
=> selective-repeat ARQ

Note: 3 dupacks are also generated if a segment is delivered at least 3 places out-of-order
=> Fast retransmit useful only if lower layers deliver packets “almost ordered” - otherwise, unnecessary fast retransmit
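
The sender side of fast retransmit can be sketched as a dupack counter. As in the earlier examples, this simplified model uses packet numbers instead of byte sequence numbers, and the function name is hypothetical.

```python
def fast_retransmits(acks, threshold: int = 3):
    """Return the packets a sender would fast-retransmit for a stream of acks."""
    retransmitted = []
    last_ack, dupacks = None, 0
    for ack in acks:
        if ack == last_ack:
            dupacks += 1
            if dupacks == threshold:
                # The third dupack for i suggests that packet i + 1 is missing.
                retransmitted.append(ack + 1)
        else:
            last_ack, dupacks = ack, 0
    return retransmitted


if __name__ == "__main__":
    print(fast_retransmits([35, 36, 36, 36, 36]))      # 37 lost
    print(fast_retransmits([35, 36, 36, 36, 39, 40]))  # 37 only two places late
```

The first stream triggers a retransmission of packet 37; the second contains only two dupacks, so a packet that is merely two places late causes no unnecessary retransmission.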



Flow Control by the Sender
Slow Start
Initially, congestion window size (cwnd) = 1 MSS
Increment cwnd by 1 MSS on each new ack
Slow start phase ends when cwnd reaches ssthresh (slow-start threshold)
=> cwnd grows exponentially with time during slow start (in theory)
• Factor of 1.5 per RTT if every other segment is ack’d
• Factor of 2 per RTT if every segment is ack’d
• In practice: increase is slower because of network delays (see next slide)

Congestion Avoidance
On each new ack, increase cwnd by 1/cwnd segments
=> cwnd grows linearly with time during congestion avoidance (in theory)
• 1/2 MSS per RTT if every other segment ack’d
• 1 MSS per RTT if every segment ack’d



Slow Start & Congestion Avoidance – Theory

[Figure: cwnd (segments, 0...14) over Time / RTT (0...9). Slow Start up to ssthresh = 8, then Congestion Avoidance up to the receiver’s advertised window = 12.]

• Theoretical assumption: after sending n segments, n acks arrive within one RTT.
• Note that Slow Start starts slowly, but speeds up quickly.
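
The theoretical curve can be reproduced with a per-RTT simulation. This sketch assumes that every segment is ack'd within one RTT, so cwnd doubles per RTT in slow start and grows by one segment per RTT in congestion avoidance; the default ssthresh of 8 and advertised window of 12 are the values labelled in the figure, and the function name is made up.

```python
def window_trace(rtts: int, ssthresh: int = 8, advertised_window: int = 12):
    """Usable window (in segments) at the start of each RTT."""
    cwnd, trace = 1, []
    for _ in range(rtts):
        # The sender may never exceed the receiver's advertised window.
        trace.append(min(cwnd, advertised_window))
        if cwnd < ssthresh:
            cwnd = min(2 * cwnd, ssthresh)  # slow start: exponential growth
        else:
            cwnd += 1                       # congestion avoidance: linear growth
    return trace


if __name__ == "__main__":
    print(window_trace(10))
```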
Slow Start – Reality (Including Network Delay)
Taking network delay into account, “cwnd increases exponentially” turns into:
• cwnd increases sub-exponentially
• pairs of segments are sent while pipe fills (timestep 4 onwards): sending rate > data rate (cwnd > 2)
  => at some point in time there will be a packet loss, causing TCP to slow down
Simple example:
• one-way delay = 1 timestep
• data rate = 1 segment / timestep

Timestep | Sender action         | cwnd | #segments sent | #segments outstanding | #segments recv'd and ack'd | Receiver action
0        | initial values        | 1    |                | 0                     |                            |
         | send segment 1        |      | 1              | 1                     |                            |
1        |                       |      |                |                       | 1                          | receive and ack segment 1
2        | receive ack 1         | 2    |                | 0                     |                            |
         | send segments 2 and 3 |      | 2              | 2                     |                            |
3        |                       |      |                |                       | 1                          | receive and ack segment 2
4        | receive ack 2         | 3    |                | 1                     | 1                          | receive and ack segment 3
         | send segments 4 and 5 |      | 2              | 3                     |                            |
5        | receive ack 3         | 4    |                | 2                     | 1                          | receive and ack segment 4
         | send segments 6 and 7 |      | 2              | 4                     |                            |
6        | receive ack 4         | 5    |                | 3                     | 1                          | receive and ack segment 5
         | send segments 8 and 9 |      | 2              | 5                     |                            |



Congestion Control after Packet Loss
Packet loss detected by timeout (=> severe link problem):
Retransmit lost segments
Go back to Slow Start:
• Reduce cwnd to initial value of 1 MSS
• Set ssthresh to half of window size before packet loss:
  ◦ ssthresh = max((min(cwnd, receiver’s advertised window)/2), 2 MSS)

Packet loss detected by ≥3 dupacks (=> single packet loss, but link is ok):
Fast Retransmit single missing segment
Initiate Fast Recovery:
• Set ssthresh and cwnd to half of window size before packet loss:
  ◦ ssthresh = max((min(cwnd, receiver’s advertised window)/2), 2 MSS)
  ◦ cwnd = ssthresh + number of dupacks
• When a new ack arrives: continue with Congestion Avoidance:
  ◦ cwnd = ssthresh
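
The two reactions can be sketched as plain functions. All window values are in units of segments (MSS), the function names are invented for this example, and the retransmission itself is not modelled.

```python
def halve_window(cwnd: float, advertised_window: float) -> float:
    """ssthresh = max(min(cwnd, advertised window) / 2, 2 MSS)."""
    return max(min(cwnd, advertised_window) / 2, 2)


def on_timeout(cwnd: float, advertised_window: float):
    """Loss detected by timeout: back to Slow Start. Returns (cwnd, ssthresh)."""
    return 1, halve_window(cwnd, advertised_window)


def on_three_dupacks(cwnd: float, advertised_window: float, dupacks: int = 3):
    """Loss detected by dupacks: Fast Recovery. Returns (cwnd, ssthresh)."""
    ssthresh = halve_window(cwnd, advertised_window)
    return ssthresh + dupacks, ssthresh


def on_new_ack_after_recovery(ssthresh: float) -> float:
    """A new ack ends Fast Recovery; continue with Congestion Avoidance."""
    return ssthresh


if __name__ == "__main__":
    print(on_timeout(cwnd=20, advertised_window=64))
    cwnd, ssthresh = on_three_dupacks(cwnd=8, advertised_window=64)
    print(cwnd, ssthresh, on_new_ack_after_recovery(ssthresh))
```

The usage lines start from cwnd = 20 for the timeout case and cwnd = 8 for the dupack case; the advertised window of 64 segments is an assumed value that is large enough not to limit the sender.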



Packet Loss Detected by Timeout

[Figure: cwnd (segments, 0...25) over Time / RTT (0...25). Labelled values: Timeout at cwnd = 20, ssthresh = 10, ssthresh = 8, cwnd = 1.]
Packet Loss Detected by ≥3 Dupacks

[Figure: cwnd (segments, 0...10) over Time / RTT (0...14). Labelled values: ≥3 dupacks at cwnd = 8, ssthresh = 4, cwnd = 4 after Fast Recovery.]

After fast retransmit and fast recovery, the window size is reduced to half
Multiple packet losses within one RTT can result in timeout
Summary: TCP
TCP provides a connection-oriented, reliable byte-stream service:
• application data stream is transferred in segments based on lower layer MTU
• receiver sends back cumulative acknowledgements (acks)
• sliding window mechanism with flow control based on
  ◦ receiver’s advertised window,
  ◦ sender’s Slow Start and Congestion Avoidance mechanisms
• Error control & packet loss detection based on
  ◦ adaptive retransmission timeout => back to Slow Start,
  ◦ duplicate acknowledgments (dupacks) => Fast Retransmit & Fast Recovery



References

The bible:
W. Richard Stevens, “TCP/IP Illustrated, Volume 1: The Protocols”

Douglas E. Comer: Computernetzwerke und Internets. 3rd edition, Pearson Studium, Prentice Hall, 2002

The Internet...
Standards (RFCs): http://www.ietf.org/
