Internet Protocol

Internet Protocols
The Internet protocols are the world's most popular open-system (nonproprietary) protocol suite because they can be used to communicate across any set of interconnected networks and are equally well suited for LAN and WAN communications. The Internet protocols consist of a suite of communication protocols, of which the two best known are the Transmission Control Protocol (TCP) and the Internet Protocol (IP).
Internet protocols span the complete range of OSI model layers.
The Four Layers of the TCP/IP Protocol Suite
Layer        Examples
Application  Telnet, FTP, E-mail
Transport    TCP, UDP
Network      IP, ICMP, IGMP
Link         Device driver and interface card
Two Hosts on a LAN Running FTP
[Figure: an FTP client and an FTP server on the same Ethernet. On each host the application layer (FTP) is a user process that handles application details. The transport (TCP), network (IP), and link (Ethernet driver) layers are in the kernel and handle communication details. Peer layers communicate using the FTP, TCP, IP, and Ethernet protocols respectively.]
Two Networks Connected through a Router
[Figure: an FTP client on an Ethernet and an FTP server on a token ring, joined by a router. The FTP protocol and the TCP protocol run between the two hosts. The IP protocol runs between each host and the router. The router has an Ethernet driver on the Ethernet side and a token ring driver on the token ring side.]
Various Protocols at Different Layers in the TCP/IP Protocol Suite
[Figure: user processes at the application layer; TCP and UDP at the transport layer; ICMP, IP, and IGMP at the network layer; ARP, the hardware interface, and RARP at the link layer, above the media.]
Internet Addresses
Every interface on an internet must have a unique Internet address (also called an IP address). These addresses are 32-bit numbers.
The 32-bit addresses are normally written as four decimal numbers, one for each byte of the address. This is called dotted-decimal notation. A multihomed host will have multiple IP addresses, one per interface.
The five classes of Internet addresses:
Class  Leading bits  Remaining fields
A      0             7-bit NetId, 24-bit HostId
B      10            14-bit NetId, 16-bit HostId
C      110           21-bit NetId, 8-bit HostId
D      1110          28-bit multicast group ID
E      11110         27 bits (reserved for future use)
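The class of an address follows directly from its leading bits. The following is a minimal sketch, not part of the original slides; the function name address_class is hypothetical, and any address whose first four bits are 1111 is treated as class E.

```python
import ipaddress

def address_class(dotted: str) -> str:
    """Return the classful category (A to E) of a dotted-decimal IPv4 address."""
    value = int(ipaddress.IPv4Address(dotted))  # the 32-bit number behind the notation
    first_octet = value >> 24                   # the class is decided by the leading bits
    if first_octet & 0x80 == 0x00:   # 0xxxxxxx
        return "A"
    if first_octet & 0xC0 == 0x80:   # 10xxxxxx
        return "B"
    if first_octet & 0xE0 == 0xC0:   # 110xxxxx
        return "C"
    if first_octet & 0xF0 == 0xE0:   # 1110xxxx
        return "D"
    return "E"                       # 1111xxxx (assumption: all treated as class E)

if __name__ == "__main__":
    for addr in ("10.0.0.1", "169.254.0.15", "192.168.1.1", "224.0.1.1"):
        print(addr, address_class(addr))
```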
The Domain Name System

Although the network interfaces on a host, and therefore the host itself, are known by IP addresses, humans work best using the name of a host. In the TCP/IP world the Domain Name System (DNS) is a distributed database that provides the mapping between IP addresses and hostnames.
Encapsulation of Data as It Goes Down the Protocol Stack
[Figure: each layer prepends its own header to what it receives from the layer above.
Application: Appl header + user data
TCP: TCP header (20 bytes) + application data = TCP segment
IP: IP header (20 bytes) + TCP segment = IP datagram
Ethernet driver: Ethernet header (14 bytes) + IP datagram + Ethernet trailer = Ethernet frame
The Ethernet frame carries 46 to 1500 bytes between its header and trailer.]
Demultiplexing of a Received Ethernet Frame
[Figure: an incoming frame is passed up the stack in three steps.
1. The Ethernet driver demultiplexes on the frame type in the Ethernet header: ARP, RARP, or IP.
2. IP demultiplexes on the protocol value in the IP header: ICMP, IGMP, TCP, or UDP.
3. TCP and UDP demultiplex on the port number in the TCP or UDP header to reach the application.]
Client-Server Model
Most networking applications are written assuming one side is the client and the other the server. The purpose of the application is for the server to provide some defined service for clients. We can categorize servers into two classes: iterative or concurrent.
Port Numbers
TCP and UDP identify applications using 16-bit port numbers. Servers are normally known by their well-known port number. For example, every TCP/IP implementation that provides an FTP server uses TCP port 21, and every Telnet server is on TCP port 23. For any implementation of TCP/IP, the well-known port numbers are between 1 and 1023. A client does not care what port number it uses on its end. Client port numbers are called ephemeral ports (i.e., short lived). Most TCP/IP implementations allocate ephemeral port numbers between 1024 and 5000.
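A client can leave the choice of port to the system by binding to port 0. The following is a minimal sketch, not part of the original slides; the function name ephemeral_port is hypothetical, and the range actually used depends on the operating system (modern systems often use a range above 5000).

```python
import socket

def ephemeral_port() -> int:
    """Bind a TCP socket to port 0 and return the port the system picked."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))    # port 0 means: let the system choose
        return sock.getsockname()[1]   # (address, port) of the bound socket

if __name__ == "__main__":
    print("system-assigned port:", ephemeral_port())
```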
Application Programming Interfaces

Two popular application programming interfaces (APIs) for applications using the TCP/IP protocols are called sockets and TLI (Transport Layer Interface). The former is sometimes called "Berkeley sockets". The latter, originally developed by AT&T, is sometimes called XTI (X/Open Transport Interface).
Link Layer
- Sends and receives IP datagrams for the IP module
- Sends and receives ARP requests and replies
- Sends and receives RARP requests and replies
- Different link layers: Ethernet, token ring, FDDI, serial lines (SLIP and PPP), and the loopback driver
- Two standards: Ethernet and IEEE 802
- MTU and path MTU
Ethernet and IEEE 802.3

Ethernet is a family of local area network (LAN) specifications with three main categories:
- Ethernet and IEEE 802.3: LAN specifications that operate at 10 Mbps over coaxial cable.
- 100-Mbps Ethernet: a single LAN specification, also known as Fast Ethernet, that operates at 100 Mbps over twisted-pair cable.
- 1000-Mbps Ethernet: a single LAN specification, also known as Gigabit Ethernet, that operates at 1000 Mbps (1 Gbps) over fiber and twisted-pair cables.
Ethernet/IEEE 802.3 Physical Characteristics

Characteristic              Ethernet             IEEE 802.3 10Base5   IEEE 802.3 10Base2  IEEE 802.3 10BaseT
Data rate (Mbps)            10                   10                   10                  10
Signaling method            Baseband             Baseband             Baseband            Baseband
Maximum segment length (m)  500                  500                  185                 100
Media                       50-ohm coax (thick)  50-ohm coax (thick)  50-ohm coax (thin)  Unshielded twisted pair
Topology                    Bus                  Bus                  Bus                 Star
Characteristics of 100BaseT Media Types

Characteristic              100BaseTX                            100BaseFX                         100BaseT4
Cable                       Category 5 UTP, or Type 1 and 2 STP  62.5/125 micron multi-mode fiber  CAT 3, 4, 5 UTP
Number of pairs or strands  2 pairs                              2 strands                         4 pairs
Connector                   ISO-8877 (RJ45) connector            (not given)                       ISO-8877 (RJ45) connector
Maximum segment length      100 meters                           400 meters                        100 meters
Maximum network diameter    200 meters                           400 meters                        200 meters
Various frame fields exist for both Ethernet and IEEE 802.3.
Ethernet Encapsulation (RFC 894)
Field                Size (bytes)
Destination address  6
Source address       6
Type                 2
Data                 46-1500
CRC                  4

Type  Data field contents
0800  IP datagram (46-1500 bytes)
0806  ARP request/reply (28 bytes) + PAD (18 bytes)
8035  RARP request/reply (28 bytes) + PAD (18 bytes)
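IEEE 802.2/802.3 Encapsulation (RFC 1042)

Header      Field                Size (bytes)  Value
802.3 MAC   Destination address  6
802.3 MAC   Source address       6
802.3 MAC   Length               2
802.2 LLC   DSAP                 1             AA
802.2 LLC   SSAP                 1             AA
802.2 LLC   Cntl                 1             03
802.2 SNAP  Org code             3             00
802.2 SNAP  Type                 2
            Data                 38-1492
            CRC                  4

Type  Data field contents
0800  IP datagram (38-1492 bytes)
0806  ARP request/reply (28 bytes) + PAD (10 bytes)
8035  RARP request/reply (28 bytes) + PAD (10 bytes)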
Gigabit Ethernet
Gigabit Ethernet is an extension of the IEEE 802.3 Ethernet standard. Gigabit Ethernet offers 1000 Mbps of raw-data bandwidth while maintaining compatibility with Ethernet and Fast Ethernet network devices. Gigabit Ethernet provides for new, full-duplex operating modes for switch-to-switch and switch-to-end-station connections. It also permits half-duplex operating modes for shared connections by using repeaters and CSMA/CD. Furthermore, Gigabit Ethernet uses the same frame format, frame size, and management objects used in existing IEEE 802.3 networks. In general, Gigabit Ethernet is expected to initially operate over fiber-optic cabling but will be implemented over Category 5 unshielded twisted-pair (UTP) and coaxial cabling as well.
Migrating to Gigabit Ethernet

- Upgrading switch-to-switch links
- Upgrading switch-to-server links
- Upgrading a Fast Ethernet backbone
- Upgrading a shared FDDI backbone
- Upgrading high-performance desktops
SLIP: Serial Line IP

SLIP is a simple form of encapsulation for IP datagrams on serial lines (RFC 1055). The framing rules are:
1. The IP datagram is preceded and terminated by the special character called END (0xc0).
2. If a byte of the IP datagram equals the END character, the 2-byte sequence 0xdb, 0xdc is transmitted instead.
3. If a byte of the IP datagram equals the SLIP ESC character (0xdb), the 2-byte sequence 0xdb, 0xdd is transmitted instead.
[Figure: an IP datagram containing the bytes c0 and db, and the resulting SLIP frame: END (c0), then the data with c0 replaced by ESC (db dc) and db replaced by ESC (db dd), then END (c0).]
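The three framing rules can be sketched as a pair of functions. This is an illustrative sketch, not part of the original slides and not a full RFC 1055 implementation; the names slip_encode and slip_decode are hypothetical.

```python
END, ESC = 0xC0, 0xDB          # special characters from the framing rules
ESC_END, ESC_ESC = 0xDC, 0xDD  # second byte of each 2-byte escape sequence

def slip_encode(datagram: bytes) -> bytes:
    """Frame one IP datagram: END, escaped data, END."""
    out = bytearray([END])
    for byte in datagram:
        if byte == END:
            out += bytes([ESC, ESC_END])   # rule 2
        elif byte == ESC:
            out += bytes([ESC, ESC_ESC])   # rule 3
        else:
            out.append(byte)
    out.append(END)
    return bytes(out)

def slip_decode(frame: bytes) -> bytes:
    """Recover the datagram from a single SLIP frame."""
    out = bytearray()
    stream = iter(frame)
    for byte in stream:
        if byte == END:
            continue                       # frame delimiter, carries no data
        if byte == ESC:
            following = next(stream, None)
            if following == ESC_END:
                out.append(END)
            elif following == ESC_ESC:
                out.append(ESC)
            else:
                raise ValueError("invalid SLIP escape sequence")
        else:
            out.append(byte)
    return bytes(out)

if __name__ == "__main__":
    print(slip_encode(b"\x01\xc0\xdb\x02").hex())
```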
PPP: Point to Point Protocol

- A way to encapsulate IP datagrams on a serial link, either an asynchronous link or a bit-oriented synchronous link
- A link control protocol (LCP) to establish, configure, and test the data link
- A family of network control protocols (NCPs)
- PPP has advantages over SLIP
- Specified in RFC 1548 and RFC 1332
Format of PPP Frames
Field        Size (bytes)  Value
Flag         1             7E
Addr         1             FF
Control      1             03
Protocol     2
Information  up to 1500
CRC          2
Flag         1             7E

Protocol  Information field contents
0021      IP datagram
C021      Link control data
8021      Network control data
Loopback Interface
Most implementations support a loopback interface that allows a client and server on the same host to communicate with each other using TCP/IP. The class A network ID 127 is reserved for the loopback interface. By convention, most systems assign the IP address 127.0.0.1 to this interface and assign it the name localhost. An IP datagram sent to the loopback interface must not appear on any network.
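Processing of IP Datagrams by the Loopback Interface
[Figure: the IP output function makes two checks on each outgoing datagram.
1. Does the destination IP address equal a broadcast address or multicast address? If yes, the datagram is placed on the IP input queue by the loopback driver and is also passed to the Ethernet driver.
2. Does the destination IP address equal the interface IP address? If yes, the Ethernet driver places the datagram on the IP input queue. If no, ARP is used to get the destination Ethernet address and the datagram is sent.
On receive, the Ethernet driver demultiplexes on the Ethernet frame type and passes frames to ARP or to the IP input function.]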
MTU
There is a limit on the size of the frame for both Ethernet and 802.3 encapsulation. This limits the number of bytes of data to 1500 and 1492, respectively. This characteristic of the link layer is called the MTU, its maximum transmission unit.
Network                              MTU (bytes)
Hyperchannel                         65535
16 Mbits/sec token ring (IBM)        17914
4 Mbits/sec token ring (IEEE 802.5)  4464
FDDI                                 4352
Ethernet                             1500
IEEE 802.3/802.2                     1492
X.25                                 576
Point-to-Point (low delay)           296
Path MTU
When two hosts on the same network are communicating with each other, it is the MTU of the network that is important. But when two hosts are communicating across multiple networks, each link can have a different MTU. The important numbers are not the MTUs of the two networks to which the two hosts connect, but rather the smallest MTU of any data link that packets traverse between the two hosts. This is called the path MTU.
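As a small worked example, the path MTU is the minimum of the MTUs of the links that the packets cross. This sketch is not part of the original slides; the function name path_mtu and the example route are hypothetical, with the MTU values taken from the table of typical MTUs.

```python
def path_mtu(link_mtus):
    """Return the smallest MTU of any data link on the path."""
    mtus = list(link_mtus)
    if not mtus:
        raise ValueError("a path needs at least one link")
    return min(mtus)

if __name__ == "__main__":
    # Hypothetical path: Ethernet, then a point-to-point link, then FDDI.
    print(path_mtu([1500, 296, 4352]))
```

Both end hosts here sit on networks with large MTUs, but the point-to-point link in the middle determines the result.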
IP: Internet Protocol

- All TCP, UDP, and ICMP data is transmitted as IP datagrams.
- IP provides an unreliable, connectionless datagram delivery service.
- Hosts and routers have a routing table that is used for all routing decisions.
- There are three types of routes: host-specific, network-specific, and default routes.
Operation
The internet protocol implements two basic functions: addressing and fragmentation. The internet modules use the addresses carried in the internet header to transmit internet datagrams toward their destinations. The selection of a path for transmission is called routing. The internet modules use fields in the internet header to fragment and reassemble internet datagrams when necessary for transmission through "small packet" networks.
Format of the internet header (fields in order, widths in bits, 32 bits per row):
Field                Bits
Version              4
IHL                  4
Type of Service      8
Total Length         16
Identification       16
Flags                3
Fragment Offset      13
TTL                  8
Protocol             8
Header Checksum      16
Source Address       32
Destination Address  32
Options              variable
Padding              variable
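The fixed 20-byte part of the header can be unpacked with Python's struct module. This is an illustrative sketch, not part of the original slides; the function name parse_ipv4_header and the sample header bytes are hypothetical, and options are not decoded.

```python
import socket
import struct

def parse_ipv4_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header (options are ignored)."""
    if len(raw) < 20:
        raise ValueError("an IPv4 header is at least 20 bytes")
    (ver_ihl, tos, total_length, ident, flags_frag,
     ttl, protocol, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_ihl >> 4,
        "ihl": ver_ihl & 0x0F,                     # header length in 32-bit words
        "tos": tos,
        "total_length": total_length,
        "identification": ident,
        "dont_fragment": bool(flags_frag & 0x4000),
        "more_fragments": bool(flags_frag & 0x2000),
        "fragment_offset": flags_frag & 0x1FFF,    # in units of 8 bytes
        "ttl": ttl,
        "protocol": protocol,                      # 1 = ICMP, 2 = IGMP, 6 = TCP, 17 = UDP
        "checksum": checksum,
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }

if __name__ == "__main__":
    sample = bytes.fromhex("4500003c1c4640004006b1e6ac100a63ac100a0c")
    for name, value in parse_ipv4_header(sample).items():
        print(name, value)
```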
Type of Service (PreDTRCx)
The Type of Service field is used to indicate the quality of service desired.
- Precedence (000-111)
- D (1 = minimize delay)
- T (1 = maximize throughput)
- R (1 = maximize reliability)
- C (1 = minimize cost)
- x (reserved and set to 0)
Recommended Values for the Type-of-Service Field
Application  Minimize delay  Maximize throughput  Maximize reliability  Minimize cost  Value
Telnet       1               0                    0                     0              0x10
FTP          0               1                    0                     0              0x08
SMTP         0               1                    0                     0              0x08
SNMP         0               0                    1                     0              0x04
Fragmentation
Fragmentation of an internet datagram is necessary when it originates in a local net that allows a large packet size and must traverse a local net that limits packets to a smaller size to reach its destination. The internet fragmentation and reassembly procedure needs to be able to break a datagram into an almost arbitrary number of pieces that can be later reassembled.
Identification
The identification field is used to distinguish the fragments of one datagram from those of another.
Flags (xDM)
- x (reserved and set to 0)
- D (1 = Don't Fragment)
- M (1 = More Fragments)
Fragment Offset
The fragment offset field tells the receiver the position of a fragment in the original datagram.
An Example Fragmentation Procedure

If the total length is less than or equal to the maximum transmission unit, then submit this datagram to the next step in datagram processing; otherwise cut the datagram into two fragments, the first fragment being the maximum size, and the second fragment being the rest of the datagram. The first fragment is submitted to the next step in datagram processing, while the second fragment is submitted to this procedure in case it is still too large.
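The procedure above can be sketched as a loop instead of a recursion. This sketch is not part of the original slides; the function name fragment is hypothetical, a 20-byte header without options is assumed, and it relies on the RFC 791 rule that the fragment offset is counted in units of 8 bytes, so every fragment except the last carries a multiple of 8 data bytes.

```python
def fragment(payload_len: int, mtu: int, header_len: int = 20):
    """Split an IP payload into fragments that fit the given MTU.

    Returns a list of (offset_in_8_byte_units, data_bytes, more_fragments).
    """
    max_data = (mtu - header_len) // 8 * 8   # largest multiple of 8 that fits
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments = []
    offset = 0
    remaining = payload_len
    while header_len + remaining > mtu:      # what is left is still too large
        fragments.append((offset // 8, max_data, True))
        offset += max_data
        remaining -= max_data
    fragments.append((offset // 8, remaining, False))   # last (or only) piece
    return fragments

if __name__ == "__main__":
    # 1481 bytes of IP payload over Ethernet (MTU 1500) need two fragments.
    print(fragment(1481, 1500))
```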
An Example Reassembly Procedure

For each datagram the buffer identifier is computed as the concatenation of the source, destination, protocol, and identification fields. If this is a whole datagram (that is both the fragment offset and the more fragments fields are zero), then any reassembly resources associated with this buffer identifier are released and the datagram is forwarded to the next step in datagram processing.
Options: variable
The options may appear or not in datagrams. They must be implemented by all IP modules (host and gateways). In some environments the security option may be required in all datagrams. The option field is variable in length. There may be zero or more options. There are two cases for the format of an option: Case 1: A single octet of option-type. Case 2: An option-type octet, an option-length octet, and the actual option-data octets.
Checksum
The internet header checksum is recomputed if the internet header is changed. Examples are a reduction of the time to live, additions or changes to internet options, and fragmentation. This checksum at the internet level is intended to protect the internet header fields from transmission errors.
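The slides do not state the algorithm. Per RFC 791 the header checksum is the 16-bit one's complement of the one's complement sum of all 16-bit words in the header, with the checksum field taken as zero during the computation. A minimal sketch follows; the function name internet_checksum and the sample header are hypothetical.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                          # pad to a whole number of words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]    # big-endian 16-bit word
    while total >> 16:                           # fold the carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

if __name__ == "__main__":
    # Sample 20-byte header with the checksum field (bytes 10 and 11) set to zero.
    header = bytes.fromhex("4500003c1c46400040060000ac100a63ac100a0c")
    print(hex(internet_checksum(header)))
```

A receiver can run the same function over the header as received, checksum included; a result of 0 means no error was detected.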
Errors
The internet protocol does not provide a reliable communication facility. There are no acknowledgments either end-to-end or hop-by-hop. There is no error control for data, only a header checksum. There are no retransmissions. There is no flow control.
Internet protocol errors may be reported via the ICMP messages.
C:\sahu\ibm\nwp>ipconfig /all
Windows 2000 IP Configuration
    Host Name : SAHU
    Node Type : Mixed
    IP Routing Enabled : No
    WINS Proxy Enabled : No
Ethernet adapter Local Area Connection:
    Connection-specific DNS Suffix . :
    Description : D-Link DFE-530TX PCI adapter
    Physical Address : 00-80-C8-4D-00-55
    DHCP Enabled : No
    IP Address : 169.254.0.15
    Subnet Mask : 255.255.0.0
    Default Gateway : 169.254.0.2
    DNS Servers : 169.254.0.2
C:\sahu\ibm\nwp>
C:\sahu\ibm\nwp>netstat -r
Route Table
=================================================
Interface List
0x1 MS TCP Loopback interface
0x1000003 ...00 80 c8 4d 00 55 VIA PCI 10/100Mb Fast Ethernet Adapter
=================================================
Active Routes:
Network Destination  Netmask          Gateway       Interface     Metric
0.0.0.0              0.0.0.0          169.254.0.2   169.254.0.15  1
127.0.0.0            255.0.0.0        127.0.0.1     127.0.0.1     1
169.254.0.0          255.255.0.0      169.254.0.15  169.254.0.15  1
169.254.0.15         255.255.255.255  127.0.0.1     127.0.0.1     1
169.254.255.255      255.255.255.255  169.254.0.15  169.254.0.15  1
224.0.0.0            224.0.0.0        169.254.0.15  169.254.0.15  1
255.255.255.255      255.255.255.255  169.254.0.15  169.254.0.15  1
Default Gateway: 169.254.0.2
ARP: Address Resolution Protocol (RFC 826)

- ARP is a basic part of every TCP/IP implementation and normally operates without the application or the system administrator being involved.
- It provides a mapping between 32-bit IP addresses and 48-bit MAC addresses.
- An ARP cache is maintained to store recent mappings. The normal expiration time is 20 minutes.
- The arp command is used to examine and manipulate the cache.
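Operation of ARP When the User Types ftp hostname
[Figure: numbered steps (1) to (9) on the sending host and the hosts on the Ethernet. (1) The resolver converts the hostname to an IP address. (2) FTP asks TCP to establish a connection with that IP address. (3) TCP sends an IP datagram to the IP address. (4)-(6) IP passes the address to ARP, and ARP sends an ARP request as an Ethernet broadcast that reaches the Ethernet driver of every host. (7) The ARP module of the target host replies. (8)-(9) The reply is received by the sender's ARP module and the IP datagram is sent.]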
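ARP Packet Format
Field                 Size (bytes)
Ethernet dest addr    6   (Ethernet header)
Ethernet source addr  6   (Ethernet header)
Frame type            2   (Ethernet header)
Hard type             2
Prot type             2
Hard size             1
Prot size             1
Op                    2
Sender Ethernet addr  6
Sender IP addr        4
Target Ethernet addr  6
Target IP addr        4
The last nine fields form the 28-byte ARP request/reply. ARP maps a 32-bit Internet address to a 48-bit Ethernet address, and RARP maps in the opposite direction.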
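The 28-byte request/reply can be packed field by field. This is an illustrative sketch, not part of the original slides; the function name build_arp_request is hypothetical. It assumes hard type 1 (Ethernet), prot type 0x0800 (IP), and op 1 (ARP request), and it fills the still unknown target Ethernet address with zeros. The addresses in the example are taken from the ipconfig output shown earlier.

```python
import socket
import struct

def build_arp_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    """Pack the 28-byte ARP request that follows the 14-byte Ethernet header."""
    if len(sender_mac) != 6:
        raise ValueError("an Ethernet address is 6 bytes")
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,                            # hard type: Ethernet
        0x0800,                       # prot type: IP
        6,                            # hard size: bytes in an Ethernet address
        4,                            # prot size: bytes in an IP address
        1,                            # op: 1 = ARP request
        sender_mac,
        socket.inet_aton(sender_ip),
        b"\x00" * 6,                  # target Ethernet address, not yet known
        socket.inet_aton(target_ip),
    )

if __name__ == "__main__":
    mac = bytes.fromhex("0080c84d0055")
    print(build_arp_request(mac, "169.254.0.15", "169.254.0.2").hex())
```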
ARP Cache
An ARP cache is maintained on each host. This cache maintains the recent mappings from Internet addresses to hardware addresses. The normal expiration time of an entry in the cache is 20 minutes from the time the entry was created.
We can examine the ARP cache with the arp command. The -a option displays all entries in the cache:
% arp -a
Proxy ARP
Proxy ARP lets a router answer ARP requests on one of its networks for a host on another of its networks. This fools the sender of the ARP request into thinking that the router is the destination host, when in fact the destination host is "on the other side" of the router. The router is acting as a proxy agent for the destination host, relaying packets to it from other hosts.
RARP: Reverse Address Resolution Protocol (RFC 903)

- RARP is used by a system to obtain its IP address when bootstrapping.
- The packet format is the same as for ARP.
- A RARP request is broadcast. It supplies the sender's MAC address and asks for the sender's IP address. The reply is normally unicast.
- RARP is optional in a TCP/IP implementation.
RARP Packet Format

The format of an RARP packet is almost identical to an ARP packet. The only differences are that the frame type is 0x8035 for an RARP request or reply, and the op field has a value of 3 for an RARP request and 4 for an RARP reply.
ICMP: Internet Control Message Protocol (RFC 792)

- ICMP is considered part of the IP layer and is required in every TCP/IP implementation.
- It communicates error messages and other conditions that need attention.
- ICMP messages are acted on by the IP layer or by a higher-layer protocol (TCP or UDP).
- Query messages include the address mask request and reply and the timestamp request and reply.
ICMP Messages Encapsulated within an IP Datagram
[Figure: an IP datagram consists of the IP header (20 bytes) followed by the ICMP message.]
ICMP message format:
Bits   Field
0-7    type
8-15   code
16-31  checksum
32-    contents depend on type and code
ICMP Message Types

Type  Code  Description
0     0     echo reply (ping reply)
3           destination unreachable:
      0     network unreachable
      1     host unreachable
4     0     source quench (elementary flow control)
5           redirect:
      0     redirect for network
      1     redirect for host
8     0     echo request (ping request)
13    0     timestamp request
14    0     timestamp reply
17    0     address mask request
18    0     address mask reply
ICMP echo Request and Reply

Ping program is used to test whether another host is reachable. The program sends an ICMP echo request message to a host, expecting an ICMP echo reply to be returned. Ping also measures the round-trip time to the host, giving us some indication of how "far away" that host is.
Format of ICMPv4 and ICMPv6 Echo Request and Reply Messages
Bits   Field
0-7    type
8-15   code
16-31  checksum
32-47  identifier
48-63  sequence number
64-    optional data
ICMP Address Mask Request and Reply

The ICMP address mask request is intended for a diskless system to obtain its subnet mask at bootstrap time. The requesting system broadcasts its ICMP request.
ICMP Address Mask Request and Reply Messages
Bits   Field
0-7    type (17 or 18)
8-15   code (0)
16-31  checksum
32-47  identifier
48-63  sequence number
64-95  32-bit subnet mask
ICMP Timestamp Request and Reply

The ICMP timestamp request allows a system to query another for the current time. The recommended value to be returned is the number of milliseconds since midnight, Coordinated Universal Time(UTC).
UDP: User Datagram Protocol (RFC 768)

- UDP is a simple, datagram-oriented transport protocol.
- It provides no reliability and does not guarantee delivery.
- If a UDP datagram exceeds the MTU, the IP datagram is fragmented.
- ICMP unreachable errors allow path MTU discovery with UDP.
- ICMP source quench errors may be returned in response to UDP datagrams.
- UDP server design issues: client IP address and port number, the input queue, and restricting IP addresses.
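UDP Encapsulation
[Figure: an IP datagram consists of the IP header (20 bytes) followed by the UDP datagram, which is the UDP header (8 bytes) plus the UDP data.]
UDP header:
Bits   Field
0-15   16-bit source port
16-31  16-bit destination port
32-47  16-bit UDP length
48-63  16-bit UDP checksum
64-    data (if any)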
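The 8-byte UDP header is four 16-bit fields in network byte order. The sketch below is not part of the original slides; the function names build_udp_header and parse_udp_header are hypothetical, and the checksum is left at 0, which in UDP over IPv4 means that the sender did not compute one.

```python
import struct

def build_udp_header(src_port: int, dst_port: int, payload: bytes,
                     checksum: int = 0) -> bytes:
    """Pack the 8-byte UDP header; the length field covers header plus data."""
    length = 8 + len(payload)
    return struct.pack("!HHHH", src_port, dst_port, length, checksum)

def parse_udp_header(raw: bytes) -> dict:
    """Unpack the first 8 bytes of a UDP datagram."""
    if len(raw) < 8:
        raise ValueError("a UDP header is 8 bytes")
    src_port, dst_port, length, checksum = struct.unpack("!HHHH", raw[:8])
    return {"src_port": src_port, "dst_port": dst_port,
            "length": length, "checksum": checksum}

if __name__ == "__main__":
    data = b"hello"
    datagram = build_udp_header(1025, 123, data) + data
    print(parse_udp_header(datagram))
```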
Protocol Application
The major uses of this protocol are the Internet Name Server, the Trivial File Transfer Protocol, and SNMP.
Protocol Number
This is protocol 17 when used in the Internet Protocol.
UDP Server Design

Servers typically interact with the operating system, and most servers need a way to handle multiple clients at the same time. Design issues for a UDP server:
- Client IP address and port number
- Destination IP address
- UDP input queue
- Restricting the local IP address
- Restricting the foreign IP address
- Multiple recipients per port
Broadcasting, Multicasting
There are three kinds of IP addresses: unicast, broadcast, and multicast. Broadcasting is sending a packet to all hosts on a network (usually a locally attached network) and multicasting is sending a packet to a set of hosts on a network.
Broadcasting and multicasting only apply to UDP, where it makes sense for an application to send a single message to multiple recipients. TCP is a connection-oriented protocol that implies a connection between two hosts (specified by IP addresses) and one process on each host (specified by port numbers).
[Figure: filtering that takes place up the protocol stack when a frame is received. At each level (interface card, device driver, IP, UDP) the frame or datagram is either delivered to the next level up or discarded.]
Broadcasting
There are four different forms of IP broadcast addresses:
- Limited broadcast
- Net-directed broadcast
- Subnet-directed broadcast
- All-subnets-directed broadcast
Multicasting
IP multicasting provides two services for an application.
1. Delivery to multiple destinations. There are many applications that deliver information to multiple recipients, such as interactive conferencing and dissemination of mail or news.
2. Solicitation of servers by clients.
Multicast Group Addresses

The format of a class D IP address:
Class D: 1110 (4 bits) followed by the multicast group ID (28 bits)
A multicast group address is the combination of the high-order 4 bits of 1110 and the multicast group ID. These are normally written as dotted-decimal numbers and are in the range 224.0.0.0 through 239.255.255.255. Some multicast group addresses are assigned as well-known addresses by the IANA. The multicast address 224.0.1.1 is for NTP, the Network Time Protocol, 224.0.0.9 is for RIP-2.
Mapping of a Class D IP Address into an Ethernet Multicast Address
The IANA owns an Ethernet address block, which in hexadecimal is 00:00:5e. This is the high-order 24 bits of the Ethernet address, meaning that this block includes addresses in the range 00:00:5e:00:00:00 through 00:00:5e:ff:ff:ff. The IANA allocates half of this block for multicast addresses. An Ethernet multicast address has the low-order bit of its first byte set (01:00:00:00:00:00), so the Ethernet addresses used for IP multicasting are in the range 01:00:5e:00:00:00 through 01:00:5e:7f:ff:ff. The low-order 23 bits of the multicast group ID are copied into the low-order 23 bits of the Ethernet address. The Ethernet broadcast address is ff:ff:ff:ff:ff:ff.
Since the upper 5 bits of the multicast group ID are ignored in this mapping, it is not unique. Thirty-two different multicast group IDs map to each Ethernet address. For example, the multicast addresses 224.128.64.32 (hex e0.80.40.20) and 224.0.64.32 (hex e0.00.40.20) both map into the Ethernet address 01:00:5e:00:40:20.
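The mapping keeps the low-order 23 bits of the class D address and places them under the prefix 01:00:5e. A minimal sketch, not part of the original slides (the function name multicast_mac is hypothetical):

```python
import ipaddress

def multicast_mac(group: str) -> str:
    """Map a class D IP address to its Ethernet multicast address."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not a class D address")
    low23 = int(addr) & 0x7FFFFF       # the upper 5 bits of the group ID are dropped
    mac = 0x01005E000000 | low23       # prefix 01:00:5e with the next bit 0
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -1, -8))

if __name__ == "__main__":
    for group in ("224.128.64.32", "224.0.64.32", "224.0.1.1"):
        print(group, "->", multicast_mac(group))
```

The first two addresses produce the same Ethernet address, which is why the receiving IP layer still has to filter on the full destination IP address.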
How It Works
The sending process specifies a destination IP address that is a multicast address, the device driver converts this to the corresponding Ethernet address, and sends it. The receiving processes must notify their IP layers that they want to receive datagrams destined for a given multicast address, and the device driver must somehow enable reception of these multicast frames. This is called "joining a multicast group." When a multicast datagram is received by a host, it must deliver a copy to all the processes that belong to that multicast group.
Multicast Socket options

The API support for traditional multicasting requires new socket options.
Option              Data type       Description
IP_ADD_MEMBERSHIP   struct ip_mreq  Join a multicast group
IP_DROP_MEMBERSHIP  struct ip_mreq  Leave a multicast group
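In Python the same options are set through setsockopt, with struct ip_mreq passed as 8 packed bytes: the group address followed by the local interface address. The sketch below is not part of the original slides; the helper names make_ip_mreq, join_group, and leave_group are hypothetical, and the interface address 0.0.0.0 is an assumption that leaves the choice of interface to the system.

```python
import socket

def make_ip_mreq(group: str, interface: str = "0.0.0.0") -> bytes:
    """Build struct ip_mreq: multicast group address, then local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(interface)

def join_group(sock: socket.socket, group: str, interface: str = "0.0.0.0") -> None:
    """Ask the IP layer to deliver datagrams sent to the group to this socket."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    make_ip_mreq(group, interface))

def leave_group(sock: socket.socket, group: str, interface: str = "0.0.0.0") -> None:
    """Undo a previous join on the same socket."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP,
                    make_ip_mreq(group, interface))

if __name__ == "__main__":
    print(make_ip_mreq("224.0.1.1").hex())
```

A receiver would typically create a UDP socket, bind it to the destination port, and then call join_group on it. Whether the join succeeds depends on the host having a multicast-capable interface.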
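Multicast Example of a UDP Datagram
[Figure: the sending application calls sendto with destination IP 224.0.1.1 and destination port 123. The datagram passes down through UDP, IPv4, and the datalink, and is sent in an Ethernet frame with destination Ethernet address 01:00:5e:00:01:01 and frame type 0800, followed by the IPv4 header (destination IP 224.0.1.1, protocol UDP), the UDP header (destination port 123), and the UDP data. The receiving application has bound port 123 and joined 224.0.1.1, so its datalink receives frames for 01:00:5e:00:01:01. On the receiving hosts the datalink performs imperfect filtering based on the destination Ethernet address, IPv4 performs perfect software filtering based on the destination IP address, and UDP delivers by port. The Ethernet addresses 00:0a:95:79:bc:b4 and 00:04:ac:17:bf:38 label interfaces in the figure.]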
IGMP: Internet Group Management Protocol

The Internet Group Management Protocol (IGMP) is used by hosts and routers that support multicasting. It lets all the systems on a physical network know which hosts currently belong to which multicast groups. This information is required by the multicast routers, so they know which multicast datagrams to forward onto which interfaces. IGMP is defined in RFC 1112.
Encapsulation of an IGMP message within an IP datagram

Like ICMP, IGMP is considered part of the IP layer. Also like ICMP, IGMP messages are transmitted in IP datagrams. Unlike other protocols, IGMP has a fixed-size message with no optional data. IGMP messages are identified in the IP datagram by a protocol value of 2.
Format of fields in IGMP message
The IGMP version is 1. An IGMP type of 1 is a query sent by a multicast router, and a type of 2 is a report sent by a host. The group address is a class D IP address. In a query the group address is set to 0, and in a report it contains the group address being reported.
Joining a Multicast Group

A process must have a way of joining a multicast group on a given interface. A process can also leave a multicast group that it previously joined. These are required parts of any API on a host that supports multicasting. A process can join the same group on multiple interfaces. Membership in a multicast group on a given interface is dynamic: it changes over time as processes join and leave the group.
IGMP Reports and Queries

IGMP messages are used by multicast routers to keep track of group membership on each of the router's physically attached networks.
1. A host sends an IGMP report when the first process joins a group.
2. A host does not send a report when processes leave a group.
3. A multicast router sends an IGMP query at regular intervals to see if any hosts still have processes belonging to any groups.
4. A host responds to an IGMP query by sending one IGMP report for each group that still contains at least one process.
IGMP reports and queries
TCP: Transmission Control Protocol

- TCP provides a connection-oriented, reliable, byte-stream service.
- Two endpoints communicate with each other over a TCP connection.
- TCP packetizes the user data into segments.
- It sets a timeout any time it sends data.
- It acknowledges data received from the other end.
TCP services
- Reorders out-of-order data
- Discards duplicate data
- Provides end-to-end flow control
- Calculates and verifies a mandatory end-to-end checksum
- Popular applications: Telnet, Rlogin, FTP, and SMTP
TCP Header
The TCP data is encapsulated in an IP datagram.
[Figure: an IP datagram consists of the IP header (20 bytes) followed by the TCP segment, which is the TCP header (20 bytes) plus the TCP data.]
TCP header (fields in order, widths in bits, 32 bits per row):
Field                                 Bits
Source Port                           16
Destination Port                      16
Sequence Number                       32
Acknowledgement Number                32
Header Length                         4
Reserved                              6
Flags (URG, ACK, PSH, RST, SYN, FIN)  6
Window Size                           16
Checksum                              16
Urgent Pointer                        16
Options                               variable
Padding                               variable
Data                                  variable
The sequence number identifies the byte in the stream of data from the sending TCP to the receiving TCP that the first byte of data in this segment represents. When a new connection is being established, the SYN flag is turned on. The sequence number field contains the initial sequence number (ISN) chosen by this host for this connection. The sequence number of the first byte of data sent by this host will be the ISN plus one because the SYN flag consumes a sequence number. The acknowledgment number contains the next sequence number that the sender of the acknowledgment expects to receive.
Flags
URG  The urgent pointer is valid.
ACK  The acknowledgement number is valid.
PSH  The receiver should pass this data to the application as soon as possible.
RST  Reset the connection.
SYN  Synchronize sequence numbers to initiate a connection.
FIN  The sender is finished sending data.
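The fixed 20-byte TCP header and its six flag bits can be decoded as follows. This is an illustrative sketch, not part of the original slides; the function name parse_tcp_header and the sample segment are hypothetical, and options are not decoded.

```python
import struct

# The six flags occupy the low-order 6 bits of the 16-bit word that also
# holds the 4-bit header length and the 6 reserved bits.
FLAG_BITS = (("URG", 0x20), ("ACK", 0x10), ("PSH", 0x08),
             ("RST", 0x04), ("SYN", 0x02), ("FIN", 0x01))

def parse_tcp_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (options are ignored)."""
    if len(raw) < 20:
        raise ValueError("a TCP header is at least 20 bytes")
    (src_port, dst_port, seq, ack, off_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", raw[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_len": (off_flags >> 12) * 4,   # 4-bit field counts 32-bit words
        "flags": [name for name, bit in FLAG_BITS if off_flags & bit],
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
    }

if __name__ == "__main__":
    # Hypothetical SYN from port 1025 to FTP port 21 with ISN 1000.
    syn = struct.pack("!HHIIHHHH", 1025, 21, 1000, 0, (5 << 12) | 0x02, 4096, 0, 0)
    header = parse_tcp_header(syn)
    print(header["flags"], "first data byte will be", header["seq"] + 1)
```

Because the SYN flag consumes one sequence number, the first data byte sent on this hypothetical connection is numbered ISN plus one.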
Connection Establishment and Termination

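TCP is a connection-oriented protocol. Before either end can send data to the other, a connection must be established.
TCP Three-Way Handshake
[Figure: segments exchanged between client and server, with the resulting states.
1. Client to server: SYN J. The client performs the active open and enters SYN_SENT. The server, in LISTEN after its passive open, enters SYN_RCVD.
2. Server to client: SYN K, ACK J+1. The client enters ESTABLISHED.
3. Client to server: ACK K+1. The server enters ESTABLISHED.]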
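Packets Exchanged When a TCP Connection Is Closed
[Figure: segments exchanged between client and server, with the resulting states.
1. Client to server: FIN M. The client performs the active close and enters FIN_WAIT_1. The server performs the passive close and enters CLOSE_WAIT.
2. Server to client: ACK M+1. The client enters FIN_WAIT_2.
3. Server to client: FIN N. The server enters LAST_ACK and the client enters TIME_WAIT.
4. Client to server: ACK N+1. The server enters CLOSED.]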
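Packet Exchange for a TCP Connection
[Figure: the complete exchange between client and server. Connection establishment: SYN J; SYN K, ACK J+1; ACK K+1. Data transfer: data (request), data (reply), ACK of reply. Connection termination: FIN M; ACK M+1; FIN N; ACK N+1. The client passes through SYN_SENT (active open), ESTABLISHED, FIN_WAIT_1 (active close), FIN_WAIT_2, and TIME_WAIT. The server passes through LISTEN (passive open), SYN_RCVD, ESTABLISHED, CLOSE_WAIT (passive close), LAST_ACK, and CLOSED.]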
Timeout of Connection Establishment

There are several instances when the connection cannot be established. In one example the server host is down.
How frequently does the client's TCP send a SYN to try to establish the connection? In the example, the second segment is sent 5.8 seconds after the first, and the third is sent 24 seconds after the second.
BSD implementations of TCP run a timer that goes off every 500 ms.
Maximum Segment Size

The maximum segment size (MSS) is the largest "chunk" of data that TCP will send to the other end. When a connection is established, each end has the option of announcing the MSS it expects to receive. (An MSS option can only appear in a SYN segment.) If one end does not receive an MSS option from the other end, a default of 536 bytes is assumed. (This default allows for a 20-byte IP header and a 20-byte TCP header to fit into a 576-byte IP datagram.)
TCP State Transition Diagram

The rules regarding the initiation and termination of a TCP connection can be summarized in a state transition diagram.
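[Figure: TCP state transition diagram. The starting point is CLOSED. Transitions shown, as "from -> to: event; action":
CLOSED -> LISTEN: appl: passive open; send: <nothing>
CLOSED -> SYN_SENT: appl: active open; send: SYN
LISTEN -> SYN_RCVD: recv: SYN; send: SYN, ACK
LISTEN -> SYN_SENT: appl: send data; send: SYN
SYN_RCVD -> LISTEN: recv: RST
SYN_SENT -> SYN_RCVD: recv: SYN; send: SYN, ACK (simultaneous open)
SYN_SENT -> ESTABLISHED: recv: SYN, ACK; send: ACK
SYN_SENT -> CLOSED: appl: close or timeout
SYN_RCVD -> ESTABLISHED: recv: ACK; send: <nothing>
SYN_RCVD -> FIN_WAIT_1: appl: close; send: FIN
ESTABLISHED (data transfer state) -> CLOSE_WAIT: recv: FIN; send: ACK
ESTABLISHED -> FIN_WAIT_1: appl: close; send: FIN
CLOSE_WAIT -> LAST_ACK: appl: close; send: FIN
LAST_ACK -> CLOSED: recv: ACK; send: <nothing>
FIN_WAIT_1 -> FIN_WAIT_2: recv: ACK; send: <nothing>
FIN_WAIT_1 -> CLOSING: recv: FIN; send: ACK (simultaneous close)
FIN_WAIT_1 -> TIME_WAIT: recv: FIN, ACK; send: ACK
FIN_WAIT_2 -> TIME_WAIT: recv: FIN; send: ACK
CLOSING -> TIME_WAIT: recv: ACK; send: <nothing>
TIME_WAIT -> CLOSED: 2MSL timeout
The states FIN_WAIT_1, FIN_WAIT_2, CLOSING, and TIME_WAIT belong to the active close. CLOSE_WAIT and LAST_ACK belong to the passive close.]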
2MSL Wait State

The TIME_WAIT state is also called the 2MSL wait state. Every implementation must choose a value for the maximum segment lifetime (MSL). It is the maximum amount of time any segment can exist in the network before being discarded. RFC 793 specifies the MSL as 2 minutes. Common implementation values, however, are 30 seconds, 1 minute, or 2 minutes.
TCP Options
The TCP header can contain options. The only options defined in the original TCP specification are the end of option list, no operation, and the maximum segment size option.
Option                Format
End of option list    kind=0 (1 byte)
No operation          kind=1 (1 byte)
Maximum segment size  kind=2 (1 byte), len=4 (1 byte), MSS (2 bytes)
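The three option formats are enough to extract the MSS from the options area of a SYN segment. This is a minimal sketch, not part of the original slides; the function name parse_mss is hypothetical. It falls back to the default of 536 bytes when no MSS option is present, and it skips any other option by its length byte.

```python
def parse_mss(options: bytes, default: int = 536) -> int:
    """Return the MSS announced in a TCP options area, or the default."""
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:                  # end of option list
            break
        if kind == 1:                  # no operation: a single padding byte
            i += 1
            continue
        if i + 1 >= len(options):      # truncated option: stop parsing
            break
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break                      # malformed length: stop parsing
        if kind == 2 and length == 4:  # maximum segment size
            return int.from_bytes(options[i + 2:i + 4], "big")
        i += length                    # skip an option this sketch does not decode
    return default

if __name__ == "__main__":
    print(parse_mss(bytes([2, 4, 0x05, 0xB4])))   # MSS option carrying 1460
    print(parse_mss(b""))                         # no option: default
```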
TCP Server Design

Most TCP servers are concurrent. When a new connection request arrives at a server, the server accepts the connection and invokes a new process to handle the new client. Depending on the operating system, various techniques are used to invoke the new server. Under Unix the common technique is to create a new process using the fork function. Lightweight processes (threads) can also be used. Design issues for a TCP server:
- TCP server port numbers
- Restricting the local IP address
- Restricting the foreign IP address
- Incoming connection request queue
TCP Interactive data Flow

- Interactive data is sent in segments smaller than the MSS.
- Rlogin sends a single byte of data at a time. Telnet can send one line at a time.
- Delayed acknowledgments reduce the number of segments.
- The Nagle algorithm reduces the number of small segments on slower WANs. There is a facility to disable it.
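Remote Echo of an Interactive Keystroke
[Figure: (1) a keystroke at the client is sent to the server as a data byte; (2) the server sends an ACK of the data byte; (3) the server sends the echo of the data byte; (4) the client displays the echo and sends an ACK of the echoed data byte.]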
TCP Bulk Data Flow

- Bulk data transfer: the send and receive buffers are controlled, but there is no control over congestion.
- TCP uses a sliding window protocol for flow control, which handles a fast sender and a slow receiver.
- Sliding windows: the receiver advertises its window size.
- Slow start: the sender maintains a congestion window.
- Bulk data throughput: the bandwidth-delay product.
Figure: Transfer of 3072 bytes from client to server
1. client to server: 1:1025, ack 1, win 4096
2. client to server: 1025:2049, ack 1, win 4096
3. client to server: 2049:3073, ack 1, win 4096
4. server to client: ack 2049, win 4096
5. server to client: ack 3073, win 4096
Sliding Windows
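Figure: Visualization of TCP sliding window
The byte stream is divided into four regions, from left to right:
sent and acknowledged
sent, not ACKed
can send ASAP (the usable window)
can't send until window moves
The offered window, advertised by the receiver, spans the "sent, not ACKed" and "can send ASAP" regions.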
1. The window closes as the left edge advances to the right.
2. The window opens when the right edge moves to the right, allowing more data to be sent.
3. The window shrinks when the right edge moves to the left.
Figure: Movement of window edges (closes, opens, shrinks)
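The edge movements can be made concrete with a small model of the sender's view of the window. This is a simplified sketch under stated assumptions: the class SendWindow and its field names are hypothetical, sequence numbers do not wrap, and retransmission is ignored. The right edge is taken as the acknowledgment number plus the advertised window.

```python
from dataclasses import dataclass


@dataclass
class SendWindow:
    left: int       # first byte sent but not yet acknowledged
    right: int      # first byte beyond the offered window
    next_seq: int   # next byte the sender will transmit

    def usable(self):
        """Bytes the sender can still send right away."""
        return max(0, self.right - self.next_seq)

    def send(self, nbytes):
        """Send up to nbytes, limited by the usable window; return bytes sent."""
        sent = min(nbytes, self.usable())
        self.next_seq += sent
        return sent

    def on_ack(self, ack, win):
        """Apply an ACK and window advertisement; report how the edges moved."""
        moves = []
        if ack > self.left:
            moves.append("closes")      # left edge advances to the right
        new_right = ack + win
        if new_right > self.right:
            moves.append("opens")       # right edge moves to the right
        elif new_right < self.right:
            moves.append("shrinks")     # right edge moves to the left
        self.left = max(self.left, ack)
        self.right = new_right
        return moves


if __name__ == "__main__":
    w = SendWindow(left=1, right=4097, next_seq=1)   # win 4096, starting at byte 1
    for _ in range(3):
        w.send(1024)
    print(w.on_ack(2049, 4096), w.usable())
```

The starting values mirror a 4096-byte advertised window with three 1024-byte segments in flight; an ACK that keeps the advertised window at 4096 both closes and opens the window, so the window slides without changing size.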
Bulk Data Throughput

The window size, the windowed flow control, and slow start interact to determine the throughput of a TCP connection carrying bulk data. We can calculate the capacity of the pipe as
capacity (bits) = bandwidth (bits/sec) x round-trip time (sec)
This is normally called the bandwidth-delay product. This value can vary widely, depending on the network speed and the RTT between the two ends. For example, a T1 telephone line (1,544,000 bits/sec) across the United States (about a 60-ms RTT) gives a bandwidth-delay product of 92,640 bits, or 11,580 bytes.
Either the bandwidth or the delay can affect the capacity of the pipe between the sender and receiver.
Doubling the RTT doubles the capacity of the pipe. Doubling the bandwidth also doubles the capacity of the pipe.
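The bandwidth-delay product for the T1 example can be checked with a short calculation. The function name pipe_capacity_bytes is hypothetical, and the RTT is passed in milliseconds only to keep the arithmetic exact.

```python
def pipe_capacity_bytes(bandwidth_bps, rtt_ms):
    """Bandwidth-delay product in bytes: bandwidth (bits/sec) x RTT (sec) / 8."""
    capacity_bits = bandwidth_bps * rtt_ms / 1000
    return capacity_bits / 8


if __name__ == "__main__":
    t1 = pipe_capacity_bytes(1_544_000, 60)
    print(t1)                                   # T1 line with a 60-ms RTT
    print(pipe_capacity_bytes(1_544_000, 120))  # doubled RTT
    print(pipe_capacity_bytes(3_088_000, 60))   # doubled bandwidth
```

A sender needs at least this many bytes in flight to keep the pipe full, which is why the capacity is usually compared against the receiver's advertised window.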
Congestion
Congestion can occur when data arrives on a big pipe (a fast LAN) and gets sent out a smaller pipe (a slower WAN). Congestion can also occur when multiple input streams arrive at a router whose output capacity is less than the sum of the inputs.
It can lead to the router discarding packets.
TCP Timeout and Retransmit

TCP manages four different timers to provide reliable delivery:
1. A retransmission timer, used when expecting an ACK from the other end.
2. A persist timer, which keeps window size information flowing when the other side advertises a zero-size window.
3. A keepalive timer, which detects a crash or reboot of the other side.
4. A 2MSL timer, which measures the time spent in the TIME_WAIT state.
Persist Timer
Using the window size, the receiver performs flow control by specifying the amount of data it is willing to accept from the sender. What happens when the window size goes to 0? This effectively stops the sender from transmitting data until the window becomes nonzero. If the acknowledgment that reopens the window is lost, we could end up with both sides waiting for the other: the receiver waiting to receive data (since it provided the sender with a nonzero window) and the sender waiting to receive the window update allowing it to send. To prevent this form of deadlock, the sender uses a persist timer that causes it to query the receiver periodically to find out whether the window has been increased.
Silly Window Syndrome

Window-based flow control schemes, such as the one used by TCP, can fall victim to a condition known as the silly window syndrome (SWS). When it occurs, small amounts of data are exchanged across the connection instead of full-sized segments.
TCP Performance
It is now common for off-the-shelf hardware (workstations and faster personal computers) to deliver 800,000 bytes or more per second. It is a worthwhile exercise to calculate the theoretical maximum throughput we could see with TCP on a 10 Mbits/sec Ethernet. The next slide shows the total number of bytes exchanged for a full-sized data segment and an ACK.
Field sizes for Ethernet theoretical maximum throughput calculation.

Field                             Data #bytes   ACK #bytes
Ethernet preamble                           8            8
Ethernet destination address                6            6
Ethernet source address                     6            6
Ethernet type field                         2            2
IP header                                  20           20
TCP header                                 20           20
user data                                1460            0
pad (to Ethernet minimum)                   0            6
Ethernet CRC                                4            4
interpacket gap (9.6 microsec)             12           12
total                                    1538           84
We first assume the sender transmits two back-to-back full-sized data segments, and then the receiver sends an ACK for these two segments. The maximum throughput (user data) is then
throughput = 2 x 1460 bytes / (2 x 1538 + 84 bytes) x 10,000,000 bits/sec / 8 bits/byte = 1,155,063 bytes/sec
If the TCP window is opened to its maximum size (65535, not using the window scale option), this allows a window of 44 1460-byte segments. If the receiver sends an ACK every 22nd segment, the calculation becomes
throughput = 22 x 1460 bytes / (22 x 1538 + 84 bytes) x 10,000,000 bits/sec / 8 bits/byte = 1,183,667 bytes/sec
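Both throughput figures can be reproduced from the field-size totals. The sketch below assumes the totals from the table (1538 bytes on the wire per full-sized data segment, 84 bytes per ACK); the function name max_throughput and the constant names are hypothetical.

```python
DATA_FRAME_BYTES = 1538         # wire bytes for one full-sized data segment
ACK_FRAME_BYTES = 84            # wire bytes for one ACK
USER_DATA_BYTES = 1460          # user data carried per full-sized segment
LINK_BITS_PER_SEC = 10_000_000  # 10 Mbits/sec Ethernet


def max_throughput(segments_per_ack):
    """User-data throughput in bytes/sec when one ACK covers segments_per_ack segments."""
    user_bytes = segments_per_ack * USER_DATA_BYTES
    wire_bytes = segments_per_ack * DATA_FRAME_BYTES + ACK_FRAME_BYTES
    return user_bytes / wire_bytes * LINK_BITS_PER_SEC / 8


if __name__ == "__main__":
    print(int(max_throughput(2)))    # one ACK per two segments
    print(int(max_throughput(22)))   # one ACK per 22 segments
```

The gain from acknowledging less often is small, because the 84-byte ACK is already a minor share of the bytes on the wire.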
On faster networks such as FDDI (100 Mbits/sec), three commercial vendors have reportedly demonstrated TCP throughput between 80 and 98 Mbits/sec.
The following practical limits apply for any real-world scenario.
1. You can't run any faster than the speed of the slowest link.
2. You can't go any faster than the memory bandwidth of the slowest machine. This assumes your implementation makes a single pass over the data.
3. You can't go any faster than the window size offered by the receiver, divided by the round-trip time.
The bottom line in all these numbers is that the real upper limit on how fast TCP can run is determined by the size of the TCP window and the speed of light.
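The window and round-trip time limit is easy to quantify. The following sketch combines it with the slowest-link limit and ignores memory bandwidth; the function names and the example figures (a 65535-byte window and a 60-ms RTT) are assumptions chosen for illustration.

```python
def window_limit(window_bytes, rtt_ms):
    """Upper bound in bytes/sec imposed by the offered window and the RTT."""
    return window_bytes * 1000 / rtt_ms


def throughput_ceiling(slowest_link_bps, window_bytes, rtt_ms):
    """Smaller of the slowest-link limit and the window/RTT limit, in bytes/sec."""
    link_limit = slowest_link_bps / 8
    return min(link_limit, window_limit(window_bytes, rtt_ms))


if __name__ == "__main__":
    print(window_limit(65535, 60))                    # window-imposed bound
    print(throughput_ceiling(1_544_000, 65535, 60))   # T1: the link is the limit
    print(throughput_ceiling(10_000_000, 65535, 60))  # 10 Mbits/sec: the window is the limit
```

With a 60-ms RTT, a 65535-byte window caps throughput at roughly 1.09 million bytes/sec no matter how fast the link is, which is the sense in which window size and propagation delay set the real upper limit.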
Unix: TCP/IP Implementation Details

The Berkeley networking code is the basis for implementations in HP-UX, Sun Solaris, AIX and NT.
Two APIs: sockets and TLI (Transport Layer Interface).
System calls and library functions: socket, connect, listen, accept, send, receive, etc.
4.4BSD supports TCP/IP, XNS, OSI protocols, and Unix domain protocols.
Figure: General organization of networking code in Net/3
Process (OSI layer 7: Application)
System calls (socket, bind, connect, etc.)
Socket layer (OSI layers 6: Presentation and 5: Session)
Protocol layer: TCP/IP, XNS, OSI, UNIX (OSI layers 4: Transport and 3: Network)
Interface layer: Ethernet, SLIP, loopback, etc. (OSI layer 2: Data Link)
Media (OSI layer 1: Physical)
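The sequence of socket calls named above can be illustrated with Python's socket module, which wraps the Berkeley sockets API. This is a sketch over the loopback interface; the function names serve_one and run_exchange and the upper-casing reply are hypothetical choices made so the result is easy to check.

```python
import socket
import threading


def serve_one(listener, received):
    """Server side: accept one connection, read a message, reply in upper case."""
    conn, _addr = listener.accept()             # accept
    with conn:
        data = conn.recv(1024)                  # receive
        received.append(data)
        conn.sendall(data.upper())              # send


def run_exchange(message):
    """Run one client/server exchange and return (server_saw, client_got)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)   # socket
    listener.settimeout(5)
    listener.bind(("127.0.0.1", 0))             # bind to an ephemeral port
    listener.listen(1)                          # listen
    port = listener.getsockname()[1]

    received = []
    server = threading.Thread(target=serve_one, args=(listener, received))
    server.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)     # socket
    client.settimeout(5)
    client.connect(("127.0.0.1", port))         # connect
    client.sendall(message)                     # send
    reply = client.recv(1024)                   # receive
    client.close()

    server.join()
    listener.close()
    return received[0], reply


if __name__ == "__main__":
    print(run_exchange(b"hello"))
```

Each call crosses from the process into the socket layer, which hands the data to the protocol layer (TCP/IP here) and then to the interface layer (the loopback driver).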
NT: TCP/IP Implementation Details

The stack is a high-performance, portable, 32-bit implementation of the standard TCP/IP protocol suite.
Implementation, configuration, and services differ slightly between the NT and 9x platforms.
The TCP/IP suite makes Windows NT an Internet-ready platform.
It offers support for standard features, performance enhancements, and a range of services.
Architectural Model
The suite comprises core protocols, services, and interfaces:
Transport Driver Interface (TDI)
Network Driver Interface Specification (NDIS)
Interfaces for user-mode applications: Windows Sockets and NetBIOS
Figure: Windows NT TCP/IP architectural model
User mode: Windows Sockets, NetBIOS support
Kernel mode: TDI interface; NetBT; TCP, UDP; ICMP, IGMP, IP, ARP; NDIS interface; network card driver(s)
Network media
