White Paper

Supporting Differentiated Service Classes:
TCP Congestion Control Mechanisms

Chuck Semeria
Marketing Engineer

Juniper Networks, Inc.


1194 North Mathilda Avenue
Sunnyvale, CA 94089 USA
408 745 2000 or 888 JUNIPER
www.juniper.net

Part Number: 200022-001 02/02


Contents
Executive Summary
Perspective
TCP Segments and Acknowledgements
  Segments
  TCP Acknowledgement Process
TCP Congestion Control Mechanisms
  Slow-Start
  Congestion Avoidance
  Fast Retransmission
  Fast Recovery
Example: Throughput for a Typical TCP Session
Recent Enhancements to TCP Congestion Control
Summary
Appendix: References
  RFCs
  Textbooks
  Technical Papers

List of Figures
Figure 1: TCP Segments and Sequence Numbers
Figure 2: ACKing a Single Segment or Multiple Segments
Figure 3: Segment Misordering and Duplicate ACKs
Figure 4: TCP Slow-Start
Figure 5: Slow-Start with Congestion Avoidance
Figure 6: Fast Recovery after Fast Retransmission
Figure 7: Throughput for a Sample TCP Flow
Figure 8: Real-World TCP Throughput

Copyright © 2002, Juniper Networks, Inc.


Executive Summary
This paper is part of a series of papers published by Juniper Networks that describe the
mechanisms that allow you to support differentiated service classes in large Internet Protocol
(IP) networks. This paper provides an overview of the classic Transmission Control Protocol
(TCP) congestion control mechanisms, including slow-start, congestion avoidance, fast
retransmit, and fast recovery. It also briefly discusses several other minor enhancements that
are designed to fine-tune TCP performance when responding to congestion indications. Host
TCP congestion control mechanisms work with router active queue memory management
techniques including random early detection (RED), weighted RED (WRED), and explicit
congestion notification (ECN) to allow you to control the average queuing delay while
supporting transient fluctuations in queue size. The other papers in this series provide
technical discussions of queue scheduling disciplines, active queue memory management, and
other issues related to the deployment of differentiated service classes in your network.

Perspective
TCP is the predominant transport protocol used in public and private IP networks. Depending
on the statistics that you reference, TCP-based traffic accounts for 80 to 95 percent of all traffic
on large IP networks.
Applications frequently send large amounts of data across a network from one host to another.
IP is a connectionless protocol that does not guarantee that the data it carries will not be
damaged, lost, duplicated, or misordered. Consequently, applications that require a reliable
data transfer service use TCP to establish virtual connections across an unpredictable and
unreliable network. Without TCP, application developers would have to build reliability
(including packet loss detection and recovery) into each application.
The fundamental characteristics of TCP include the following:
■ TCP is a connection-oriented service. Before data can be transmitted between two hosts,
one system initiates a connection by “calling” the other system. If the system receiving the
connection request accepts the call, messages are exchanged between the two hosts to
verify that the session is authorized and to provide parameters that control the data
exchange.
■ TCP provides a reliable delivery service. While the data stream is transmitted, the hosts at
each end of the connection exchange acknowledgements (ACKs) verifying that data has
been received without error. The source TCP maintains a record of the packets that it sends
and waits for an ACK before sending the next set of packets. The source TCP also maintains
a timer indicating when it sends a packet and retransmits the packet if the timer expires
before the ACK is received from the remote host.
■ The source TCP always attempts to fill the “pipe” between the sending and receiving hosts
while adapting its transmission rate to avoid potential congestion in the network. TCP
continually monitors and modifies its transmission rate so that the rate at which it injects
packets into the network is just below the point at which packet loss occurs.

■ All TCP connections are full duplex. This means that a TCP connection supports the
simultaneous transfer of data in both directions.
While it is beyond the scope of this paper to provide a complete description of TCP, this paper
does provide a brief summary of the relevant TCP congestion control mechanisms that execute
on host systems. It is important to understand how TCP responds to congestion if you are to
fully understand the mechanisms that routers use to support the delivery of differentiated
service classes. Keep in mind that the descriptions in this paper communicate only the essence
of these fundamental concepts and take some liberties with what you would actually see if you
examined a packet trace in a production network. This paper primarily explains the basics
without attempting to explain the behavior of every deviant in every corner case.

TCP Segments and Acknowledgements


This section reviews the following aspects of TCP that are fundamental to understanding its
adaptive timeout and retransmission strategy:
■ Segments

■ Acknowledgements

Segments
The basic unit of transfer between two hosts in a TCP connection is called a segment. A segment
consists of a TCP header and its associated data. Since each TCP segment is transmitted in an
IP datagram and because IP datagrams can be reordered as they cross the network, TCP
segments can arrive at the destination TCP in a different order than originally transmitted by
the source TCP. Of course they can also be corrupted, dropped, or duplicated along the way.
For the stream of bytes that the source TCP transmits to the destination TCP, the source TCP
assigns a sequence number to each byte in the stream. To allow the destination TCP to keep
track of what it has received and reorder misordered segments, each TCP header carries a
32-bit sequence number that is used to identify the data carried in each segment. The sequence
number in each TCP header is set to the specific number that the source TCP assigns to the first
byte of data in the given segment (see Figure 1).

Figure 1: TCP Segments and Sequence Numbers

[Figure: a data stream of 8000 bytes is divided into 1000-byte segments (bytes 1-1000,
1001-2000, 2001-3000, 3001-4000, ... 7001-8000). Each segment consists of a TCP header
followed by data, and each header carries the sequence number of the first data byte in
its segment: 1, 1001, 2001, 3001, ... 7001.]

In Figure 1, the source TCP needs to transmit a stream of 8000 bytes to the destination TCP. The
source TCP divides the stream into eight 1000-byte groups and prepends a TCP header to each
group to create eight TCP segments. Note that the sequence number carried in each TCP
header represents the number that the source TCP assigns to the first byte of data carried in
each segment.
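
The segmentation shown in Figure 1 can be sketched in a few lines of Python. This is a minimal illustration rather than a description of any real TCP stack: the function name `segment_stream` is invented for this example, and numbering the first byte 1 simply follows Figure 1 (a real connection starts from a negotiated initial sequence number).

```python
def segment_stream(total_bytes, segment_size, first_seq=1):
    """Split a byte stream into segments, as in Figure 1.

    Returns a list of (sequence_number, last_byte) tuples, where
    sequence_number is the number assigned to the first data byte
    carried in the segment. All names here are illustrative.
    """
    segments = []
    seq = first_seq
    end = first_seq + total_bytes - 1  # number of the last byte in the stream
    while seq <= end:
        last = min(seq + segment_size - 1, end)  # final segment may be short
        segments.append((seq, last))
        seq = last + 1
    return segments


segments = segment_stream(8000, 1000)
for seq, last in segments:
    print(f"header seq={seq:5d}  data bytes {seq}-{last}")
```

For the 8000-byte stream this yields eight segments whose headers carry sequence numbers 1, 1001, 2001, and so on up to 7001.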

TCP Acknowledgement Process


TCP uses ACKs to support the reliable transmission of data. When the source TCP transmits
segments, it expects the destination TCP to ACK the segments when they are received. Figure 2
illustrates how the destination TCP can respond to the receipt of segments by either sending an
ACK for each individual segment or by acknowledging multiple segments using a single ACK.
Note that the ACK number used by the destination TCP is the number of the next byte in the
stream that the destination TCP expects to receive from the source TCP. If the destination TCP
ACKs 2001, it informs the source TCP that it has successfully received all bytes up to and
including byte 2000.

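Figure 2: ACKing a Single Segment or Multiple Segments

[Figure: two exchanges between a source TCP and a destination TCP. Left: the source sends
bytes 1-1000 and the destination returns ACK = 1001; the source then sends bytes 1001-2000
and the destination returns ACK = 2001. Right: the source sends bytes 1-1000 followed by
bytes 1001-2000, and the destination acknowledges both segments with a single ACK = 2001.]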

Because IP packets can be reordered as they cross your network, TCP segments can also be
reordered as they cross your network. When the destination TCP receives a misordered
segment, it responds by immediately transmitting a duplicate ACK to the source TCP (see
Figure 3).
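
The cumulative and duplicate ACK behavior described above can be sketched as a small receiver model. This is a simplified illustration under stated assumptions: the name `receiver_acks` is hypothetical, and the receiver here ACKs every arriving segment immediately, whereas a real receiver may acknowledge several in-order segments with a single ACK.

```python
def receiver_acks(arrivals, first_seq=1):
    """Return the ACK number sent after each arriving segment.

    arrivals is a list of (seq, length) pairs in arrival order.
    Simplification: every arrival is ACKed at once (no delayed ACKs).
    """
    next_expected = first_seq
    buffered = {}  # out-of-order segments held by the receiver: seq -> length
    acks = []
    for seq, length in arrivals:
        if seq == next_expected:
            next_expected += length
            # Deliver any buffered segments that are now in sequence.
            while next_expected in buffered:
                next_expected += buffered.pop(next_expected)
        elif seq > next_expected:
            buffered[seq] = length  # hole in the stream: hold the segment
        # seq < next_expected is an old duplicate; there is nothing new to buffer.
        acks.append(next_expected)  # cumulative ACK = next byte expected
    return acks


# Segments 1, 2, 4, 3 arrive in that order (segment 4 overtakes segment 3).
acks = receiver_acks([(1, 1000), (1001, 1000), (3001, 1000), (2001, 1000)])
print(acks)  # [1001, 2001, 2001, 4001]
```

The repeated 2001 is the duplicate ACK triggered by the misordered segment; once the missing segment arrives, a single ACK of 4001 covers all data through byte 4000.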

Figure 3: Segment Misordering and Duplicate ACKs

[Figure: exchange between a source TCP and a destination TCP, in time order:
1. Segment 1 (bytes 1-1000) is received.
2. Segment 2 (bytes 1001-2000) is received. The destination sends ACK = 2001, a single ACK
   for all data through byte 2000.
3. Segment 4 (bytes 3001-4000) is received. The destination is expecting byte 2001, so the
   segments are misordered and it responds with a duplicate ACK = 2001.
4. Segment 3 (bytes 2001-3000) is received. The destination sends ACK = 4001, a single ACK
   for all data through byte 4000.
5. Segment 5 (bytes 4001-5000) is received.
6. Segment 6 (bytes 5001-6000) is received. The destination sends ACK = 6001, a single ACK
   for all data through byte 6000.]

TCP Congestion Control Mechanisms


TCP congestion control prevents a source from exceeding network capacity by allowing it to
adapt its transmission rate to avoid congestion in routers, on links, or at the destination host.
The basic congestion control mechanisms supported by TCP include:
■ Slow-start

■ Congestion avoidance

■ Fast retransmission

■ Fast recovery

Slow-Start
When a TCP connection is first established, the source TCP does not transmit a full receiver’s
advertised window of segments. Instead, the source TCP avoids exceeding the capacity of the
network by transmitting only a few packets at the beginning, waiting for the ACKs to those
packets, and then gradually increasing its transmission rate. This allows the source TCP to
probe the network to determine the amount of bandwidth that is currently available for the
connection. This slow-start mechanism is used:
■ At the beginning of each new TCP connection

■ When an existing TCP connection is restarted after a long idle period

■ When an existing TCP connection is restarted after the retransmission timer expires
As a result, slow-start keeps TCP from flooding the network with packets when a new TCP
session is established or immediately after a period of congestion ends.
Figure 4 illustrates the operation of the TCP slow-start mechanism. With slow-start the sender
must maintain a congestion window (cwnd) which represents its estimate of the amount of
traffic that the network can absorb without becoming congested (its transmission window
size). When a TCP session is first established, cwnd is initialized to the size of a single segment
advertised by the destination host at the other end of the connection. The source TCP can
transmit up to the minimum of its cwnd (representing flow control imposed by the sender) and
the destination’s advertised window (representing flow control imposed by the receiver).

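Figure 4: TCP Slow-Start

[Figure: the source sends 1 segment (cwnd = 1) and the destination returns an ACK; the
source then sends 2 segments (cwnd = 2), and then 4 segments (cwnd = 4). An inset graph
plots cwnd against time, starting at 1.]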

The source TCP initiates slow-start by transmitting one segment and waiting for its ACK.
When the ACK is received, the source increases cwnd from one to two, and two segments are
sent. When these two segments are acknowledged, the source increases cwnd from two to four,
and four segments are sent. The exponential growth of cwnd continues until either its value
exceeds the destination’s advertised window or packets are dropped due to congestion. The
following section, "Congestion Avoidance," describes how the source TCP responds to packet
loss.
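
The doubling of cwnd during slow-start can be traced with a short sketch. It works in whole segments and assumes an idealized loss-free network in which every segment sent in one round trip is acknowledged before the next round trip begins; the function name `slow_start` and its parameters are illustrative only.

```python
def slow_start(advertised_window, rounds):
    """Per-round-trip trace of (cwnd, segments sent) during slow-start.

    Units are segments. Assumes no loss and that every segment sent in a
    round trip is ACKed before the next round begins (an idealization).
    """
    cwnd = 1
    trace = []
    for _ in range(rounds):
        # The sender may transmit the smaller of its own limit (cwnd)
        # and the receiver's limit (the advertised window).
        send_window = min(cwnd, advertised_window)
        trace.append((cwnd, send_window))
        if cwnd > advertised_window:
            break  # growth stops once cwnd exceeds the advertised window
        cwnd *= 2  # one extra segment per ACK doubles cwnd each round trip
    return trace


trace = slow_start(advertised_window=16, rounds=10)
for cwnd, sent in trace:
    print(f"cwnd={cwnd:2d}  segments sent={sent}")
```

With an advertised window of 16 segments, cwnd grows 1, 2, 4, 8, 16, 32, but the number of segments actually sent never exceeds 16.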

The source TCP can determine that a packet has been dropped by the network in one of two
ways:
■ Duplicate ACKs

■ Expiration of the retransmission timer


The absence of a single segment in the middle of a transmission window of segments causes
the destination TCP to immediately generate a duplicate ACK. Recall that TCP does not send
negative acknowledgements (NACKs) or ACKs using packet numbers. Rather, the destination
TCP cumulatively ACKs data that has been received in sequence by responding with a
sequence number in the data stream. For example, when a destination TCP receives all of the
data in the stream up to byte 2000, it responds with an ACK of 2001 indicating that the next
segment the destination expects to receive begins with byte 2001. If a segment is dropped by an
intermediate router, the destination TCP continues to buffer subsequent packets as they arrive
but, because it has not received the next segment that it expected to receive, it continues to
ACK 2001. Because the receipt of duplicate ACKs can also mean that the segment is
misordered, the source TCP uses the receipt of three duplicate ACKs as an indication that a
packet is lost and not misordered.
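
The three-duplicate-ACK rule can be expressed as a simple counter on the sender side. The sketch below is illustrative: `fast_retransmit_points` is a made-up name, and the function only reports where a loss would be declared; it does not retransmit anything.

```python
def fast_retransmit_points(acks, threshold=3):
    """Return the ACK numbers for which the sender would declare a loss.

    acks is the sequence of ACK numbers seen by the sender. The first ACK
    carrying a given number is the original; repeats are duplicates. A loss
    is declared when `threshold` duplicates have arrived.
    """
    lost = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == threshold:
                lost.append(ack)  # segment starting at byte `ack` presumed lost
        else:
            last_ack = ack
            dup_count = 0
    return lost


# One duplicate (reordering) is ignored; three duplicates signal a loss.
reordered = [1001, 2001, 2001, 4001]
dropped = [1001, 2001, 2001, 2001, 2001, 6001]
print(fast_retransmit_points(reordered))  # []
print(fast_retransmit_points(dropped))    # [2001]
```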
The loss of the last packet in a transmission window of segments causes the retransmission
timer of the source TCP to expire because there are no subsequent segments to generate a
duplicate ACK. The source TCP expects the receiver to transmit ACKs as it successfully
receives new bytes in the data stream. Each time the sender transmits a segment, it starts a
timer and waits for an ACK. This timer supports adaptive retransmission because the timeout
value changes as the sample round trip times (RTTs) of the connection constantly change with
the load placed on the network. If the retransmission timer expires before the data in the
segment is acknowledged, the source TCP assumes that the segment was either lost or
corrupted and retransmits the segment.
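
This paper does not give the formula used to adapt the timer. One widely used approach, assumed here purely for illustration, is the smoothed estimator standardized in RFC 6298, which derives from the Jacobson paper listed in the references: the timeout is the smoothed RTT plus four times the smoothed RTT variation. The class name `RtoEstimator` is hypothetical, and the sketch omits the clock-granularity term and the minimum timeout that RFC 6298 also specifies.

```python
class RtoEstimator:
    """Adaptive retransmission timeout from RTT samples (times in seconds)."""

    ALPHA = 1 / 8  # gain applied to the smoothed RTT
    BETA = 1 / 4   # gain applied to the RTT variation

    def __init__(self):
        self.srtt = None    # smoothed round trip time
        self.rttvar = None  # smoothed RTT variation

    def update(self, sample):
        """Fold in one RTT sample and return the new timeout value."""
        if self.srtt is None:
            # First measurement initializes both estimates.
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            # The variation is updated first, using the previous smoothed RTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - sample)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * sample
        return self.srtt + 4 * self.rttvar


estimator = RtoEstimator()
timeouts = [estimator.update(sample) for sample in (0.100, 0.100, 0.300)]
print([round(t, 4) for t in timeouts])
```

In this run the timeout shrinks while the samples are steady and grows again as soon as a larger sample arrives, which is the adaptive behavior described above.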

Congestion Avoidance
When the source TCP discovers that a packet has been dropped by the network, it sets the
variable ssthresh (slow-start threshold) equal to one-half of the current value of cwnd. The
source reduces its transmission rate by returning to slow-start mode, but this time it
exponentially increases its transmission rate until cwnd is equal to the value of ssthresh. At this
point, the sender increases cwnd linearly (by at most one segment per RTT), allowing it to
slowly increase its transmission rate as it begins to approach the previous cwnd value that
caused packets to be dropped. When the value of cwnd is less than or equal to ssthresh, the
source TCP is in slow-start mode; when cwnd is greater than ssthresh, the source TCP is in
congestion avoidance mode (see Figure 5).
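
The interaction between cwnd and ssthresh can be sketched as a per-round-trip update rule. The sketch counts in whole segments and assumes no further loss after the initial one; the function names are invented for this example, and the floor of two segments on ssthresh is an assumption borrowed from RFC 2581 rather than something stated in this paper.

```python
def grow_cwnd(cwnd, ssthresh):
    """Return cwnd (in segments) after one loss-free round trip."""
    if cwnd < ssthresh:
        return min(cwnd * 2, ssthresh)  # slow-start: exponential, up to ssthresh
    return cwnd + 1                     # congestion avoidance: linear


def after_loss(cwnd_at_loss, rounds):
    """Trace cwnd per round trip after a loss detected at cwnd_at_loss."""
    ssthresh = max(cwnd_at_loss // 2, 2)  # half the window that caused the loss
    cwnd = 1                              # return to slow-start
    trace = [cwnd]
    for _ in range(rounds):
        cwnd = grow_cwnd(cwnd, ssthresh)
        trace.append(cwnd)
    return ssthresh, trace


print(after_loss(32, 8))  # (16, [1, 2, 4, 8, 16, 17, 18, 19, 20])
```

After a loss at cwnd = 32, the window doubles until it reaches ssthresh = 16 and then grows by one segment per round trip.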

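Figure 5: Slow-Start with Congestion Avoidance

[Figure: graph of cwnd against time. cwnd grows from 1 in slow-start until it reaches y, the
point of network congestion. ssthresh is marked at y/2; below ssthresh the curve is labeled
slow-start, and above it the curve is labeled congestion avoidance.]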

Slow-start with congestion avoidance causes TCP to reduce the value of cwnd by half each time
it experiences a packet loss. Consequently, if congestion leading to packet loss continues for a
period of time, the volume of traffic injected into the network and the rate of retransmission by
the source TCP decrease exponentially. This causes the source TCP to back off and allows
routers to empty their congested queues.

Fast Retransmission
As discussed earlier, TCP assumes that a packet has been dropped when it receives duplicate
ACKs. The challenge is that the receipt of duplicate ACKs can also mean that the packet may
simply be out of order. Rather than immediately responding to a duplicate ACK by
retransmitting the lost segment, the source TCP waits until it receives three duplicate ACKs.
Fast retransmission enhances TCP performance in the following ways:
■ Eliminates unnecessary packet retransmission and wasted network capacity if the packet is
simply out of order and not dropped
■ Allows higher channel utilization and connection throughput
■ Allows TCP to not wait for the retransmission timer to expire before resending a potentially
lost segment

Fast Recovery
When the source TCP receives duplicate ACKs, data is still flowing to the destination because
the destination TCP can generate duplicate ACKs only if subsequent segments are received. In
this case, the source TCP does not suddenly reduce the flow of data by returning to slow-start.
Instead, after responding to the receipt of three duplicate ACKs by retransmitting the lost
segment, the source TCP sets cwnd to half its current value and performs congestion avoidance.
This provides better overall throughput for the TCP session (see Figure 6).
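
The difference between the two loss signals can be summarized in one function. This is a simplified sketch of the behavior described in this paper, counted in whole segments: `react_to_loss` is a hypothetical name, the two-segment floor on ssthresh is an assumption, and the temporary window inflation that real implementations apply during fast recovery is left out.

```python
def react_to_loss(cwnd, signal):
    """Return (new_cwnd, ssthresh, mode) after a loss signal. Units: segments.

    signal is "dupacks" (three duplicate ACKs were received) or
    "timeout" (the retransmission timer expired).
    """
    ssthresh = max(cwnd // 2, 2)
    if signal == "dupacks":
        # Fast retransmit plus fast recovery: data is still flowing,
        # so resume from half the old window.
        return ssthresh, ssthresh, "congestion avoidance"
    if signal == "timeout":
        # No ACKs are arriving: start again from one segment.
        return 1, ssthresh, "slow-start"
    raise ValueError(f"unknown loss signal: {signal!r}")


print(react_to_loss(20, "dupacks"))  # (10, 10, 'congestion avoidance')
print(react_to_loss(20, "timeout"))  # (1, 10, 'slow-start')
```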

Figure 6: Fast Recovery after Fast Retransmission

[Figure: graph of cwnd against time. At the point where the sender receives a duplicate ACK,
cwnd drops from y to y/2.]

Fast recovery prevents the TCP session pipe from being completely empty after the fast
retransmission of a single lost segment. This enhances TCP session performance by eliminating
the need to return to slow-start and then slowly fill the TCP session pipe after a single packet
loss from a window of data. However, while fast recovery improves TCP performance when a
single packet is dropped from a window of data, it does not improve performance when
multiple packets are dropped from a window of data.

Example: Throughput for a Typical TCP Session


Figure 7 illustrates the throughput for a sample TCP session over time.

Figure 7: Throughput for a Sample TCP Flow

[Figure: graph of cwnd against time with levels marked at 1, y/2, and y (the point of
network congestion), divided into throughput curves A through G. Curve A: slow-start.
Curves B and C: fast recovery and congestion avoidance. A retransmission timer time-out
interval precedes curve D: slow-start and congestion avoidance. Curves E, F, and G: fast
recovery and congestion avoidance.]

When a TCP connection is first established, the source TCP attempts to avoid immediately
overloading the network by assuming that the network has very little capacity. As shown in
throughput curve A, TCP begins with slow-start but rapidly increases its transmission rate to
quickly determine the current capacity of the network.

Slow-start eventually reaches a transmission rate where the source TCP receives duplicate
ACKs indicating either the loss or misordering of a segment in the middle of a transmission
window of segments. As shown in throughput curve B, the source TCP sets ssthresh equal to
one-half of the current value of cwnd and then performs fast recovery with congestion avoidance
as it approaches the previous value of cwnd that resulted in packet loss.
Fast recovery with congestion avoidance will eventually reach a transmission rate where the
source TCP receives duplicate ACKs indicating either the loss or misordering of a segment in
the middle of a transmission window of segments. As shown in throughput curve C, the
source TCP sets ssthresh equal to one-half of the current value of cwnd and then performs fast
recovery with congestion avoidance.
Now, assume that the last segment in a transmission window of segments is dropped by an
intermediate router due to a buffer overload. This causes the source TCP to wait for the
retransmission timer to expire before it can retransmit the lost segment. As shown in
throughput curve D, the source TCP sets ssthresh equal to one-half of the current value of cwnd,
returns to slow-start, and performs congestion avoidance. Eventually the source TCP reaches a
transmission rate where it receives duplicate ACKs indicating either the loss or misordering of
a segment in the middle of a transmission window of segments.
Throughput curves E, F, and G show TCP in its familiar “sawtooth” operational mode. These
throughput curves show how the source TCP periodically receives duplicate ACKs indicating
either the loss or misordering of a segment in the middle of a transmission window of
segments. TCP responds by executing fast recovery with congestion avoidance as it continues
to probe the network.
Figure 7 presents an idealized version of the throughput for a TCP session because the value of
cwnd that results in packet loss remains constant. In a production network, cwnd is constantly
changing, resulting in a real-world throughput curve that looks more like Figure 8.
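
The idealized sawtooth of Figure 7 can be reproduced with a small simulation. The sketch makes the same simplifying assumption as the figure, namely that loss occurs at a fixed window size, and adds two of its own: every loss is detected through duplicate ACKs (so timeouts such as curve D are not modeled), and ssthresh is floored at two segments. The function name `sawtooth` and the `capacity` parameter are illustrative.

```python
def sawtooth(capacity, rounds):
    """cwnd per round trip for an idealized flow; loss whenever cwnd > capacity."""
    cwnd = 1
    ssthresh = None  # no threshold exists until the first loss
    trace = []
    for _ in range(rounds):
        trace.append(cwnd)
        if cwnd > capacity:
            # Loss detected through duplicate ACKs: halve the window
            # and continue in congestion avoidance.
            ssthresh = max(cwnd // 2, 2)
            cwnd = ssthresh
        elif ssthresh is None or cwnd < ssthresh:
            cwnd *= 2  # slow-start
        else:
            cwnd += 1  # congestion avoidance
    return trace


trace = sawtooth(capacity=20, rounds=40)
for cwnd in trace:
    print("#" * cwnd)  # crude text plot of cwnd over time
```

After the initial slow-start overshoot, the window settles into a repeating climb to just above the loss point followed by a halving. Varying `capacity` from round to round would give a less regular curve, closer to what a production network produces.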

Figure 8: Real-World TCP Throughput

[Figure: graph of cwnd against time, with the point of network congestion marked.]

Recent Enhancements to TCP Congestion Control


Understanding of how TCP maintains fairness among TCP flows that share a common
bottleneck link, maximizes the packet throughput for each session, and avoids congesting the
network has increased dramatically since the mid to late 1980s. Modern TCP implementations
support a number of different algorithms designed to control network congestion and
maintain suitable packet throughput, as follows:
■ TCP Tahoe, first implemented in 4.3 BSD Tahoe TCP in 1988, initiated support for three of
the key algorithms discussed above: slow-start, congestion avoidance, and fast retransmit.
These algorithms were originally proposed by Van Jacobson in 1988.
■ TCP Reno, first implemented in 4.3 BSD Reno TCP in 1990, supports all of the Van Jacobson
enhancements introduced in TCP Tahoe and extends the fast retransmit algorithm to
support fast recovery. By supporting fast recovery, TCP Reno overcomes the throughput
performance limitations of TCP Tahoe that occur when a single packet is lost from a
window of data.
■ TCP Vegas, discussed in a number of research papers in 1994, enhances the congestion
avoidance algorithm of TCP Tahoe and TCP Reno by dynamically increasing and
decreasing the transmission window size according to the observed RTT of the packets that
it has previously sent. If the observed RTT becomes large, TCP Vegas infers that the network
is experiencing congestion and reduces its window size. Likewise, if the observed RTT
becomes small, TCP Vegas infers that the network is not experiencing congestion and increases
its window size. Another modification introduced by TCP Vegas is that during slow-start
the rate of cwnd increase is half that of TCP Tahoe and TCP Reno: cwnd is doubled every
other RTT instead of every RTT.
■ TCP selective acknowledgement (SACK), specified in Request for Comments 2018 (October
1996), enhances the throughput performance of TCP Reno when multiple packets are
dropped from a single window of data. When a TCP receiver observes that arriving packets
are not continuous (the packets are out of order), it responds to the TCP sender with ACKs
that contain the SACK option. This option contains information that allows the TCP sender
to specifically identify which packets have been received by the destination TCP. This
information allows the TCP sender to accurately determine which segments are missing
and retransmit only the missing packets. The TCP SACK option is currently being
implemented in many popular operating systems and will soon be widely deployed.
■ TCP NewReno, specified in RFC 2582 (April 1999), enhances TCP throughput performance
when multiple packets are dropped from a single window of data for TCP Reno
connections that do not support the TCP SACK option. When multiple packets are dropped
from a single window of data, the ACK for the retransmitted packet acknowledges some
but not all of the packets transmitted before the fast retransmit. This is referred to as a
partial ACK. During fast recovery when a TCP sender receives a partial ACK, the TCP
sender concludes that the indicated packet was lost and retransmits that packet. TCP
NewReno overcomes the throughput performance penalty when multiple segments are
dropped from a single window of data.
■ The duplicate-SACK (D-SACK) extension, specified in RFC 2883 (July 2000), allows a
TCP receiver to use a SACK to report the receipt of duplicate segments. This extension
allows the TCP sender to identify the segments received by the TCP receiver, including
duplicate segments. If the TCP sender determines that the destination TCP received two
copies of a segment and that the retransmission of the duplicate segment was unnecessary,
the TCP sender can undo the halving of cwnd. The D-SACK extension overcomes the
throughput performance penalty that results from unnecessarily halving the congestion window.
■ The Limited Transmit extension, specified in RFC 3042 (January 2001), enhances TCP
throughput performance by avoiding unnecessary retransmit timeouts. The source TCP,
instead of transmitting the packet suspected of being dropped, transmits a new segment
after receiving one or two duplicate ACKs. The Limited Transmit mechanism enhances
TCP throughput performance by allowing a TCP session with a small window to recover
from less than a full window of packet loss without a retransmit timeout. Recall that
explicit congestion notification (ECN) is an active queue management mechanism that also
helps to avoid unnecessary retransmit timeouts.
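
To illustrate how SACK information lets a sender retransmit only what is missing, the sketch below computes the gaps between the cumulative ACK and the reported SACK blocks. It follows the RFC 2018 convention that each block is given by its left edge (first byte received) and right edge (the byte just past the end of the block). The function name `missing_ranges` is hypothetical, and a real sender keeps considerably more state than this.

```python
def missing_ranges(cumulative_ack, sack_blocks):
    """Return the byte ranges the sender still needs to retransmit.

    cumulative_ack: next byte the receiver expects (everything below it
    has arrived). sack_blocks: (left, right) pairs, where left is the first
    byte of a received block and right is the byte just past its end.
    Returns inclusive (first_byte, last_byte) ranges for each hole.
    """
    holes = []
    expected = cumulative_ack
    for left, right in sorted(sack_blocks):
        if left > expected:
            holes.append((expected, left - 1))  # gap before this block
        expected = max(expected, right)
    return holes


# The receiver holds bytes 1-2000, 3001-5000, and 6001-7000.
print(missing_ranges(2001, [(3001, 5001), (6001, 7001)]))
# [(2001, 3000), (5001, 6000)]
```

Without SACK, the sender in this example would learn only that byte 2001 is missing. Data sent beyond the last SACK block is not reported as a hole, because the receiver's report says nothing about it yet.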
The list of proposed enhancements continues to grow as research and standards communities
expand their knowledge about TCP performance and develop new extensions as new
problems arise. However, the following challenges are associated with these enhancements:
■ The enhancements require a considerable amount of time to achieve widespread
deployment. This means that you can be assured that your network will be required to
carry flows where TCP senders and receivers execute divergent TCPs. These TCPs are
implemented by different vendors, execute on different operating systems, support
different sets of congestion control algorithms, vary in their conformance to Internet
Engineering Task Force (IETF) standards, provide different levels of performance when
running on hosts than when running on servers, and can interact in unpredictable ways
when executing congestion control algorithms.
■ Most of these enhancements are designed to streamline long-term TCP sessions by
avoiding unnecessary retransmit timeouts and improving performance when experiencing
reordered, delayed, or corrupted packets. They are not specifically designed to enhance the
performance of most of the traffic found on large IP networks—short-term Web-based
traffic flows.
Since host response to active queue memory management techniques (RED, WRED, and ECN)
determines how well routers can manage congestion in the core of your network, you need to
be aware of the many issues that determine if the flows traversing your network are TCP
compatible, non-TCP compatible, or simply nonresponsive.

Summary
This paper covered several TCP congestion control mechanisms that allow host systems to
respond to implicit or explicit congestion indications. The ability of host systems to control
their transmission rate allows you to manage the average queuing delay in core routers and
support transient fluctuations in queue size. These standard TCP congestion control
mechanisms include slow-start, congestion avoidance, fast retransmit, and fast recovery. This
paper also described several recent enhancements to TCP, including TCP Reno, TCP Vegas,
TCP SACK, TCP NewReno, D-SACK, and Limited Transmit.

Appendix: References

RFCs
RFC 793, Postel, J. “Transmission Control Protocol - DARPA Internet Program Protocol
Specification.” DARPA, September 1981.
RFC 813, Clark, D. “Window and Acknowledgment Strategy in TCP.” July 1982.
RFC 1072, Jacobson, V. and R. Braden. “TCP Extensions for Long-Delay Paths.” October 1988.
RFC 1191, Mogul, J. and S. Deering. “Path MTU Discovery.” November 1990.
RFC 1323, Jacobson, V., Braden, R., and D. Borman. “TCP Extensions for High Performance.”
May 1992.
RFC 2018, Mathis, M., Mahdavi, J., Floyd, S. and A. Romanow. “TCP Selective
Acknowledgement Options.” October 1996.
RFC 2414, Allman, M., Floyd, S. and C. Partridge. “Increasing TCP's Initial Window Size.”
September 1998.
RFC 2581, Allman, M., Paxson, V. and W. Stevens. “TCP Congestion Control.” April 1999.
RFC 2582, Floyd, S. and T. Henderson. “The NewReno Modification to TCP’s Fast Recovery
Algorithm.” April 1999.
RFC 2883, Floyd, S., Mahdavi, J., Mathis, M., and M. Podolsky. “An Extension to the Selective
Acknowledgement (SACK) Option for TCP.” July 2000.
RFC 3042, Allman, M., Balakrishnan, H., and S. Floyd. “Enhancing TCP's Loss Recovery Using
Limited Transmit.” January 2001.

Textbooks
Comer, Douglas. Internetworking with TCP/IP Vol. II: ANSI C Version: Design, Implementation, and
Internals. Prentice Hall, June 1998. (ISBN 0139738436)
Comer, Douglas and Stevens, David L. Internetworking with TCP/IP Vol. I: Principles, Protocols,
and Architecture. Prentice Hall, February 2000. (ISBN 0130183806)
Huston, Geoff. Internet Performance Survival Guide: QoS Strategies for Multiservice Network. John
Wiley & Sons, February 2000. (ISBN 0471378089)
Partridge, Craig. Gigabit Networking. Addison-Wesley Pub. Co., January 1994. (ISBN
0201563339)
Stevens, W. Richard. TCP/IP Illustrated, Volume 1: The Protocols. The Addison-Wesley
Professional Computing Series; Addison-Wesley Pub. Co., January 1994. (ISBN 0201633469)
Stevens, W. Richard and Wright, Gary R. (Contributor). TCP/IP Illustrated, Volume 2: The
Implementation. The Addison-Wesley Professional Computing Series, Addison-Wesley Pub.
Co., January 1995. (ISBN 020163354X)

Technical Papers
Allman, M. and A. Faulk. “On the Effective Evaluation of TCP.” ACM Computer
Communication Review. October 1999.
Fall, K. and S. Floyd. “Simulation-Based Comparisons of Tahoe, Reno and SACK TCP.”
Computer Communication Review. July 1996.
Floyd, S. “A Report on Some Recent Developments in TCP Congestion Control.” June 2000.
Hoe, J. “Improving the Start-Up Behavior of a Congestion Control Scheme for TCP.” ACM
SIGCOMM, August 1996.
Jacobson, V. “Congestion Avoidance and Control.” Computer Communication Review, vol. 18,
no. 4, pp. 314-329, August 1988.

Copyright © 2002, Juniper Networks, Inc. All rights reserved. Juniper Networks is registered in the U.S. Patent and Trademark Office and in other countries
as a trademark of Juniper Networks, Inc. G10, Internet Processor, Internet Processor II, JUNOS, JUNOScript, M5, M10, M20, M40, M40e and M160 are
trademarks of Juniper Networks, Inc. All other trademarks, service marks, registered trademarks, or registered service marks are the property of their respective
owners. All specifications are subject to change without notice.
Juniper Networks assumes no responsibility for any inaccuracies in this document. Juniper Networks reserves the right to change, modify, transfer, or otherwise
revise this publication without notice.
