TCP -- The Internet Connection


Supporting all connection-oriented services over the Internet, TCP is one of the most widely used protocols today

From its beginnings as part of the early ARPANET research efforts within the U.S. Government's Defense Advanced Research Projects Agency (DARPA), the Transmission Control Protocol (TCP) has become one of the most prevalent protocols in use today. Virtually all Internet-based, connection-oriented services make use of it. In general terms, TCP provides full-duplex, connection-oriented transfer of byte streams between processes on machines interconnected over an internet. It also provides flow control, which manages the transfer of information to machines and networks that cannot keep up with the offered traffic load.

The protocol provides a connection-oriented set of services, with a corresponding state machine that describes the behavior of well-behaved TCP implementations. The general application programming interface used for TCP communications is the socket interface. Originally derived from the BSD Unix implementation, this interface can be found in many real-time operating systems. In addition, extensions to the socket interface have been developed to support the peculiarities of the Windows programming environment. The Windows interface has come to be known as Winsock.

RFC 793 identifies TCP as addressing some of the core functional requirements of internetworked communications. These include basic data transfer, reliability, flow control, multiplexing, connections, and precedence and security. With these general service characteristics in mind, let's take a look at the internal workings of the protocol.

The Protocol

A convenient starting point for reviewing TCP is the TCP header. Carried inside IP datagrams, which route messages to the appropriate hosts, the TCP header defines a format for controlling connections between processes in peer-to-peer conversations. Fields are included to support application addressing, connection control, information sequencing and the selection of value-added service options.

Figure 1 - The TCP Protocol Header (From RFC793)


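Each of the fields shown in Figure 1 has a role to play in controlling information flow in TCP conversations. The Source Port and Destination Port (16 bits each) identify the sending and receiving application processes. The Sequence Number (32 bits) gives the position of the first data byte of the segment in the byte stream, and the Acknowledgment Number (32 bits) gives the next sequence number that the sender of the segment expects to receive. The Data Offset (4 bits) is the length of the header in 32-bit words, and is followed by 6 reserved bits. The control bits (URG, ACK, PSH, RST, SYN and FIN) drive connection management and mark urgent or pushed data. The Window (16 bits) advertises how many bytes the sender of the segment is willing to accept, the Checksum (16 bits) protects the header and data, and the Urgent Pointer (16 bits) locates the end of urgent data when URG is set. Options, padded to a 32-bit boundary, carry optional features such as the maximum segment size. Through this information, all TCP operations (connection establishment, connection termination and data transfer) are performed.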
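To make the header layout concrete, the following is a minimal Python sketch that unpacks the fixed 20-byte portion of a TCP header as laid out in RFC 793. The names TcpHeader, FLAG_BITS and parse_tcp_header are hypothetical, chosen for this illustration only, and options are ignored.

```python
import struct
from typing import NamedTuple

# Control bits as defined in RFC 793 (low-order 6 bits of the flags field).
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


class TcpHeader(NamedTuple):      # hypothetical container for this sketch
    src_port: int
    dst_port: int
    seq: int
    ack: int
    data_offset: int              # header length in 32-bit words
    flags: frozenset              # names of the control bits that are set
    window: int
    checksum: int
    urgent_ptr: int


def parse_tcp_header(segment: bytes) -> TcpHeader:
    """Unpack the fixed part of a TCP header; options are not decoded."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    # "!" selects network (big-endian) byte order: H = 16 bits, I = 32 bits.
    (src, dst, seq, ack, offset_and_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    data_offset = offset_and_flags >> 12          # top 4 bits
    flags = frozenset(name for name, bit in FLAG_BITS.items()
                      if offset_and_flags & bit)
    return TcpHeader(src, dst, seq, ack, data_offset,
                     flags, window, checksum, urgent)
```

For example, a segment packed with the SYN bit set and a data offset of 5 parses back to a header whose flags contain only "SYN". Checksum verification is deliberately left out of this sketch.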

Connection Management

Prior to transferring information through TCP, the participating processes must first establish a logical connection. The state machine employed to describe the connection establishment and termination process is shown in Figure 2.

Figure 2 - TCP Connection State Machine (Also from RFC 793)


Many of the indications shown in the state transitions of the diagram refer directly to the control bits (SYN, FIN...) just covered in the discussion of the TCP packet format. There are two primary transitions that a user can make from the Closed state. A passive Open occurs when the local user requests that the TCP process wait for connections on a specific port address. An active Open occurs when the local user explicitly requests the establishment of a connection to a process residing at a port address of a defined machine. An active Open results in a packet with the SYN control bit set being sent to the peer process. If you follow through this state machine, you can see that the typical PDU exchange in connection establishment is as follows:

Connect Request (SYN) -------------------->
<--------------------------- (SYN, ACK) Connect Acknowledgment
Acknowledgment(ACK) ------------------>

Note that to accept a connect request packet (SYN) from another station, the recipient must be in the "Listen" state, which is entered through a passive open command received from the calling application. Once both participants in the TCP connection have received acknowledgments of their SYN packets, they enter the "Estab" (established) state, in which information transfer can occur.
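The exchange above can be traced with a small table-driven sketch. It models only the subset of the RFC 793 state machine needed for a normal three-way handshake (simultaneous opens, resets and the closing states are left out), and the names TRANSITIONS, step and handshake are hypothetical.

```python
# (current state, event) -> (next state, segment sent in response, if any)
TRANSITIONS = {
    ("CLOSED", "passive_open"): ("LISTEN", None),
    ("CLOSED", "active_open"): ("SYN_SENT", "SYN"),
    ("LISTEN", "recv_SYN"): ("SYN_RCVD", "SYN,ACK"),
    ("SYN_SENT", "recv_SYN,ACK"): ("ESTAB", "ACK"),
    ("SYN_RCVD", "recv_ACK"): ("ESTAB", None),
}


def step(state, event):
    """Apply one event; reject anything this simplified table does not cover."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state} on {event}") from None


def handshake():
    """Drive a client and a server from Closed to Estab, recording segments."""
    client = server = "CLOSED"
    trace = []

    server, _ = step(server, "passive_open")          # server waits in Listen
    client, segment = step(client, "active_open")     # client sends SYN
    trace.append(("client", segment))
    server, segment = step(server, "recv_" + segment) # server answers SYN,ACK
    trace.append(("server", segment))
    client, segment = step(client, "recv_" + segment) # client answers ACK
    trace.append(("client", segment))
    server, _ = step(server, "recv_" + segment)       # server sees the ACK
    return client, server, trace
```

Calling handshake() leaves both sides in the established state after exactly three segments. A SYN delivered to a station that never performed a passive open raises an error in this sketch, which mirrors the requirement that the recipient be listening.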

Once in the established state, either station can send information. As information is sent by each station, the amount of unacknowledged traffic is tracked. Once the amount of unacknowledged information reaches the window size advertised by the peer, no more information is sent until pending traffic is acknowledged. As information is transmitted, a retransmission timer is set. Its initial value should be calculated from round-trip time measurements. As retransmissions are sent, the timer is typically increased exponentially to avoid piling traffic onto a system that may not be responding because of congestion. Retransmission timing in TCP is particularly challenging, given that TCP connections can range from logical connections between two processes on the same computer to connections between machines literally halfway around the world, spanning several subnets.
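As an illustration of the timer behavior just described, here is a minimal sketch based on the example procedure in RFC 793: a smoothed round-trip time (SRTT) is updated from each measurement, and the retransmission timeout is BETA times SRTT, clamped between a lower and an upper bound. The doubling on each timeout is the exponential increase mentioned above. The class name and default values are assumptions for this example; current TCP stacks generally use a more elaborate estimator.

```python
class RetransmitTimer:
    """Sketch of an RFC 793 style retransmission timeout with backoff."""

    def __init__(self, alpha=0.9, beta=2.0, lbound=1.0, ubound=60.0):
        self.alpha = alpha      # smoothing factor (RFC 793 suggests 0.8 to 0.9)
        self.beta = beta        # delay variance factor (1.3 to 2.0)
        self.lbound = lbound    # minimum timeout, in seconds
        self.ubound = ubound    # maximum timeout, in seconds
        self.srtt = None        # no measurement yet
        self.backoff = 1        # doubled on every retransmission timeout

    def on_rtt_sample(self, rtt):
        """Fold a new round-trip measurement into SRTT and reset the backoff."""
        if self.srtt is None:
            self.srtt = rtt
        else:
            self.srtt = self.alpha * self.srtt + (1 - self.alpha) * rtt
        self.backoff = 1

    def on_timeout(self):
        """The timer expired without an acknowledgment: back off."""
        self.backoff *= 2

    def rto(self):
        """Current retransmission timeout in seconds."""
        if self.srtt is None:
            base = self.lbound
        else:
            base = max(self.lbound, self.beta * self.srtt)
        return min(self.ubound, base * self.backoff)
```

With a measured round trip of one second and the defaults above, the timeout starts at two seconds, doubles to four and then eight on successive timeouts, and never exceeds the upper bound.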

The selection and management of window sizes presents an important engineering challenge. A receiver that advertises a particularly small receive window forces its peer to send many more packets than would be necessary with a larger window. Likewise, buffering algorithms that allow information to accumulate for a brief period before transmission can produce larger packets that place a much smaller load on the network. The choice of mechanisms within each end system is up to the designers of the TCP software; the RFCs merely provide guidance in these areas.
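Applications can usually influence both choices through socket options. The sketch below uses Python's standard socket module: SO_RCVBUF requests a receive buffer size, which bounds the window the stack can advertise, and TCP_NODELAY turns off Nagle's algorithm, one widely deployed example of the send-side buffering described above. The function name configure is hypothetical, and the operating system is free to round or cap the requested buffer size.

```python
import socket


def configure(sock, recv_buffer_bytes, batch_small_writes):
    """Request a receive buffer size and choose whether small writes are batched.

    Returns the buffer size the OS actually reports and whether
    TCP_NODELAY ended up enabled.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, recv_buffer_bytes)
    # TCP_NODELAY = 1 disables batching (Nagle's algorithm); 0 leaves it on.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY,
                    0 if batch_small_writes else 1)
    actual_size = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    nodelay = bool(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY))
    return actual_size, nodelay
```

A bulk-transfer application would typically leave batching on and ask for a large buffer, while an interactive application that sends small, latency-sensitive messages might disable batching. The reported buffer size should be checked rather than assumed, since it may differ from the request.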

No special processing is provided to quickly determine that a peer TCP service has become unreachable. The base protocol supports no keep-alive mechanism, so if an application is not attempting to transmit information over the connection, it can take a long time to determine that the connection is no longer operating. While this can be a challenge in developing applications, it has the important advantage of limiting the amount of extraneous traffic sent on a network. Applications requiring fast detection of a peer becoming unavailable will typically incorporate some form of periodic keep-alive message. In these applications, a failure to receive the keep-alive from the peer indicates that it is no longer reachable.
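One way an application might track such keep-alive messages is sketched below. The class PeerMonitor and its parameters are hypothetical; the sketch assumes the peer sends a keep-alive every interval seconds and treats the peer as unreachable once a configurable number of intervals pass in silence. The clock is injectable so the logic can be exercised without waiting.

```python
import time


class PeerMonitor:
    """Declare a peer unreachable after several missed keep-alive intervals."""

    def __init__(self, interval, missed_limit=3, clock=time.monotonic):
        self.timeout = interval * missed_limit  # silence tolerated, in seconds
        self.clock = clock
        self.last_heard = clock()

    def on_keepalive(self):
        """Call whenever a keep-alive (or any data) arrives from the peer."""
        self.last_heard = self.clock()

    def peer_alive(self):
        """True while the peer has been heard from within the timeout."""
        return self.clock() - self.last_heard < self.timeout
```

The application would call on_keepalive() from its receive path and poll peer_alive() from a timer. Choosing the interval and the missed-message limit trades detection speed against the extra traffic that the base protocol avoids.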

Unlike the unbalanced connection establishment phase (where one station must be passively listening), termination of the connection can be initiated asynchronously by either station. Either station can request that the connection be ended by transmitting a packet with the FIN control bit set. When both stations are operating correctly, the exchange of traffic in closing a TCP connection is as follows:

(FIN) ---------------------------------->
<------------------------------ (ACK)
(Notify User)
<-------------------------------(FIN)
(ACK) --------------------------------->

Note that this procedure assumes that both applications are participating in the connection closure process. Situations in which communications between the stations have been discontinued require the intervention of upper layer services. In these cases, application programs time out the connection termination process to prevent control blocks (and the connections) from being orphaned and consuming system resources.
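The closing exchange can be observed through the socket interface on a single machine. In the sketch below, which uses Python's standard socket and threading modules, the client finishes sending and calls shutdown(SHUT_WR), which causes its TCP to send a FIN. The server sees this as an empty read (the "Notify User" step), sends a final reply, and then closes its own side. Every socket carries a five-second timeout so that neither side waits forever on a silent peer. The function names and the timeout value are assumptions made for this example.

```python
import socket
import threading


def serve_once(listener, log):
    """Accept one connection, read until the peer's FIN, reply, then close."""
    conn, _peer = listener.accept()
    with conn:
        conn.settimeout(5)            # do not wait forever on a dead peer
        chunks = []
        while True:
            data = conn.recv(1024)
            if not data:              # empty read: the peer has sent its FIN
                break
            chunks.append(data)
        log.append(b"".join(chunks))
        conn.sendall(b"bye")          # our direction of the stream is still open
    # leaving the with-block closes conn, which sends this side's FIN


def close_demo():
    """Run client and server over loopback; return (server saw, client saw)."""
    log = []
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))    # port 0: let the OS pick a free port
        listener.listen(1)
        listener.settimeout(5)
        port = listener.getsockname()[1]
        server = threading.Thread(target=serve_once, args=(listener, log))
        server.start()
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            client.sendall(b"hello")
            client.shutdown(socket.SHUT_WR)    # send FIN but keep reading
            reply = b""
            while True:
                data = client.recv(1024)
                if not data:                   # server's FIN has arrived
                    break
                reply += data
        server.join(timeout=5)
    return log[0], reply
```

If either side stops responding, the timeouts raise an exception instead of blocking indefinitely, which is the application-level safeguard described above.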

The uses for TCP are numerous. World Wide Web information transfers, file transfers (FTP), remote terminal sessions and e-mail (through SMTP) are some of the more widely used applications that make use of TCP connections. Because TCP controls the establishment of connections, the sequencing of information and the retransmission of information as appropriate, Internet application layer services are free to focus on sequencing and managing their own application services.