
Week 5: TCP Client-Server and UDP Sockets

Module 4, 5 & Chapter 5, 8

Objectives
Understand the normal startup and termination of TCP connections and how to examine established TCP connections.
Understand the concepts and techniques of signal handling in Unix.
Understand how to avoid generating zombie child processes at the server.
Understand the various abnormal terminations of TCP connections and the techniques that make TCP applications more robust.

System call sequences


TCP client: socket() -> connect() -> write() -> read()
TCP server: socket() -> bind() (well-known port) -> listen() -> accept() -> read() -> process -> write()

TCP Echo Example of Server and Client


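[Figure: data flow in the echo example. The TCP client reads a line with fgets and sends it to the TCP server with writen; the server reads it with readline and sends it back with writen; the client reads the echoed line with readline and prints it with fputs.]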

TCP Echo server


Create a socket and bind the server's well-known port.
Wait for a client connection to complete.
Concurrent server.
str_echo: read a line and echo the line.

TCP Echo client


Create a socket and fill in an Internet socket address structure.
Connect to the server.
str_cli: read a line and write it to the server, then read the echoed line and write it to standard output.

Normal startup
When the server starts, it calls socket, bind, listen and accept, blocking in the call to accept. When the client starts on the same host, it calls socket and connect, the latter causing TCP's three-way handshake to take place. When the three-way handshake completes, connect returns in the client and accept returns in the server. The connection is established.
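The call sequence for startup might look like the following sketch. The function names `tcp_listen_loopback` and `tcp_connect_loopback` are made up for this illustration, and both ends are restricted to the loopback address 127.0.0.1 on the assumption that client and server run on the same host.

```c
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fill in a loopback (127.0.0.1) Internet socket address for the given port. */
static void loopback_addr(struct sockaddr_in *addr, uint16_t port)
{
    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port = htons(port);
}

/* Server side: socket(), bind(), listen(). Returns the listening
 * descriptor or -1. Port 0 lets the kernel pick an ephemeral port. */
int tcp_listen_loopback(uint16_t port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    loopback_addr(&addr, port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 5) < 0) {
        close(fd);
        return -1;
    }
    return fd;   /* the caller then blocks in accept(fd, ...) */
}

/* Client side: socket(), connect(). connect() returns once the
 * three-way handshake has completed (or failed). */
int tcp_connect_loopback(uint16_t port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    loopback_addr(&addr, port);
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```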

Four-way handshake for TCP termination

client -> server: FIN
server -> client: ACK
server -> client: FIN
client -> server: ACK

Normal termination
When an EOF (^D) is typed, fgets returns NULL and str_cli returns to main, which then terminates. As part of process termination on the client side, the kernel closes all open descriptors, so the client socket is closed.
This sends a FIN to the server, to which the server TCP responds with an ACK. At this point the server socket is in the CLOSE_WAIT state and the client socket is in the FIN_WAIT_2 state.

When the server TCP receives the FIN, the server child is blocked in readline, which then returns 0. This causes str_echo to return to the server child's main. On the server side, the child then closes the connected socket.
This causes a FIN to be sent from the server to the client and an ACK to be sent from the client to the server. The client socket enters the TIME_WAIT state.

Normal termination
Another part of process termination is that the SIGCHLD signal is sent to the parent when the server child terminates.
We don't catch the signal in this case, and the default action of this signal is to be ignored. The child enters the zombie state.

Cleaning up zombie processes requires dealing with Unix signals.

Signal handling
A signal is a notification to a process that an event has occurred. The process doesn't know ahead of time exactly when a signal will occur. Signals can be sent
by one process to another process
by the kernel to a process

Every signal has a disposition (the action associated with the signal). We set the disposition of a signal by calling the sigaction function. There are three choices for the disposition:
providing a function (a signal handler) that is called whenever a specific signal occurs
ignoring a signal by setting its disposition to SIG_IGN
setting the default disposition for a signal by setting its disposition to SIG_DFL

signal function
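#include <signal.h>

typedef void Sigfunc(int);    /* type of a signal handler function */

Sigfunc *signal(int signo, Sigfunc *func)
{
    struct sigaction act, oact;

    act.sa_handler = func;
    sigemptyset(&act.sa_mask);    /* block no additional signals in the handler */
    act.sa_flags = 0;
    if (signo == SIGALRM) {
#ifdef SA_INTERRUPT
        act.sa_flags |= SA_INTERRUPT;    /* do not restart interrupted system calls */
#endif
    } else {
#ifdef SA_RESTART
        act.sa_flags |= SA_RESTART;      /* restart interrupted system calls */
#endif
    }
    if (sigaction(signo, &act, &oact) < 0)
        return SIG_ERR;
    return oact.sa_handler;              /* previous disposition */
}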

Summary of signal handling


Once a signal handler is installed, it remains installed.
While a signal handler is executing, the signal being delivered is blocked.
If a signal is generated one or more times while it is blocked, it is normally delivered only one time after the signal is unblocked.
It is possible to selectively block and unblock a set of signals using the sigprocmask function. This lets us protect a critical region of code by preventing certain signals from being caught while the region of code is executing.

Zombie process
A zombie process is a process that has terminated but whose parent has not yet waited for it.
For instance, when a server child terminates and the parent has not called wait, the child is in the zombie state. The kernel keeps some information about the zombie, including its process ID, termination status, and resource utilisation, so that the parent can fetch it later. If a process terminates and that process has children in the zombie state, the parent process ID of all the zombie children is set to 1 (the init process), which waits for them and so cleans them up.

A zombie process takes up space in the kernel. Whenever we fork children, we must wait for them to prevent them from becoming zombies.
We can establish a signal handler to catch SIGCHLD and call wait within the handler.

wait and waitpid


To avoid generating zombie child processes in a concurrent server, the parent process must catch the SIGCHLD signal and handle it by calling wait() or waitpid().
If the process calling wait has no terminated children but has one or more children that are still executing, wait blocks until the first of the existing children terminates.
waitpid gives us more control over which process to wait for and whether or not to block.
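As a rough illustration of waiting for one specific child and fetching its termination status, consider the sketch below. The function name `run_child_and_wait` is hypothetical, and the child does nothing except exit with the given code.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits immediately with `code`, then block in
 * waitpid() for that particular child. Returns the child's exit
 * status, or -1 on error or abnormal termination. */
int run_child_and_wait(int code)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);               /* child: terminate at once */

    /* parent: options == 0 means block until this child terminates */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;             /* real error, not just a signal */
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the parent has waited for the child, no zombie is left behind once the function returns.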

Difference between wait and waitpid


To avoid generating zombie child processes in a concurrent server, calling wait() once in the signal handler is not sufficient.
If many children terminate at about the same time, all the SIGCHLD signals may be generated before the signal handler is executed, and the handler is then executed only one time because Unix signals are normally not queued. Sometimes, probably depending on the timing of the FINs arriving at the server host, the signal handler is executed three or four times, which can still be fewer times than there are terminated children, leaving some zombie processes.

waitpid() is used to handle the situation in which many child processes raise SIGCHLD at about the same time.
For example, a client may establish multiple TCP connections to the concurrent server, each of which is handled by a child process of the server, and then close them all at once. The solution is to call waitpid() in a while loop, with the WNOHANG option so that the loop does not block while some children are still running.

Abnormal termination
To make TCP network applications more robust, we must consider various situations of abnormal termination of a TCP connection:
termination of the server process
the SIGPIPE signal problem
crashing of the server host
crashing and rebooting of the server host
shutdown of the server host

Elementary UDP Sockets


UDP is another transport layer protocol.
It is an unreliable, connectionless datagram protocol. It is much less complicated than TCP.

There is no actual connection between the client and server processes. Each datagram is a self-contained unit of data carrying the socket address of its destination.
The datagram received by the server process also carries the IP address and port number of the client's UDP socket. After receiving the first datagram from the client, the server can use this pair of IP address and port number to send its own datagrams back to the client. Unlike a TCP server, which gets the client's IP address and port number from accept(), a UDP server learns them from each received datagram, so it can start communicating after receiving the first datagram.

TCP & UDP Client/Server


TCP client: socket() -> connect() -> write() -> read() -> close
TCP server: socket() -> bind() (well-known port) -> listen() -> accept() -> read() -> process -> write()
UDP client: socket() -> sendto -> recvfrom -> close
UDP server: socket() -> bind() (well-known port) -> recvfrom (blocks until a datagram is received from a client) -> process -> sendto

recvfrom() and sendto()


recvfrom() and sendto() are the two functions used to receive and send datagrams through UDP sockets.
ssize_t recvfrom(int sockfd, void *buff, size_t nbytes, int flags, struct sockaddr *from, socklen_t *addrlen);
ssize_t sendto(int sockfd, const void *buff, size_t nbytes, int flags, const struct sockaddr *to, socklen_t addrlen);

The client does not establish a connection with the server.


Instead, the client just sends a datagram to the server using the sendto function, which requires the address of the destination (the server) as a parameter.

The server does not accept a connection from a client.


Instead, the server just calls the recvfrom function, which waits until data arrives from some client. recvfrom() returns the protocol address of the client, along with the datagram, so the server can send a response to the correct client.
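The server side of this exchange can be sketched as below. The names `udp_bind_loopback` and `udp_echo_once` are made up for the example, the socket is bound to the loopback address only, and the 1500-byte buffer is an arbitrary choice.

```c
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a UDP socket bound to 127.0.0.1:port. Port 0 lets the kernel
 * choose an ephemeral port. Returns the descriptor or -1. */
int udp_bind_loopback(uint16_t port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Receive one datagram and send it back to whoever sent it.
 * Returns the number of bytes echoed, or -1 on error. */
ssize_t udp_echo_once(int sockfd)
{
    char buf[1500];
    struct sockaddr_storage from;
    socklen_t fromlen = sizeof from;   /* in: buffer size, out: address size */
    ssize_t n;

    n = recvfrom(sockfd, buf, sizeof buf, 0,
                 (struct sockaddr *)&from, &fromlen);
    if (n < 0)
        return -1;
    /* Reply to the protocol address that recvfrom() filled in. */
    return sendto(sockfd, buf, (size_t)n, 0,
                  (struct sockaddr *)&from, fromlen);
}
```

A server would typically call `udp_echo_once` in an endless loop; there is no accept() and no per-client child process.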

Arguments in recvfrom() and sendto()


sockfd, buff and nbytes are identical to the first three arguments for read and write: descriptor, pointer to the buffer to read into or write from, and number of bytes to read or write.
to in sendto() is a socket address structure containing the IP address and port number of where the data is to be sent.
recvfrom() fills in the socket address structure pointed to by from with the protocol address of who sent the datagram.
The final argument in sendto() is an integer, while it is a pointer to an integer in recvfrom(). (Why?)
A NULL for the fifth argument in recvfrom() means that the receiving process is not interested in the socket address of the source of the datagram.

Client/Server with two clients


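[Figure: TCP client/server with two clients. The listening server forks one server child per connection, so each client has its own TCP connection to its own server child.]
[Figure: UDP client/server with two clients. Both clients send datagrams to a single server, where they are queued in the server's one socket receive buffer.]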

Problems with UDP echo


Lost datagrams
If a client datagram is lost, the client will block forever in its call to recvfrom(), waiting for a server reply that will never arrive. If the client datagram arrives at the server but the server's reply is lost, the client will again block forever in its call to recvfrom().

Verifying received response


A UDP client can receive datagrams from any source. As written, the echo client does not check whether a received datagram actually came from the server it sent to.

Server not running


The client blocks forever in its call to recvfrom(), waiting for a server reply that will never appear.

Improvements
To solve the problem of verifying the received response, we could allocate another socket address structure by calling malloc() and have recvfrom() fill it in with the sender's address.

We first compare the length returned by recvfrom() in the value-result argument, and then compare the socket address structures themselves using memcmp().

This solution can cause another problem: the client will ignore a genuine response from the server if
the server has multiple interfaces and its kernel chooses a different interface for the outgoing datagrams it sends back to the client, or
the client sends its datagrams to a non-primary IP address of the interface.

A possible solution to this problem is to have the server create multiple sockets and explicitly bind each of its IP addresses to a separate socket. The server then has to use select() or poll() to wait for datagrams from all the sockets simultaneously.

Asynchronous errors and connected UDP socket


A UDP socket doesn't need to be connected to any server, but using a connected UDP socket has advantages:
the client will be able to detect asynchronous errors. For instance, an asynchronous error occurs if the client in the UDP echo example sends a datagram to a non-existent server.

The solution to the problem of the server not running is to use a connected UDP socket.
The client will receive the asynchronous error from its UDP.
Additional benefits: improved performance, and the client will receive datagrams only from the connected server.
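Creating a connected UDP socket could look like the following sketch, where `udp_connect` is a hypothetical name. After connect(), the client uses send()/recv() (or write()/read()) without naming the destination each time.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a UDP socket and connect it to `server`. No packets are
 * exchanged: the kernel only records the peer address.
 * Returns the descriptor or -1. */
int udp_connect(const struct sockaddr_in *server)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    if (connect(fd, (const struct sockaddr *)server, sizeof *server) < 0) {
        close(fd);
        return -1;
    }
    /* From here on, an asynchronous error (for example ECONNREFUSED
     * when nothing listens on the port) can be reported by a later
     * send() or recv() on this socket. */
    return fd;
}
```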

More problems with UDP


UDP does not have flow control.
The UDP of the receiving host is simply a datagram demultiplexer that uses the port number to dispatch datagrams to the UDP sockets. Each socket has a receive buffer of limited size. When the buffer is full, arriving datagrams are discarded.

UDP does not have congestion control either.
If datagrams start being discarded at the receiving host, the sending host does not notice and keeps on sending.
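The size of a socket's receive buffer can be inspected with the SO_RCVBUF socket option. The helper name `socket_rcvbuf_bytes` below is an assumption for this sketch; the reported value depends on the system.

```c
#include <sys/socket.h>

/* Return the size in bytes of the socket's receive buffer as
 * reported by the kernel, or -1 on error. */
int socket_rcvbuf_bytes(int sockfd)
{
    int size = 0;
    socklen_t len = sizeof size;

    if (getsockopt(sockfd, SOL_SOCKET, SO_RCVBUF, &size, &len) < 0)
        return -1;
    return size;
}
```

Datagrams that arrive while this buffer is full are dropped without any notification to the sender.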

Another advantage of connected UDP sockets


[Figure: host A has two interfaces, .35 on subnet 206.62.226.32/27 and .66 on subnet 206.62.226.64/27; the other hosts shown are B, C and Y.]

A connected UDP socket can also be used to determine the outgoing interface that will be used to reach a particular destination.
When A connects to B, interface .35 will be the outgoing interface.
When A connects to X, interface .66 will be the outgoing interface.

What really happens is:

When connect() is called, the kernel chooses the local IP address by searching the routing table for the destination IP address and then using the primary IP address of the resulting interface.
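The kernel's choice can be read back with getsockname() after connect(). In the sketch below, `local_addr_for` is a hypothetical helper and only IPv4 is handled.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Ask the kernel which local IP address it would use to reach `dest`.
 * On success returns 0 and stores the chosen address in `local`;
 * returns -1 on error. Nothing is sent on the network. */
int local_addr_for(const struct sockaddr_in *dest, struct sockaddr_in *local)
{
    socklen_t len = sizeof *local;
    int rc;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    /* connect() on a UDP socket makes the kernel consult its routing
     * table and fix the local address for this socket. */
    rc = connect(fd, (const struct sockaddr *)dest, sizeof *dest);
    if (rc == 0)
        rc = getsockname(fd, (struct sockaddr *)local, &len);
    close(fd);
    return rc;
}
```

For a destination on the loopback network, for instance, the address reported is expected to be the loopback address itself.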
