Reference: Internetworking With TCP/IP, Volume III: Client-Server Programming And Applications, Windows Socket Version By Douglas E. Comer and David L. Stevens.
Learning Outcomes
Knowledge
Explain the principles of network programming
Explain the client-server model
Get familiar with WinSock APIs
Skills
Design and develop the client side of client-server network applications using socket programming
Outline
Client-Server Model
Concept of Socket
Address Structures
Byte Order
Client-server paradigm uses the direction of initiation to categorize whether a program is a client or server.
An application that waits for incoming communication requests from clients is called a server.
E.g., Web server, FTP server, Telnet server, SMTP server
The server receives a client's request, performs the necessary computation, and returns the result to the client.
Remark: A server is usually designed to provide service to multiple clients.
How?
Remark:
Because the Web servers and Web browsers are designed based on HTTP, different Web browsers can communicate with all kinds of Web servers.
Programmers develop network applications by using SOCKET APIs.
To learn SOCKET programming:
    Learn some data structures
    Learn the socket APIs
    Learn multi-threading
Socket Abstraction
A socket is an interface between applications and the network services provided by the OS.
An application sends and receives data through a socket.
[Figure: the application sits on top of the SOCKET API, which gives access to TCP/IPv4, TCP/IPv6, and UNIX domain protocols.]
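[Figure: Process A exchanges data with another process through a pair of sockets.]
[Figure: a server process holding several sockets, one of them connected to Browser C.]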
History of Sockets
In the early 1980s, the Advanced Research Projects Agency (ARPA) funded a group at UC Berkeley to port TCP/IP software to the UNIX operating system. As part of the project, the designers created an interface that applications use for network communication. The interface used an abstraction known as a socket, and the API became known as the socket API. Many computer vendors adopted the Berkeley UNIX operating system, and the socket interface became very popular. Subsequently, Microsoft chose the socket interface as the primary network API for its operating systems; its version is called Windows Sockets, or WinSock.
Old version: Windows Sockets 1.1
Today's version: Windows Sockets 2
http://www.sockets.com/winsock2.htm
The important header file: winsock2.h
The socket interface has become a de facto standard throughout the computer industry.
The designers allowed for multiple families of protocols, with TCP/IP represented as a single family. E.g.,
    PF_INET or AF_INET (IPv4): our focus
    PF_INET6 or AF_INET6 (IPv6)
    PF_UNIX or AF_UNIX (Unix domain protocols)
    PF_IPX or AF_IPX (IPX protocols)
Socket Descriptors
To perform file I/O, we use a file descriptor. Similarly, to perform network I/O, we use a socket descriptor:
    Each active socket is identified by its socket descriptor.
    The data type of a socket descriptor is SOCKET. A look into the header file winsock2.h:
    typedef u_int SOCKET; /* u_int is defined as unsigned int */
    (Current Windows SDK headers declare SOCKET as UINT_PTR instead, so that it is pointer-sized on 64-bit Windows.)
In UNIX systems, a socket is just a special file, and socket descriptors are kept in the file descriptor table.
The Windows operating system keeps a separate table of socket descriptors (named the socket descriptor table, or SDT) for each process.
Types of Sockets
Under protocol family AF_INET
Stream socket
    Uses TCP for connection-oriented reliable communication
    Identified by SOCK_STREAM
    s = socket(AF_INET, SOCK_STREAM, 0);
Datagram socket
    Uses UDP for connectionless communication
    Identified by SOCK_DGRAM
    s = socket(AF_INET, SOCK_DGRAM, 0);
RAW socket
    Uses IP directly
    Identified by SOCK_RAW
    Advanced topic. Not covered by COMP2330.
The internal data structure for a socket contains many fields, but the system leaves most of them unfilled. The application must make additional procedure calls to fill in the socket data structure before the socket can be used.
The socket is used for data communication between two processes (which may be located on different machines). So the socket data structure should at least contain the address information, e.g., IP addresses, port numbers, etc.
Fig 5.2 from Comer. Conceptual operating system (Windows) data structures after five calls to socket() by a process. The system keeps a separate socket descriptor table for each process; threads in the process share the table.
C Review: Structure
C structures are collections of related variables under one name.
struct employee {
    char  firstname[20];
    char  lastname[20];
    int   age;
    float salary;
};
memcpy(): copy bytes between buffers
void *memcpy(void *dest, const void *src, size_t count);
    dest: New buffer.
    src: Buffer to copy from.
    count: Number of bytes to copy.
Addressing Issue
How to identify a communication?
    Source IP address, source port number, destination IP address, destination port number
    TCP or UDP
TCP/IP defines a communication endpoint to consist of an IP address and a protocol port number.
When you create the socket, you have already specified whether it uses TCP or UDP.
Unfortunately, not all address families define endpoints that fit into the generic sockaddr structure! To keep programs portable and maintainable, TCP/IP code should not use the sockaddr structure in declarations. Instead, another structure, sockaddr_in, should be used to represent a socket address in the AF_INET family.
The IP address is stored in the member sin_addr, which has type struct in_addr:
struct in_addr {
    union {
        struct { u_char s_b1, s_b2, s_b3, s_b4; } S_un_b;
        struct { u_short s_w1, s_w2; } S_un_w;
        u_long S_addr;
    } S_un;
#define s_addr S_un.S_addr
};
The 32-bit IP address can be interpreted as 4 unsigned bytes, or 2 unsigned shorts, or an unsigned long. By using a union, you can manipulate the IP address conveniently.
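Layout of the two address structures (both are 16 bytes long):
struct sockaddr:     sa_family (2 bytes) | sa_data (14 bytes)
struct sockaddr_in:  sin_family (2 bytes) | sin_port (2 bytes) | sin_addr (4 bytes) | sin_zero (8 bytes)
A pointer to a struct sockaddr_in can be cast to a pointer to a struct sockaddr and vice versa.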
An Example
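To specify the endpoint address 158.182.9.1:5678:
#include <winsock2.h>
#include <string.h>

struct sockaddr_in addr;
memset(&addr, 0, sizeof(addr));               /* clears sin_zero as well */
addr.sin_family = AF_INET;
addr.sin_port = htons(5678);
addr.sin_addr.s_addr = htonl(2662729985UL);   /* 158.182.9.1 as a 32-bit integer */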
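The constant 2662729985 in the example above is simply the four octets 158, 182, 9, 1 packed into one 32-bit number. The following is a minimal sketch of that arithmetic, not code from the course: the helper names ipv4_from_octets and make_endpoint are hypothetical, and the #else branch pulls in POSIX headers (an assumption added here) so that the sketch also builds outside Windows.

```c
#include <stdint.h>
#include <string.h>
#ifdef _WIN32
#include <winsock2.h>
#else
#include <arpa/inet.h>    /* htons, htonl: POSIX stand-in for winsock2.h */
#include <netinet/in.h>   /* struct sockaddr_in */
#endif

/* Hypothetical helper: pack four dotted-decimal octets a.b.c.d into a
   32-bit value in host byte order. */
uint32_t ipv4_from_octets(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8)  |  (uint32_t)d;
}

/* Hypothetical helper: build an AF_INET endpoint. The port and the address
   are converted to network byte order; sin_family is left in host order. */
struct sockaddr_in make_endpoint(uint32_t host_order_ip, uint16_t host_order_port)
{
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));            /* also clears sin_zero */
    addr.sin_family = AF_INET;
    addr.sin_port = htons(host_order_port);
    addr.sin_addr.s_addr = htonl(host_order_ip);
    return addr;
}
```

Calling make_endpoint(ipv4_from_octets(158, 182, 9, 1), 5678) should fill the structure with the same values as the example. After htonl(), the four bytes of sin_addr sit in memory as 158, 182, 9, 1 regardless of the host CPU.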
Byte Ordering
Memory is organized in bytes. E.g., a 16-bit integer takes 2 consecutive bytes; a 32-bit integer takes 4 consecutive bytes.
Different machines use different host byte orderings:
    little-endian: least significant byte located at the lower memory address. E.g., Intel 80x86
    big-endian: least significant byte located at the higher memory address. E.g., Motorola 68000
These machines may communicate with one another over the network, and they may interpret the same data differently.
Not a problem for data type char, which occupies a single byte.
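To make the two byte orderings concrete, here is a small sketch in plain C (no socket headers needed). The function names host_is_little_endian and swap_bytes32 are hypothetical names chosen for this illustration.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host and 0 on a big-endian host. */
int host_is_little_endian(void)
{
    uint16_t probe = 0x0102;
    unsigned char first;
    memcpy(&first, &probe, 1);   /* the byte stored at the lowest address */
    return first == 0x02;        /* least significant byte first: little-endian */
}

/* Reverses the four bytes of a 32-bit value. This is the number a machine
   with the opposite byte order would read from the same four bytes if no
   conversion were done. */
uint32_t swap_bytes32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}
```

For instance, swap_bytes32(169283076) is 68032266: the bytes 0A 17 0E 04 read in the opposite order are 04 0E 17 0A, so the same four bytes stand for two very different integers.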
[Figure: a little-endian computer reads the bytes of an integer sent by another machine and receives the integer 68032266.]
The members sin_port and sin_addr in struct sockaddr_in should be in network byte order, which is big-endian:
    sin_port: TCP or UDP port number, put into the TCP/UDP header
    sin_addr: IP address, put into the IP header
The member sin_family is not sent out to the network, so it does not need to be in network byte order.
If your application needs to transfer integers between two machines, you should convert them to network byte order before sending and back to host byte order after receiving.
You don't know the type of CPU that executes your program!
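As an illustration of transferring an integer in network byte order, the sketch below writes a 32-bit value into a byte buffer that could then be handed to send(). The names pack_u32 and unpack_u32 are hypothetical; the #else branch uses the POSIX header as an assumption so the code also compiles outside Windows.

```c
#include <stdint.h>
#include <string.h>
#ifdef _WIN32
#include <winsock2.h>
#else
#include <arpa/inet.h>   /* htonl, ntohl: POSIX stand-in for winsock2.h */
#endif

/* Hypothetical helper: store value in out[0..3] in network byte order. */
void pack_u32(uint32_t value, unsigned char out[4])
{
    uint32_t wire = (uint32_t)htonl(value);
    memcpy(out, &wire, sizeof(wire));
}

/* Hypothetical helper: read a network-order 32-bit value from in[0..3]
   and return it in host byte order. */
uint32_t unpack_u32(const unsigned char in[4])
{
    uint32_t wire;
    memcpy(&wire, in, sizeof(wire));
    return (uint32_t)ntohl(wire);
}
```

On any host, pack_u32(20000, b) leaves the bytes 00 00 4E 20 in b, because 20000 is 0x4E20 and network byte order puts the most significant byte first.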
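Byte-order conversion functions:
    htons(): host to network, short
    htonl(): host to network, long
    ntohs(): network to host, short
    ntohl(): network to host, long
Remark:
    s (short): u_short, unsigned 16-bit integer
    l (long): u_long, unsigned 32-bit integer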
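WinSock API
Function Name     Meaning
WSAStartup        Initialize the socket library (Windows only)
WSACleanup        Terminate use of the socket library (Windows only)
WSAGetLastError   Retrieve the error code of the last failed socket call
socket            Create a socket
connect           Connect a socket to a remote server
closesocket       Close a socket and deallocate it
bind              Specify the local endpoint address for a socket
listen            Place a socket in passive mode
accept            Accept the next incoming connection
recv              Acquire incoming data from a stream connection or the next incoming message (mainly for TCP)
recvfrom          Receive the next incoming datagram and record its source endpoint address (UDP)
send              Send outgoing data or a message (mainly for TCP)
sendto            Send an outgoing datagram to a given endpoint address (UDP)
select            Wait until one of a set of sockets is ready for I/O
shutdown          Terminate a connection in one or both directions
inet_addr         Convert a dotted decimal string to a binary IP address
inet_ntoa         Convert a binary IP address to a dotted decimal string
gethostbyname     Map a domain name to its IP address(es)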
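[Figure: TCP call sequence. Server: bind(), listen(), accept(), recv(), send(), closesocket(), WSACleanup().]
UDP Server
[Figure: UDP call sequence. Server: WSAStartup(), bind(), recvfrom(), process request, sendto(), closesocket(), WSACleanup(). Client: sendto(), recvfrom(), closesocket(), WSACleanup(). The client's sendto() carries the request to the server's recvfrom(); the server's sendto() carries the data (reply) back to the client's recvfrom().]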
Differences
TCP is connection-oriented, UDP is connectionless
TCP applications use listen(), accept(), and connect() to set up TCP connections
Data transfer
    TCP applications use send() and recv()
    UDP applications use sendto() and recvfrom()
WSAStartup()
Programs using WinSock must call WSAStartup() before using sockets. It is needed because the operating system uses dynamically linked libraries (DLLs).
int WSAStartup(WORD wVersionRequested, LPWSADATA lpWSAData);
    wVersionRequested: The highest version of Windows Sockets support that the caller can use. The high-order byte specifies the minor version (revision) number; the low-order byte specifies the major version number.
    lpWSAData: A pointer to the WSADATA data structure that is to receive details of the Windows Sockets implementation.
WSAStartup() returns zero if successful.
Example code:
WSADATA wsadata;
int err;
err = WSAStartup(MAKEWORD(2,2), &wsadata);   /* request WinSock 2.2 */
if (err != 0)
    return;   /* WinSock could not be initialized */
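The layout of wVersionRequested (major version in the low-order byte, minor version in the high-order byte) can be checked with a few lines of plain C. This is only a sketch of the bit layout: make_version_word, version_major, and version_minor are hypothetical stand-ins written for this illustration, whereas real WinSock code would use the MAKEWORD, LOBYTE, and HIBYTE macros from the Windows headers.

```c
#include <stdint.h>

/* Hypothetical stand-in for MAKEWORD(major, minor): the major version goes
   into the low-order byte, the minor version into the high-order byte. */
uint16_t make_version_word(uint8_t major, uint8_t minor)
{
    return (uint16_t)(major | ((uint16_t)minor << 8));
}

/* Hypothetical stand-ins for LOBYTE and HIBYTE applied to a version word. */
uint8_t version_major(uint16_t word) { return (uint8_t)(word & 0xFF); }
uint8_t version_minor(uint16_t word) { return (uint8_t)(word >> 8); }
```

With this layout, version 2.2 is the value 0x0202 and version 2.0 is 0x0002. After a successful WSAStartup(), the same decoding applied to wsadata.wVersion tells the program which version it actually got.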
WSACleanup()
Once an application has finished using its sockets and has closed them, it calls WSACleanup() to deallocate all data structures and socket bindings. A program usually calls WSACleanup() only when it is completely finished and ready to exit.
int WSACleanup(void);
The return value is zero if the operation was successful. Otherwise, the value SOCKET_ERROR is returned.
WSAGetLastError()
An application calls the WSAGetLastError() function to retrieve the specific error code following an unsuccessful socket function call. An application should call WSAGetLastError() immediately after a socket function returns an error indication, so that you can find out what the problem is before another call overwrites the error code.
socket()
The Windows Sockets socket() function creates a socket.
SOCKET socket(int af, int type, int protocol);
    af: An address family specification.
    type: A type specification for the new socket. The following are the only two type specifications supported for Windows Sockets 1.1: SOCK_STREAM and SOCK_DGRAM. In Windows Sockets 2, some new socket types are introduced.
    protocol: A particular protocol to be used with the socket that is specific to the indicated address family. Remark: we usually set it to 0.
If no error occurs, socket() returns a descriptor referencing the new socket. Otherwise, a value of INVALID_SOCKET is returned.
Example code:
SOCKET s;
s = socket(AF_INET, SOCK_STREAM, 0);
if (s == INVALID_SOCKET) {
    printf("Error code: %d\n", WSAGetLastError());
    return;
}
closesocket()
Once a client or server finishes using a socket, it calls closesocket() to deallocate it. If only one process is using the socket, closesocket() immediately terminates the connection and deallocates the socket. If several processes share a socket, closesocket() decrements a reference count and deallocates the socket when the reference count reaches zero.
The server doesn't know who will connect to it. It needs to define the local port number only.
Remark: INADDR_ANY represents a wildcard address that matches any of the computer's IP addresses. In winsock2.h:
#define INADDR_ANY (u_long)0x00000000
The client must provide the IP address and port number of the server.
One option is to require the user to identify the server when invoking the program (the most popular way).
Remark: inet_addr() takes an ASCII string that contains a dotted decimal address and returns the equivalent IP address in binary, in network byte order. argv[1] represents the first argument to the command.
Example code:
int main(int argc, char *argv[])
{
    struct sockaddr_in servaddr;
    if (argc < 2)
        return 1;   /* the server address is missing */
    servaddr.sin_addr.s_addr = inet_addr(argv[1]);
    /* ... fill in sin_family and sin_port, then use servaddr ... */
    return 0;
}
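A user-supplied address string may be malformed, and inet_addr() reports that by returning INADDR_NONE. The sketch below wraps the conversion in a hypothetical helper, set_server_address, that fills the whole structure and reports failure. The POSIX headers in the #else branch are an assumption added so the sketch builds outside Windows.

```c
#include <string.h>
#ifdef _WIN32
#include <winsock2.h>
#else
#include <arpa/inet.h>    /* inet_addr, htons: POSIX stand-in for winsock2.h */
#include <netinet/in.h>   /* struct sockaddr_in, INADDR_NONE */
#endif

/* Hypothetical helper: fill servaddr from a dotted decimal string and a port
   given in host byte order. Returns 0 on success, -1 if the string is not a
   valid dotted decimal address. */
int set_server_address(struct sockaddr_in *servaddr, const char *text,
                       unsigned short port)
{
    memset(servaddr, 0, sizeof(*servaddr));
    servaddr->sin_family = AF_INET;
    servaddr->sin_port = htons(port);
    /* inet_addr() already returns the address in network byte order. */
    servaddr->sin_addr.s_addr = inet_addr(text);
    if (servaddr->sin_addr.s_addr == INADDR_NONE)
        return -1;
    return 0;
}
```

A client could call set_server_address(&servaddr, argv[1], 5678) and print a usage message when it returns -1. One known limitation of inet_addr() is that the valid broadcast address 255.255.255.255 has the same binary value as INADDR_NONE, so this helper rejects it.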
Every time you need to map a domain name into its corresponding IP address, you call the function gethostbyname().
It will trigger the OS to contact the DNS server to obtain the IP address.
gethostbyname()
Sometimes one domain name is mapped to a list of IP addresses (e.g., for a server cluster). Instead of returning a single IP address, gethostbyname() returns a pointer to a data structure, hostent, which can store a list of IP addresses.
struct hostent {
    char *  h_name;        /* official host name */
    char ** h_aliases;     /* other aliases */
    short   h_addrtype;    /* address type */
    short   h_length;      /* length of each address in bytes */
    char ** h_addr_list;   /* list of addresses */
#define h_addr h_addr_list[0]
};
struct hostent * gethostbyname(const char *name);
Remarks:
1. It will return a NULL pointer if an error occurs.
2. The returned IP address is in network byte order.
Example code:
char *server = "www.comp.hkbu.edu.hk";
struct hostent *h;
if ((h = gethostbyname(server)) != NULL)
    memcpy(&servaddr.sin_addr, h->h_addr, h->h_length);
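Since h_addr_list is a NULL-terminated list, a client can pick any entry, not only the first one. The following sketch shows one way to walk the list defensively; copy_host_address is a hypothetical helper written for this illustration, and the POSIX headers in the #else branch are an assumption so it also builds outside Windows.

```c
#include <string.h>
#ifdef _WIN32
#include <winsock2.h>
#else
#include <netdb.h>        /* struct hostent: POSIX stand-in for winsock2.h */
#include <netinet/in.h>   /* struct sockaddr_in */
#include <sys/socket.h>   /* AF_INET */
#endif

/* Hypothetical helper: copy address number `index` (0 is the first) from a
   hostent into servaddr->sin_addr. Returns 0 on success, -1 if h is NULL,
   the entry does not hold IPv4 addresses, or the list is too short. */
int copy_host_address(const struct hostent *h, int index,
                      struct sockaddr_in *servaddr)
{
    int i;
    if (h == NULL || h->h_addrtype != AF_INET ||
        h->h_length != (int)sizeof(servaddr->sin_addr))
        return -1;
    for (i = 0; h->h_addr_list[i] != NULL; i++) {
        if (i == index) {
            /* The bytes are already in network byte order: copy them as is. */
            memcpy(&servaddr->sin_addr, h->h_addr_list[i], (size_t)h->h_length);
            return 0;
        }
    }
    return -1;   /* fewer than index + 1 addresses in the list */
}
```

A typical call would be copy_host_address(gethostbyname(server), 0, &servaddr). Because the helper accepts a NULL pointer, the failure of gethostbyname() is reported through the same return value.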
getservbyname()
Most client programs must look up the protocol port for the specific service they wish to invoke. To do so, the client invokes the library function getservbyname(), which takes two parameters: (1) a string that specifies the desired service; (2) a string that specifies the transport protocol being used (i.e., "tcp" or "udp").
struct servent {
    char *  s_name;      /* official service name */
    char ** s_aliases;   /* other aliases */
    short   s_port;      /* port for this service */
    char *  s_proto;     /* protocol to use */
};
Example code:
struct servent *sptr;
if ((sptr = getservbyname("smtp", "tcp")) != NULL) {
    /* port number is now in sptr->s_port, already in network byte order */
} else {
    /* error occurred, handle it */
}
bind()
When a socket is first created, it has no associated endpoint addresses. Function bind() specifies the local endpoint address for a socket.
Usually, a server calls bind() to specify the well-known port at which it will await connections. A client can skip bind() and let the OS decide.
int bind(SOCKET s, const struct sockaddr *addr, int addrlen);
    s: the socket descriptor
    addr: a pointer to the address assigned to the socket
    addrlen: the size (in bytes) of the real address structure
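A server typically combines the wildcard address INADDR_ANY with its well-known port before calling bind(). The sketch below shows that step under a few assumptions: fill_wildcard_address and bind_to_port are hypothetical helper names, and the #else branch supplies POSIX stand-ins for SOCKET and SOCKET_ERROR so the code also builds outside Windows.

```c
#include <string.h>
#ifdef _WIN32
#include <winsock2.h>
#else
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
typedef int SOCKET;           /* POSIX stand-in for the WinSock type */
#define SOCKET_ERROR (-1)
#endif

/* Hypothetical helper: local endpoint "any local IP address, given port".
   The port is passed in host byte order. */
void fill_wildcard_address(struct sockaddr_in *addr, unsigned short port)
{
    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_port = htons(port);
    addr->sin_addr.s_addr = htonl(INADDR_ANY);
}

/* Hypothetical helper: bind socket s to the given port on all local
   addresses. Returns 0 on success or SOCKET_ERROR on failure. */
int bind_to_port(SOCKET s, unsigned short port)
{
    struct sockaddr_in servaddr;
    fill_wildcard_address(&servaddr, port);
    /* bind() takes the generic sockaddr pointer plus the real size. */
    return bind(s, (struct sockaddr *)&servaddr, sizeof(servaddr));
}
```

Applying htonl() to INADDR_ANY changes nothing numerically because the value is zero, but writing it keeps the rule "sin_addr is always in network byte order" visible in the code.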
Local IP address
Choosing the local IP address is trivial for most machines, which have only one IP address. But for a machine with multiple IP addresses, it could be more complicated.
TCP client software usually leaves the local endpoint address unfilled and allows the TCP/IP module to select the correct local IP address and an unused local port number automatically.
1. Create a socket and bind it to the server's well-known port.
   socket() and bind()
2. Place the socket in passive mode, making it ready for use by a server.
   listen()
3. Accept the next connection request from the socket, and obtain a new socket for the connection.
   accept()
4. Repeatedly receive a request from the client, formulate a response, and send a reply back to the client according to the application protocol.
   recv() and send()
5. When finished with a particular client, close the connection and return to step 3 to accept a new connection.
   closesocket()
listen()
A TCP server calls listen() to place a socket in passive mode and make it ready to accept incoming connections.
A socket in passive mode is used for accepting TCP connection requests, not for data transfer.
To let the server handle multiple clients, listen() tells the operating system to queue the connection requests for the server's socket.
int listen(SOCKET s, int backlog);
    s: the socket descriptor
    backlog: the maximum length of the queue of pending connections
accept()
After the server calls listen(), the server can then call accept() to extract the next incoming connection request from the listening queue. Function accept() creates a new socket for each new successful connection request, and returns the descriptor of the new socket to its caller. This new socket is used to transfer data.
Return value: If no error occurs, accept() returns the socket descriptor for the newly accepted socket. Otherwise, a value of INVALID_SOCKET is returned.
Example code:
SOCKET listenfd, connfd;
struct sockaddr_in servaddr, cliaddr;
int clilen;
listenfd = socket(AF_INET, SOCK_STREAM, 0);
/* fill in the structure servaddr */
bind(listenfd, (struct sockaddr *)&servaddr, sizeof(servaddr));
listen(listenfd, 10);
clilen = sizeof(cliaddr);   /* in: size of cliaddr; out: actual length of the client address */
connfd = accept(listenfd, (struct sockaddr *)&cliaddr, &clilen);
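[Figure: a client process exchanges data with the server through a pair of sockets.]
Remark: A single listening socket is used by the server to accept incoming connection requests.
[Figure: the server holds its listening socket plus one socket per accepted connection, e.g., the one serving Browser C.]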
1. Find the IP address and protocol port number of the server with which communication is desired;
2. Allocate a socket;
   socket()
3. Specify that the connection needs an arbitrary, unused protocol port on the local machine, and allow TCP to choose one;
   A side-effect of connect()
4. Connect the socket to the server;
   connect()
5. Communicate with the server using the application-level protocol;
   send() and recv()
6. Close the connection.
   closesocket()
connect()
A client calls connect() to establish an active connection to a remote server.
int connect(SOCKET s, const struct sockaddr *addr, int addrlen);
    s: the socket descriptor
    addr: a pointer to the server's address
    addrlen: the length of the server's address structure
If no error occurs, connect() returns 0. Otherwise, it returns SOCKET_ERROR.
Example code:
SOCKET s;
struct sockaddr_in servaddr;
s = socket(AF_INET, SOCK_STREAM, 0);
/* fill in the structure servaddr */
if (connect(s, (struct sockaddr *)&servaddr, sizeof(servaddr)) == SOCKET_ERROR) {
    /* error handling */
}
Remark: servaddr is declared as type struct sockaddr_in. It is cast into type struct sockaddr when calling connect().
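Putting the pieces together, the connection part of a TCP client (find the server address, allocate a socket, connect) can be sketched as a single function. This is an illustrative sketch rather than the course's reference code: connect_to_server is a hypothetical name, the server is identified by a dotted decimal string, and the #else branch defines POSIX stand-ins (an assumption) for the WinSock names so the code builds outside Windows. On Windows, WSAStartup() must have succeeded before this function is called.

```c
#include <string.h>
#ifdef _WIN32
#include <winsock2.h>
#else
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
typedef int SOCKET;            /* POSIX stand-ins for the WinSock names */
#define INVALID_SOCKET (-1)
#define SOCKET_ERROR   (-1)
#define closesocket    close
#endif

/* Hypothetical helper: returns a connected TCP socket, or INVALID_SOCKET
   if the address is malformed or the connection cannot be established. */
SOCKET connect_to_server(const char *dotted_ip, unsigned short port)
{
    struct sockaddr_in servaddr;
    SOCKET s;

    /* Server endpoint, with port and address in network byte order. */
    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(port);
    servaddr.sin_addr.s_addr = inet_addr(dotted_ip);
    if (servaddr.sin_addr.s_addr == INADDR_NONE)
        return INVALID_SOCKET;

    /* Allocate a stream (TCP) socket. */
    s = socket(AF_INET, SOCK_STREAM, 0);
    if (s == INVALID_SOCKET)
        return INVALID_SOCKET;

    /* No bind(): as a side effect, connect() chooses the local IP address
       and an unused local port. */
    if (connect(s, (struct sockaddr *)&servaddr, sizeof(servaddr)) == SOCKET_ERROR) {
        closesocket(s);
        return INVALID_SOCKET;
    }
    return s;
}
```

The caller then communicates over the returned socket with send() and recv() and finally releases it with closesocket().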
TCP Communications
When TCP is used, TCP module uses a sending buffer and a receiving buffer for reliable data transmission.
These buffers are inside the OS kernel and different from your application memory buffer!
If no error occurs, send() returns the total number of bytes sent. Otherwise, a value of SOCKET_ERROR is returned.
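Because send() reports how many bytes it actually accepted, that number can be smaller than the number requested, for example when the sending buffer is nearly full. A common defensive pattern is to loop until everything has been handed over. The sketch below assumes a blocking socket; send_all is a hypothetical helper name, and the #else branch adds POSIX stand-ins (an assumption) so the code also builds outside Windows.

```c
#ifdef _WIN32
#include <winsock2.h>
#else
#include <sys/types.h>
#include <sys/socket.h>
typedef int SOCKET;           /* POSIX stand-ins for the WinSock names */
#define SOCKET_ERROR (-1)
#endif

/* Hypothetical helper: call send() repeatedly until all len bytes of buf
   have been passed to the TCP module. Returns 0 on success or SOCKET_ERROR
   if any send() call fails. */
int send_all(SOCKET s, const char *buf, int len)
{
    int sent = 0;
    while (sent < len) {
        /* Continue after the bytes that earlier calls already accepted. */
        int n = (int)send(s, buf + sent, len - sent, 0);
        if (n == SOCKET_ERROR)
            return SOCKET_ERROR;
        sent += n;
    }
    return 0;
}
```

A return value of 0 only means that the data reached the sending buffer in the OS; it does not mean the peer application has read it yet.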
Challenges
TCP allows reliable data transmission between two processes.
    Data are regarded as a byte stream. There is no data boundary.
Problems
    The receiver doesn't know how much data to receive.
    The number of bytes returned by recv() cannot be known in advance.
    Making multiple calls to recv() is common practice. But you don't know how many times to call recv()!
Solutions?
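/* receiver code */
#define BLEN 20000
char buf[BLEN];
int n, c;
for (n = 0; n < BLEN; n += c) {
    c = recv(s, &buf[n], BLEN - n, 0);   /* append after the n bytes already received */
    if (c <= 0)
        break;   /* 0: connection closed by the peer; SOCKET_ERROR: failure */
}
[Figure: buf fills from the front; each recv() writes at &buf[n], and n += c advances past the bytes just received.]
Question: in reality, how can the client know the length of the incoming data, i.e., 20000 bytes?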
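The receive loop is worth packaging as a function, since every fixed-length read on a stream socket needs it. The following sketch assumes a blocking socket; recv_all is a hypothetical helper name, and the #else branch adds POSIX stand-ins (an assumption) so the code also builds outside Windows.

```c
#ifdef _WIN32
#include <winsock2.h>
#else
#include <sys/types.h>
#include <sys/socket.h>
typedef int SOCKET;           /* POSIX stand-ins for the WinSock names */
#define SOCKET_ERROR (-1)
#endif

/* Hypothetical helper: keep calling recv() until len bytes have arrived.
   Returns len on success, a smaller count if the peer closed the connection
   early, or SOCKET_ERROR if a recv() call fails. */
int recv_all(SOCKET s, char *buf, int len)
{
    int got = 0;
    while (got < len) {
        int n = (int)recv(s, buf + got, len - got, 0);
        if (n == 0)
            return got;            /* orderly shutdown by the peer */
        if (n < 0)
            return SOCKET_ERROR;   /* network or socket failure */
        got += n;
    }
    return got;
}
```

The caller compares the return value with len: equality means the whole block arrived, no matter how many recv() calls TCP needed to deliver it.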
The receiver extracts the length, then allocates memory, then receives the data (see Example 1).
/* sender code */
SOCKET sock;
int len;
u_long m;
char *buf;
len = 20000;
buf = malloc(len);
/* fill buf with your data */
m = htonl(len);                            /* length prefix in network byte order */
send(sock, (char *)&m, sizeof(m), 0);
send(sock, buf, len, 0);

/* receiver code */
SOCKET sock;
int i, c, len;
u_long n;
char *buf, *ptr;
ptr = (char *)&n;
len = sizeof(n);
for (i = 0; i < len; i += c) {
    c = recv(sock, &ptr[i], len - i, 0);   /* first collect the length prefix */
    if (c <= 0) return;                    /* connection closed or error */
}
len = ntohl(n);
buf = malloc(len);
for (i = 0; i < len; i += c) {
    c = recv(sock, &buf[i], len - i, 0);   /* then collect exactly len bytes of data */
    if (c <= 0) return;
}
recvfrom() returns the number of bytes in the message if successful, and SOCKET_ERROR to indicate that an error has occurred.
The receiver doesn't know who will send it data. Any machine can send data to this receiver if it knows the receiver's IP address and UDP port.
Connected UDP
UDP is connectionless.
    When sendto() and recvfrom() are called, the peer's address is included or returned as a parameter.
    An unconnected UDP socket can receive packets from any machine.
More References
Learn by practicing!
    Some example code is available on the course web site.
    Use WireShark to learn basic socket programming debugging skills.
    Start with some toy programs.
    Finish the course project individually.
References:
    Douglas E. Comer and David L. Stevens, Internetworking with TCP/IP, Volume III: Client-Server Programming and Applications, 1997, Prentice-Hall.
    W. Richard Stevens, Bill Fenner, and Andrew M. Rudoff, UNIX Network Programming, Volume I: The Sockets Networking API, 3rd Edition, 2004, Addison-Wesley.
    Winsock Programming FAQ: http://www.tangentsoft.net/wskfaq/