Introduction to UNIX System Programming

By Armin R. Mikler

Overview

- Basic UNIX Commands
- Files
  - Buffered vs. non-buffered I/O
  - Basic System Calls
- Processes
  - What's a process anyway?
  - The fork() System Call
  - Coordinating Processes (wait, exit, etc.)
- Inter-Process Communication
  - Pipes

Basic UNIX Commands

Login:

- username
- password

==> User Shell.

The User Shell is:

- The Command Interpreter
- A running program
- UNIX Commands are (often small) programs.
- What else does the Shell do?

Basic Commands:

- who am I
- pwd
- who
- what
- ps (*)
- finger
- ls
- mkdir
- rm (-i -f -r)
- touch
- cat (note: there is no dog)
- grep

more basic UNIX

Editors:

- emacs, vi, joe, sed, and others

Compilers/Interpreters:

- gcc, g++, perl, java, etc.

The UNIX Manual:

- Use the manual pages to get information about a specific command or system call.
- The UNIX manual is divided into sections.
- Careful!! The same name (e.g., write) can (and does) appear in different sections with a different context.
- Use man -s i subject to refer to section i.

man pages

man -k keyword(s)

- prints the header line of manual pages that contain the keyword(s)

apropos keyword(s)

- same as man -k

Questions:

- Which manual section contains UNIX user commands?
- Which manual section contains UNIX system calls?
- What is the difference between commands and system calls?

TRY xman, the manual pages for X11.

Files

- UNIX Input/Output operations are based on the concept of files.
- Files are an abstraction of specific I/O devices.
- A very small set of system calls provides the primitives that give direct access to the I/O facilities of the UNIX kernel.
- Most I/O operations rely on the use of these primitives.
- We must remember that the basic I/O primitives are system calls, executed by the kernel. What does that mean to us as programmers???

UNIX I/O Primitives

- open: Opens a file for reading or writing, or creates an empty file
- creat: Creates an empty file
- close: Closes a previously opened file
- read: Extracts information from a file
- write: Places information into a file
- lseek: Moves to a specific byte in the file
- unlink: Removes a file
- remove: Removes a file (a C library function that calls unlink for files)

A rudimentary example:
#include <fcntl.h>   /* open() and the O_* flags */
#include <unistd.h>  /* read(), close(), ssize_t */

int main(void)
{
    int fd;           /* a file descriptor */
    ssize_t nread;    /* number of bytes read */
    char buf[1024];   /* data buffer */

    /* open the file "data" for reading */
    fd = open("data", O_RDONLY);
    if (fd == -1)
        return 1;     /* the file could not be opened */

    /* read in the data */
    nread = read(fd, buf, sizeof buf);

    /* close the file */
    close(fd);

    return nread == -1 ? 1 : 0;
}

Buffered vs unbuffered I/O

- The system can execute in user mode or kernel mode!
- Memory is divided into user space and kernel space!

What happens when we write to a file?

- the write call forces a context switch to the system. What??
- the system copies the specified number of bytes from user space into kernel space (into kernel buffers).
- the system wakes up the device driver to write these buffers to the physical device (if the file-system is in synchronous mode).
- the system selects a new process to run.
- finally, control is returned to the process that executed the write call.

Discuss the effects on the performance of your program!

Un-buffered I/O

- Every read and write is executed by the kernel.
- Hence, every read and write will cause a context switch in order for the system routines to execute.
- Why do we suffer performance loss?
- How can we reduce the loss of performance? ==> We could try to move as much data as possible with each system call.
- How can we measure the performance?
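One way to see the cost of small transfers is to count how many system calls a file copy needs for a given buffer size. The sketch below is only an illustration: copy_counting_reads is a hypothetical helper (not a library function), and it counts read() calls instead of measuring time.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: copies src to dst with read()/write(), moving at
 * most bufsize bytes per system call. Returns the number of read() calls
 * that were issued, or -1 on error. */
long copy_counting_reads(const char *src, const char *dst, size_t bufsize)
{
    long calls = 0;
    int failed = 0;
    char *buf;
    int in, out;

    if (bufsize == 0 || (buf = malloc(bufsize)) == NULL)
        return -1;
    in = open(src, O_RDONLY);
    out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (in == -1 || out == -1)
        failed = 1;

    while (!failed) {
        ssize_t n = read(in, buf, bufsize);   /* one trip into the kernel */
        ssize_t done = 0;

        calls++;
        if (n == 0)
            break;                            /* end of file */
        if (n == -1)
            failed = 1;
        while (!failed && done < n) {         /* write() may be partial */
            ssize_t w = write(out, buf + done, (size_t)(n - done));
            if (w == -1)
                failed = 1;
            else
                done += w;
        }
    }

    if (in != -1)
        close(in);
    if (out != -1)
        close(out);
    free(buf);
    return failed ? -1 : calls;
}
```

Calling the helper on the same file with bufsize 1 and with bufsize 1024 shows how the number of kernel entries shrinks as the buffer grows; timing both runs (for example with the time command) turns the count into a performance measurement.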

Buffered I/O

explicit versus implicit buffering:

- explicit - collect as many bytes as you can before writing to the file, and read more than a single byte at a time. However, use the basic UNIX I/O primitives.
  - Careful!! Your program may behave differently on different systems.
  - Here, the programmer is explicitly controlling the buffer size.
- implicit - use the Stream facility provided by <stdio.h>
  - FILE *fd, fopen, fprintf, fflush, fclose, ... etc.
  - a FILE structure contains a buffer (in user space) that is usually the size of the disk blocking factor (512 or 1024)
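The effect of the user-space buffer can be made visible by checking the size of the file before and after fflush(). This is a minimal sketch: file_size and buffered_write_demo are hypothetical names, and the 1024-byte buffer is an assumption chosen to match the slide, set explicitly with setvbuf() so that the behavior does not depend on the library default.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Size in bytes of the file at path, or -1 if stat() fails. */
long file_size(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? (long)st.st_size : -1;
}

/* Hypothetical demo: writes 5 bytes through a fully buffered stream and
 * records the size of the file before and after fflush().
 * Returns 0 on success, -1 on error. */
int buffered_write_demo(const char *path, long *before, long *after)
{
    char buffer[1024];            /* user-space buffer handed to stdio */
    FILE *fp = fopen(path, "w");

    if (fp == NULL)
        return -1;
    if (setvbuf(fp, buffer, _IOFBF, sizeof buffer) != 0) {
        fclose(fp);
        return -1;
    }
    fprintf(fp, "hello");         /* 5 bytes, kept in buffer for now */
    *before = file_size(path);
    fflush(fp);                   /* now stdio issues the write() call */
    *after = file_size(path);
    return fclose(fp) == 0 ? 0 : -1;   /* closed while buffer still exists */
}
```

Until fflush() (or fclose()) runs, the data lives only in the process, which is worth keeping in mind for the file locking problem.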

File Locking

Consider the following problem:

- Processes can obtain a unique integer by reading from a file.
- The file contains a single integer (at all times), which must be incremented by the process that executes a read.
- Since multiple processes can compete for the file (a unique integer), we must make sure that the file access is synchronized. HOW??

What happens if we use buffered I/O ?

lockf()

- lockf() is a C-Library function for locking records of a file.
- Its prototype is int lockf(int fd, int func, off_t size);
- func-parameters are:
  - F_ULOCK: 0 (unlock a locked section)
  - F_LOCK: 1 (locks a section)
  - F_TLOCK: 2 (Test and Lock a section)
  - F_TEST: 3 (Test section for Locks)
  - see the UNIX manual pages!!
- If we rewind the file before locking AND use a size of 0L as the corresponding size parameter, the entire file is locked.
- lseek(fd, 0L, SEEK_SET) can be used to rewind the file (fd) to the beginning.

flock()

- flock() is a UNIX system call to apply or remove an advisory lock on an open file.
- The locking is only on an advisory basis (not absolute).
- Prototype: int flock(int fd, int operation)
- see manual pages
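What "advisory" means can be shown with two small helpers. Both names (is_locked and write_ignoring_lock) are hypothetical; the sketch assumes a system that provides flock() in <sys/file.h>, such as Linux or the BSDs.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if someone else holds a lock on the
 * file, 0 if the lock was free (it is released again), -1 on error. */
int is_locked(const char *path)
{
    int result;
    int fd = open(path, O_RDWR | O_CREAT, 0644);

    if (fd == -1)
        return -1;
    if (flock(fd, LOCK_EX | LOCK_NB) == 0) {   /* try, but do not block */
        flock(fd, LOCK_UN);
        result = 0;
    } else {
        result = (errno == EWOULDBLOCK) ? 1 : -1;
    }
    close(fd);
    return result;
}

/* Hypothetical helper: a program that never asks for the lock.
 * Returns the number of bytes written, or -1 on error. */
ssize_t write_ignoring_lock(const char *path, const char *msg, size_t len)
{
    ssize_t n;
    int fd = open(path, O_WRONLY | O_CREAT, 0644);

    if (fd == -1)
        return -1;
    n = write(fd, msg, len);   /* the kernel does not stop this write */
    close(fd);
    return n;
}
```

A lock taken with flock(fd, LOCK_EX) only keeps out programs that call flock() themselves, as is_locked does; write_ignoring_lock still succeeds while the lock is held.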

UNIX Processes

A program that has started is manifested in the context of a process. A process in the system is represented by:

- Process Identification Elements
- Process State Information
- Process Control Information
- User Stack
- Private User Address Space, Programs and Data
- Shared Address Space

Process Control Block

- Process Identification, Process State Information, and Process Control Information constitute the PCB.
- All Process State Information is stored in the Process Status Word (PSW).
- All information needed by the OS to manage the process is contained in the PCB.
- A UNIX process can be in a variety of states:

States of a UNIX Process


- User running: Process executes in user mode
- Kernel running: Process executes in kernel mode
- Ready to run in memory: Process is waiting to be scheduled
- Asleep in memory: Process is waiting for an event
- Ready to run swapped: Process is ready to run but requires swapping in
- Preempted: Process is returning from kernel to user mode, but the system has scheduled another process instead
- Created: Process is newly created and not ready to run
- Zombie: Process no longer exists, but it leaves a record for its parent process to collect.

See Process State Diagram!!

Creating a new process

In UNIX, a new process is created by means of the fork() system call. The OS performs the following functions:

- It allocates a slot in the process table for the new process
- It assigns a unique ID to the new process
- It makes a copy of the process image of the parent (except shared memory)
- It assigns the child process to the Ready to Run state
- It returns the ID of the child to the parent process, and 0 to the child.

Note, the fork() call actually is called once but returns twice - namely in the parent and the child process.

Fork()

- pid_t fork(void) is the prototype of the fork() call.
- Remember that fork() returns twice:
  - in the newly created (child) process with return value 0
  - in the calling process (parent) with return value = pid of the new process
  - A negative return value (-1) indicates that the call has failed.
- Different return values are the key for distinguishing the parent process from the child process!
- The child process is an exact copy of the parent, yet it is a copy, i.e., an identical but separate process image.

A fork() Example
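#include <stdio.h>      /* printf(), fflush() */
#include <sys/types.h>  /* pid_t */
#include <unistd.h>     /* fork() */

int main(void)
{
    pid_t pid;   /* process id */

    printf("just one process before the fork()\n");
    fflush(stdout);   /* otherwise a buffered line is copied into the child */

    pid = fork();

    if (pid == 0)
        printf("I am the child process\n");
    else if (pid > 0)
        printf("I am the parent process\n");
    else
        printf("DANGER Mr. Robinson - the fork() has failed\n");

    return 0;
}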

Basic Process Coordination

- The exit() call is used to terminate a process.
  - Its prototype is: void exit(int status), where status is used as the return value of the process.
  - exit(i) can be used to announce success or failure to the parent process.
- The wait() call is used to temporarily suspend the parent process until one of the child processes terminates.
  - The prototype is: pid_t wait(int *status), where status is a pointer to an integer to which the child's status information is assigned.
  - wait() will return with a pid when any one of the children terminates, or with -1 when no children exist.
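How the status travels from the child to the parent can be sketched as follows. child_exit_status is a hypothetical helper; the child uses _exit() instead of exit() here only so that it does not flush stdio buffers inherited from the parent, which is a choice made for this sketch.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that terminates with the given
 * code; the parent waits for it and returns the exit code it collected,
 * or -1 on error. */
int child_exit_status(int code)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(code);              /* child: announce success or failure */

    if (wait(&status) != pid)     /* parent: suspended until the child ends */
        return -1;

    /* the exit code is packed into status; the macros unpack it */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The integer filled in by wait() is not the exit code itself; WIFEXITED() tells whether the child ended through exit, and WEXITSTATUS() extracts the low 8 bits of the value it passed.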

more coordination

- To wait for a particular child process to terminate, we can use the waitpid() call.
  - Prototype: pid_t waitpid(pid_t pid, int *status, int opt)
- Sometimes we want to get information about the process or its parent.
  - getpid() returns the process id
  - getppid() returns the parent's process id
  - getuid() returns the user's id
  - use the manual pages for more id information.
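A short sketch of both ideas: picking one particular child with waitpid(), and a child checking who its parent is. The function names are hypothetical, and the exit codes 1 and 2 are arbitrary values chosen to tell the two children apart.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: starts two children and collects the SECOND one
 * first by passing its pid to waitpid(). Returns the exit code of the
 * second child and stores the exit code of the first in *first_code;
 * returns -1 on error. */
int collect_second_first(int *first_code)
{
    int status, second_code;
    pid_t first, second;

    first = fork();
    if (first == -1)
        return -1;
    if (first == 0)
        _exit(1);
    second = fork();
    if (second == -1) {
        waitpid(first, NULL, 0);
        return -1;
    }
    if (second == 0)
        _exit(2);

    if (waitpid(second, &status, 0) != second || !WIFEXITED(status))
        return -1;
    second_code = WEXITSTATUS(status);
    if (waitpid(first, &status, 0) != first || !WIFEXITED(status))
        return -1;
    *first_code = WEXITSTATUS(status);
    return second_code;
}

/* Hypothetical helper: the child exits with 0 if getppid() matches the
 * pid the parent obtained from getpid() before the fork, else with 1.
 * Returns that exit code, or -1 on error. */
int child_sees_parent(void)
{
    int status;
    pid_t parent = getpid();
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(getppid() == parent ? 0 : 1);
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```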

Orphans and Zombies or MIAs

- A child process whose parent has terminated is referred to as an orphan.
- When a child exits while its parent is not currently executing a wait(), a zombie emerges.
  - A zombie is not really a process, as it has terminated, but the system retains an entry in the process table for the non-existing child process.
  - A zombie is put to rest when the parent finally executes a wait().
- When a parent terminates, orphans and zombies are adopted by the init process (process-id 1) of the system.
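The life of a zombie can be observed from the parent. In this sketch (reap_zombie is a hypothetical name) a pipe is used only as a way to notice that the child has exited, and kill() with signal 0 is used to ask whether the pid still has a process table entry; the exit code 42 is arbitrary.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: lets a child exit without waiting for it, checks
 * that its process table entry still exists, reaps it, and checks that
 * the entry is gone. Returns the child's exit code, or -1 if any step
 * did not behave as described. */
int reap_zombie(void)
{
    int p[2], status;
    char c;
    pid_t pid;

    if (pipe(p) == -1)
        return -1;
    pid = fork();
    if (pid == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    if (pid == 0) {
        close(p[0]);
        _exit(42);                 /* exiting also closes p[1] */
    }

    close(p[1]);
    while (read(p[0], &c, 1) > 0)  /* returns 0 once the child has exited */
        ;
    close(p[0]);

    /* the child has terminated but was not waited for: its entry remains */
    if (kill(pid, 0) == -1)
        return -1;

    /* wait for it: this puts the zombie to rest */
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    /* now the pid no longer refers to any process */
    if (kill(pid, 0) != -1 || errno != ESRCH)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```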

Inter-Process Communication

- In addition to synchronizing different processes, we may want to be able to communicate data between them.
- Note that we are dealing with processes on the same machine. Hence, we can use shared memory segments to send messages between processes.
- One way to establish a communication channel between processes with a parent-child relationship is through the concept of pipes.
- We can use the pipe() system call to create a pipe.

UNIX Pipes

- At the UNIX command level, we can use pipes to channel the output of one command into another:
  - ls | wc
- At the process level we use the pipe() system call.
  - prototype: int pipe(int filedes[2])
  - filedes[0] will be a file descriptor open for reading
  - filedes[1] will be a file descriptor open for writing
  - the return value of pipe() is -1 if it could not successfully open the file descriptors.
- But how does this help to communicate between processes?

Basic Inter-Process Communication


by Armin R. Mikler

Overview

- What is IPC?
- How can we achieve IPC?
  - The pipe at the shell level!
  - The pipe between processes!
  - The pipe() system call!
  - closing the pipe!
- Programming with pipes.
  - size of a pipe
  - Non-blocking read() and write()
  - The select() system call
- FIFOs - named Pipes
  - FIFOs vs. regular pipes
  - Steps for using a FIFO
    - mkfifo to make a FIFO
    - open the FIFO
- Other IPC concepts
  - signals
  - shared memory
  - semaphores
  - sockets

What is IPC

- Inter-Process Communication allows different processes to exchange information and synchronize their actions.
- Why do processes have to synchronize their actions?
- We need to distinguish how processes may be related:
  - Parent / Child relationship, i.e., the child process was created by the parent
  - Processes that are not related yet execute on the same host
  - Processes that are not related and execute on different hosts
- Why do we have to make these distinctions?

some similarities

- Consider a program that consists of multiple functions.
  - How can we exchange information between the main() function and any of the other functions func()?
  - How do we produce side effects in func() that are visible in main()?
  - What do we need to do to guarantee that func() accesses the same variables as main()?
- The trick is to either allow different functions to work with identical memory locations or to create a communication channel in the form of parameter lists or return values.

IPC between user processes on the same system


[Diagram: two user processes on the same host, communicating through shared resources in the OS kernel.]

IPC between processes on different systems


[Diagram: two user processes on different hosts, each with its own OS kernel, communicating over the network.]

How do we achieve IPC

- Processes need to use some facility that they have in common.
- Both processes must speak the same IPC language.
- What facilities can two or more processes share when they reside on the same host?
  - Memory
  - File System Space
  - Communication Facilities
  - Common communication protocol provided by the OS (signals)

Inter-Process Communication using PIPES

- In addition to synchronizing different processes, we may want to be able to communicate data between them.
- For the time being, we are dealing with processes on the same machine. Hence, we can use shared memory segments to send messages between processes.
- A pipe is a one-way communication channel which can be used to connect two related processes.

Pipes cont'd

- UNIX provides a construct called a pipe, a communication channel through which two processes can exchange information.
- One way to establish a communication channel between processes with a parent-child relationship is through the concept of pipes.
- Why do the processes need to be related?

UNIX Pipes cont'd

- At the UNIX command level, we can use pipes to channel the output of one command into another:
  - ls | wc
  - the shell actually creates a child process for each command and uses exec() to execute the corresponding program (i.e., ls and wc)
- How does the shell implement the pipe command, i.e., ls | wc ??
- How would you implement the ability to pipe?? Discuss....
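One possible answer to the discussion question, as a sketch and not a description of how any particular shell is written: create the pipe, fork one child per command, connect the pipe to standard output or standard input with dup2(), and exec the programs. run_pipeline is a hypothetical name, and the exit code 127 for a failed exec is a convention borrowed from common shells.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs "left | right", where each argument is a
 * NULL-terminated argv array. Returns the exit code of the right-hand
 * command, or -1 on error. */
int run_pipeline(char *const left[], char *const right[])
{
    int p[2], status_left, status_right;
    pid_t a, b;

    if (pipe(p) == -1)
        return -1;

    a = fork();
    if (a == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    if (a == 0) {                      /* first child: left command */
        dup2(p[1], STDOUT_FILENO);     /* its stdout is now the pipe */
        close(p[0]);
        close(p[1]);
        execvp(left[0], left);
        _exit(127);                    /* reached only if exec failed */
    }

    b = fork();
    if (b == -1) {
        close(p[0]);
        close(p[1]);
        waitpid(a, NULL, 0);
        return -1;
    }
    if (b == 0) {                      /* second child: right command */
        dup2(p[0], STDIN_FILENO);      /* its stdin is now the pipe */
        close(p[0]);
        close(p[1]);
        execvp(right[0], right);
        _exit(127);
    }

    /* the parent uses neither end; while it keeps the write end open,
     * the right-hand command never sees end of file */
    close(p[0]);
    close(p[1]);

    if (waitpid(a, &status_left, 0) == -1 ||
        waitpid(b, &status_right, 0) == -1)
        return -1;
    return WIFEXITED(status_right) ? WEXITSTATUS(status_right) : -1;
}
```

With char *ls[] = {"ls", NULL} and char *wc[] = {"wc", NULL}, the call run_pipeline(ls, wc) behaves like ls | wc typed at the prompt.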

the pipe() system call

- At the process level we use the pipe() system call.
  - prototype: int pipe(int filedes[2])
  - filedes[0] will be a file descriptor open for reading
  - filedes[1] will be a file descriptor open for writing
  - the return value of pipe() is -1 if it could not successfully open the file descriptors.
- But how does this help to communicate between processes??

example
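#include <stdio.h>    /* printf(), perror() */
#include <stdlib.h>   /* exit() */
#include <unistd.h>   /* pipe(), read(), write() */

int main(void)
{
    int p[2];
    char buf[64];

    if (pipe(p) == -1) {
        perror("pipe call");
        exit(1);
    }

    /* at this point we have a pipe p with p[0] opened for reading
       and p[1] opened for writing - just like a file */
    write(p[1], "hi there", 9);   /* 8 characters plus the terminating '\0' */
    read(p[0], buf, 9);
    printf("%s\n", buf);

    return 0;
}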

A pipe to itself ?
[Diagram: a single process calls write() on one end of the pipe and read() on the other end.]

A channel between two processes


- Remember: parent/child relationship! What does that mean?
  - the child was created by a fork() call that was executed by the parent.
  - the child process is an image of the parent process ---> all the file descriptors that are opened by the parent are now available in the child.
- The file descriptors refer to the same I/O entity, in this case a pipe.
- The pipe is inherited by the child and may be passed on to the grand-children by the child process, or to other children by the parent.
- This can easily lead to a chaotic conglomeration of pipes throughout our system of processes.
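The inheritance of the descriptors is what turns a pipe into a channel between two processes. A minimal sketch, assuming a short message (the child reports the number of bytes it received through its exit code, which only holds values from 0 to 255); send_to_child is a hypothetical name.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the parent writes msg into a pipe, the child
 * reads until end of file and exits with the number of bytes received.
 * Returns that number, or -1 on error. */
int send_to_child(const char *msg)
{
    int p[2], status;
    int ok = 1;
    size_t len = strlen(msg);
    pid_t pid;

    if (pipe(p) == -1)              /* create the pipe BEFORE fork() */
        return -1;
    pid = fork();
    if (pid == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    if (pid == 0) {                 /* child: the reader */
        char buf[64];
        ssize_t n;
        int total = 0;

        close(p[1]);                /* drop the inherited write end */
        while ((n = read(p[0], buf, sizeof buf)) > 0)
            total += (int)n;
        _exit(total);
    }

    close(p[0]);                    /* parent: the writer */
    if (write(p[1], msg, len) != (ssize_t)len)
        ok = 0;
    close(p[1]);                    /* the child's read() now returns 0 */

    if (waitpid(pid, &status, 0) != pid || !ok || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Each process closes the end it does not use. If the child kept its copy of the write end open, its own read() would never report end of file.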

The open pipe problem


[Diagram: after fork(), both the child process and the parent process can write() to and read() from the same pipe.]

The fix
[Diagram: the child and parent processes with the write() and read() ends of the pipe; the ends that a process does not use are closed.]

closing the pipe

- The file descriptors associated with a pipe can be closed with the close(fd) system call.
- Some Rules:
  - A read() on a pipe will generally block until either data appears or all processes have closed the write file descriptor of the pipe!
  - Closing the write fd while other processes are writing to the pipe does not have any effect!
  - Closing the read fd while others are still reading will not have any effect!
  - Closing the last read fd while others are still writing will cause an error (EPIPE) to be returned by the write, and a signal (SIGPIPE) is sent by the kernel (Broken Pipe!!)
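Two of these rules can be checked inside a single process. The helpers below are hypothetical; the second one ignores SIGPIPE for the duration of the call, because the default action of that signal would terminate the program before the error code could be inspected.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical helper: returns what read() reports on an empty pipe
 * whose write end has been closed (0 means end of file). */
long read_after_writer_closed(void)
{
    int p[2];
    char c;
    ssize_t n;

    if (pipe(p) == -1)
        return -1;
    close(p[1]);                 /* no writer is left */
    n = read(p[0], &c, 1);       /* does not block */
    close(p[0]);
    return (long)n;
}

/* Hypothetical helper: returns the errno of a write() into a pipe that
 * has no reader, 0 if the write unexpectedly succeeded, or -1 if the
 * setup failed. */
int write_after_reader_closed(void)
{
    int p[2];
    int err = 0;
    void (*old)(int);

    if (pipe(p) == -1)
        return -1;
    old = signal(SIGPIPE, SIG_IGN);   /* keep the signal from killing us */
    if (old == SIG_ERR) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    close(p[0]);                      /* no reader is left */
    if (write(p[1], "x", 1) == -1)
        err = errno;
    signal(SIGPIPE, old);             /* restore the previous handling */
    close(p[1]);
    return err;
}
```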

The size of a pipe

- In most cases, we only transfer small amounts of data through a pipe, but for some applications we may want to send and receive large data blocks.
- A valid question is: How much data will fit into a pipe ??
- Why do we care? Remember - a write() will block until the requested number of bytes have been written.
- The POSIX standard specifies a minimum size of 512 bytes (the PIPE_BUF limit for atomic writes)!
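The question can be answered experimentally by filling a pipe in non-blocking mode until write() refuses more data. This is a rough sketch: pipe_capacity and pipe_atomic_limit are hypothetical names, the 512-byte chunk size is an assumption taken from the POSIX minimum, and the measured value is therefore only accurate to within one chunk.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: writes 512-byte chunks into a fresh pipe in
 * non-blocking mode and returns the number of bytes accepted before
 * write() would have blocked, or -1 on error. */
long pipe_capacity(void)
{
    static const char chunk[512];
    int p[2], flags;
    long count = 0;

    if (pipe(p) == -1)
        return -1;
    flags = fcntl(p[1], F_GETFL);
    if (flags == -1 || fcntl(p[1], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    /* nobody reads, so the pipe fills up; a full pipe gives EAGAIN */
    while (write(p[1], chunk, sizeof chunk) == (ssize_t)sizeof chunk)
        count += (long)sizeof chunk;
    if (errno != EAGAIN && errno != EWOULDBLOCK)
        count = -1;

    close(p[0]);
    close(p[1]);
    return count;
}

/* Hypothetical helper: returns PIPE_BUF for a pipe on this system, i.e.
 * the largest write that is guaranteed to be atomic, or -1 on error. */
long pipe_atomic_limit(void)
{
    int p[2];
    long limit;

    if (pipe(p) == -1)
        return -1;
    limit = fpathconf(p[0], _PC_PIPE_BUF);
    close(p[0]);
    close(p[1]);
    return limit;
}
```

The two numbers are usually different: the first says how much a writer can store before it blocks, the second how much it can write in one call without the data being interleaved with that of other writers.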
