
Unit-4

Theoretical Concept of Unix Operating System

BASIC FEATURES OF UNIX OPERATING SYSTEM

• It is written in a high-level language, C, making it easy to port to different configurations.
• It is a good operating system, especially for programmers. The UNIX programming environment is unusually rich and productive. It provides features that allow complex programs to be built from simpler programs.
• It uses a hierarchical file system that allows easy maintenance and efficient
implementation.
• It uses a consistent format for files, the byte stream, making application programs
easier to write.
• It is a multi-user, multiprocess system. Each user can execute several processes
simultaneously.
• It hides the machine architecture from the user, making it easier to write
programs that run on different hardware implementations.

FILE STRUCTURE

A file in UNIX is a sequence of bytes. Different programs expect various levels of structure, but the kernel does not impose any structure on files, and no meaning is attached to their contents - the meaning of the bytes depends solely on the programs that interpret the file. This is true not just of disc files but of peripheral devices as well. Magnetic tapes, mail messages, characters typed at the keyboard, line printer output, data flowing in pipes - each of these is just a sequence of bytes as far as the system and the programs in it are concerned.

Files are organized in tree-structured directories. Directories are themselves files that
contain information on how to find other files. A path name to a file is a text string that
identifies a file by specifying a path through the directory structure to the file.
Syntactically it consists of individual file name elements separated by the slash
character. For example, in /usr/Akshay/data, the first slash indicates the root of the
directory tree, called the root directory. The next element, usr, is a subdirectory of the
root, Akshay is a subdirectory of usr, and data is a file or a directory in the directory
Akshay.
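The following short C sketch (illustrative only; the path and the output format are not part of any standard interface) breaks such a path name into its slash-separated elements:

    #include <stdio.h>
    #include <string.h>

    /* Split a UNIX path name into its components.  A leading '/' denotes the
     * root directory; every other element names an entry inside the previous
     * directory. */
    int main(void)
    {
        char path[] = "/usr/Akshay/data";   /* example path from the text */
        char *component;

        if (path[0] == '/')
            printf("starts at the root directory\n");

        /* strtok() treats runs of '/' as separators */
        for (component = strtok(path, "/");
             component != NULL;
             component = strtok(NULL, "/"))
            printf("component: %s\n", component);

        return 0;
    }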


Figure 2 shows a typical UNIX file system.

Figure 2 : UNIX file system

The file system is organised as a tree with a single root node called root (written "/"); every non-leaf node of the file system structure is a directory of files, and files at the leaf nodes of the tree are either directories, regular files or special device files. /dev contains device files, such as /dev/console, /dev/lp0, /dev/mt0 and so on; /bin contains the binaries of essential UNIX system programs.

Create, open, read, write, close, unlink and truncate are system calls which are used for basic file manipulation. The create system call, given a path name, creates an empty file (or truncates an existing one). An existing file is opened by the open system call, which takes a path name and a mode (such as read, write or read-write) and returns a small descriptor which may then be passed to a read or write system call (along with a buffer address and a number of bytes to transfer) to perform data transfer to or from the file.

A file descriptor is an index into a small table of open files for this process. Descriptors
start at 0 and seldom get higher than 6 or 7 for typical programs, depending on the
maximum number of simultaneously open files.
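As a hedged illustration of these calls, the sketch below creates a file, writes a few bytes, and reads them back through a descriptor; the file name, permission bits and buffer size are arbitrary choices for the example:

    #include <fcntl.h>
    #include <unistd.h>
    #include <stdio.h>

    int main(void)
    {
        char buf[16];
        ssize_t n;

        /* create (or truncate) a file and get a small integer descriptor */
        int fd = creat("example.dat", 0644);
        if (fd < 0) { perror("creat"); return 1; }
        write(fd, "hello, unix\n", 12);      /* transfer 12 bytes to the file */
        close(fd);

        /* reopen it read-only; the descriptor indexes this process's open-file table */
        fd = open("example.dat", O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }
        n = read(fd, buf, sizeof buf);       /* transfer up to 16 bytes from the file */
        printf("read %zd bytes through descriptor %d\n", n, fd);
        close(fd);
        return 0;
    }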


Each read or write updates the current offset into the file, which is associated with the file table entry and is used to determine the position in the file for the next read or write.
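A small sketch of this behaviour, reusing the illustrative example.dat file from the previous listing together with the standard read and lseek calls:

    #include <fcntl.h>
    #include <unistd.h>
    #include <stdio.h>

    int main(void)
    {
        char a, b;
        int fd = open("example.dat", O_RDONLY);   /* file written in the previous sketch */
        if (fd < 0) { perror("open"); return 1; }

        read(fd, &a, 1);                  /* reads byte 0, offset advances to 1 */
        read(fd, &b, 1);                  /* reads byte 1, offset advances to 2 */
        printf("first two bytes: %c %c, offset now %ld\n",
               a, b, (long)lseek(fd, 0, SEEK_CUR));   /* query the current offset */

        lseek(fd, 0, SEEK_SET);           /* explicitly reposition to the beginning */
        read(fd, &a, 1);                  /* reads byte 0 again */
        close(fd);
        return 0;
    }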

CPU SCHEDULING

CPU scheduling in UNIX is designed to benefit interactive processes. Processes are given small CPU time slices by a priority algorithm that reduces to round-robin scheduling for CPU-bound jobs.

The scheduler on the UNIX system belongs to the general class of operating system schedulers known as round robin with multilevel feedback, which means that the kernel allocates the CPU to a process for a small time slice, preempts a process that exceeds its time slice and feeds it back into one of several priority queues. A process may need many iterations through the "feedback loop" before it finishes. When the kernel does a context switch and restores the context of a process, the process resumes execution from the point where it had been suspended.

Each process table entry contains a priority field that is used for process scheduling. The priority of a process is lower if it has recently used the CPU, and vice versa.

The more CPU time a process accumulates, the lower (more positive) its priority
becomes, and vice versa, so there is negative feedback in CPU scheduling and it is
difficult for a single process to take all the CPU time. Process aging is employed to
prevent starvation.
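The exact priority formula varies between UNIX versions; the following sketch only illustrates this negative-feedback idea, with an assumed base priority and a simple halving decay rather than the actual kernel computation:

    #include <stdio.h>

    #define BASE_PRIORITY 60   /* assumed user-level base; a higher number means a worse priority */

    /* Simplified sketch: each second the recent-CPU-usage count is decayed and
     * the priority recomputed, so a process that has been hogging the CPU
     * drifts to a worse (more positive) priority value. */
    struct proc { int cpu_usage; int priority; };

    static void recompute(struct proc *p)
    {
        p->cpu_usage /= 2;                           /* decay recent usage */
        p->priority  = BASE_PRIORITY + p->cpu_usage / 2;
    }

    int main(void)
    {
        struct proc hog  = { .cpu_usage = 80 };      /* has used the CPU heavily */
        struct proc idle = { .cpu_usage = 0 };       /* has been waiting */

        recompute(&hog);
        recompute(&idle);
        printf("hog priority %d, idle priority %d\n", hog.priority, idle.priority);
        return 0;   /* the idle process ends up with the better (lower) value */
    }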

Older UNIX systems used a 1-second quantum for the round-robin scheduling. 4.3BSD reschedules processes every 0.1 second and recomputes priorities every second. The round-robin scheduling is accomplished by the time-out mechanism, which tells the clock interrupt driver to call a kernel subroutine after a specified interval; the subroutine to be called in this case causes the rescheduling and then resubmits a time-out to call itself again. The priority recomputation is also timed by a subroutine that resubmits a time-out to call itself.

When a process must wait for an event (for example, the completion of an I/O request), it gives up the CPU and blocks. The kernel primitive used for this purpose is called sleep (not to be confused with the user-level library routine of the same name). It takes an argument, which is by convention the address of a kernel data structure related to an event that the process wants to occur before that process is awakened. When the event occurs, the system process that knows about it calls wakeup with the address corresponding to the event, and all processes that had done a sleep on the same address are put in the ready queue to be run.
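A toy sketch of the sleep/wakeup idea follows; the process table, its field names and the event variable are invented for illustration and are not real kernel interfaces:

    #include <stdio.h>
    #include <stddef.h>

    /* Each entry records which event address a "process" is sleeping on. */
    enum state { RUNNABLE, SLEEPING };

    struct proc { int pid; enum state state; void *wchan; };

    static struct proc ptable[3] = {
        { 1, RUNNABLE, NULL }, { 2, RUNNABLE, NULL }, { 3, RUNNABLE, NULL }
    };

    static void sleep_on(struct proc *p, void *event)
    {
        p->state = SLEEPING;
        p->wchan = event;                      /* address of the awaited event */
    }

    static void wakeup(void *event)
    {
        for (int i = 0; i < 3; i++)
            if (ptable[i].state == SLEEPING && ptable[i].wchan == event) {
                ptable[i].state = RUNNABLE;    /* put back on the ready queue */
                ptable[i].wchan = NULL;
            }
    }

    int main(void)
    {
        int disk_buffer;                       /* stands in for a kernel data structure */
        sleep_on(&ptable[0], &disk_buffer);
        sleep_on(&ptable[2], &disk_buffer);
        wakeup(&disk_buffer);                  /* both sleepers become runnable */
        for (int i = 0; i < 3; i++)
            printf("pid %d: %s\n", ptable[i].pid,
                   ptable[i].state == RUNNABLE ? "runnable" : "sleeping");
        return 0;
    }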

MEMORY MANAGEMENT

The CPU scheduling is strongly influenced by memory management schemes. At least part of a process must be contained in primary memory to run; a process cannot be executed by the CPU if it resides entirely on secondary storage. It is also not possible to keep all active processes in main memory. For example, a 4MB main memory cannot provide space for a 5MB process. It is the job of the memory management module to decide which processes should reside (at least partially) in main memory, and to manage the parts of the virtual address space of a process which are residing on secondary storage devices. It monitors the amount of available physical memory and provides swapping of processes between physical memory and secondary storage devices.

Swapping

Early UNIX systems transferred entire processes between primary memory and a secondary storage device but did not transfer parts of a process independently, except for shared text. Such a memory management policy is called swapping. UNIX was first implemented on the PDP-11, where the total physical memory was limited to 256 Kbytes. The total memory resources were insufficient to justify or support complex memory management algorithms. Thus, UNIX swapped entire process memory images.

Allocation of both main memory and swap space is done first-fit. When the size of a
process memory image increases (due to either stack expansion or data expansion), a
new piece of memory big enough for the whole image is allocated. The memory image is
copied, the old memory is freed, and the appropriate tables are updated. (An attempt is
made in some systems to find memory contiguous to the end of the current piece, to
avoid some copying.) If no single piece of main memory is large enough, the process is
swapped out such that it will be swapped back in with the new size.
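The sketch below shows the first-fit idea over a made-up table of free extents; real kernels keep proper free lists and also handle freeing and coalescing, which are omitted here:

    #include <stdio.h>

    /* Minimal first-fit sketch over a table of free extents (sizes in KB). */
    struct extent { int start; int size; };

    static struct extent freelist[] = { {0, 64}, {100, 256}, {400, 128} };
    #define NFREE (sizeof freelist / sizeof freelist[0])

    /* Return the start of the first hole big enough, or -1 if none fits. */
    static int first_fit(int request)
    {
        for (unsigned i = 0; i < NFREE; i++)
            if (freelist[i].size >= request) {
                int start = freelist[i].start;
                freelist[i].start += request;   /* carve the allocation off the front */
                freelist[i].size  -= request;
                return start;
            }
        return -1;
    }

    int main(void)
    {
        printf("200KB image placed at %d\n", first_fit(200));  /* skips the 64KB hole */
        printf("50KB image placed at %d\n",  first_fit(50));   /* fits the 64KB hole */
        return 0;
    }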

There is no need to swap out a sharable text segment, because it is read-only, and there
is no need to read in a sharable text segment for a process when another instance is
already in memory. That is one of the main reasons for keeping track of sharable text
segments: less swap traffic. The other reason is the reduced amount of main memory
required for multiple processes using the same text segment.

Decisions regarding which processes to swap in or swap out are made by the scheduler process (also known as the swapper). The scheduler wakes up at least once every 4 seconds to check for processes to be swapped in or out. A process is more likely to be swapped out if it is idle, has been in main memory for a long time, or is large; if no obvious candidates are found, other processes are picked by age. A process is more likely to be swapped in if it has been swapped out for a long time, or is small. There are checks to prevent thrashing, basically by not letting a process be swapped out if it has not been in memory for a certain amount of time.

If jobs do not need to be swapped out, the process table is searched for a process deserving to be brought in (determined by how small the process is and how long it has been swapped out). If necessary, processes are swapped out until enough memory is available.
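As a rough illustration of such a choice (the weighting is invented, not the actual swapper heuristic), a candidate for swap-in might be scored on its size and on how long it has been out:

    #include <stdio.h>

    /* Prefer processes that are small and have been swapped out longest. */
    struct swapped { int pid; int size_kb; int secs_out; };

    static int score(const struct swapped *p)
    {
        return p->secs_out * 10 - p->size_kb;   /* higher score = better candidate */
    }

    int main(void)
    {
        struct swapped cand[] = { {10, 500, 30}, {11, 120, 25}, {12, 900, 60} };
        int best = 0;
        for (int i = 1; i < 3; i++)
            if (score(&cand[i]) > score(&cand[best]))
                best = i;
        printf("swap in pid %d first\n", cand[best].pid);
        return 0;
    }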

Many UNIX systems still use the swapping scheme just described. All Berkeley UNIX systems, on the other hand, depend primarily on paging for memory-contention management, and depend only secondarily on swapping. A scheme similar in outline to the traditional one is used to determine which processes get swapped in or out, but the details differ and the influence of swapping is less.

Demand Paging

Berkeley introduced demand paging to UNIX with BSD (Berkeley Software Distribution), which transferred memory pages instead of processes to and from a secondary device; recent releases of the UNIX system also support demand paging. Demand paging is done in a straightforward manner. When a process needs a page and the page is not there, a page fault to the kernel occurs, a frame of main memory is allocated, and then the page is loaded into the frame by the kernel.
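A simplified sketch of that page-fault path is given below; the page table, the frame allocation and the "disk read" are toy stand-ins rather than real kernel code:

    #include <stdio.h>
    #include <stdbool.h>

    #define NPAGES  4
    #define NFRAMES 2

    struct pte { bool valid; int frame; };
    static struct pte page_table[NPAGES];
    static int next_free_frame = 0;

    /* Handle a reference to a non-resident page. */
    static void page_fault(int page)
    {
        int frame = next_free_frame++ % NFRAMES;   /* naive frame allocation */
        /* ... read the page from the swap device into 'frame' ... */
        page_table[page].frame = frame;
        page_table[page].valid = true;
        printf("page %d loaded into frame %d\n", page, frame);
    }

    static void touch(int page)
    {
        if (!page_table[page].valid)   /* page not resident: fault to the kernel */
            page_fault(page);
    }

    int main(void)
    {
        touch(0);   /* faults, loads page 0 */
        touch(0);   /* already resident, no fault */
        touch(3);   /* faults, loads page 3 */
        return 0;
    }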

The advantage of a demand paging policy is that it permits greater flexibility in mapping the virtual address space of a process into the physical memory of a machine, usually allowing the size of a process to be greater than the amount of available physical memory and allowing more processes to fit into main memory. The advantage of a swapping policy is that it is easier to implement and results in less system overhead.

Blocks and Fragments

Most of the file system is taken up by data blocks, which contain whatever the users
have put in their files. Let us consider how these data blocks are stored on the disk.

The hardware disk sector is usually 512 bytes. A block size larger than 512 bytes is desirable for speed. However, because UNIX file systems usually contain a very large number of small files, much larger blocks would cause excessive internal fragmentation. That is why the earlier 4.1BSD file system was limited to a 1024-byte (1K) block.

The 4.2BSD solution is to use two block sizes for files which have no indirect blocks: all the blocks of a file are of a large block size (such as 8K), except the last. The last block is an appropriate multiple of a smaller fragment size (for example, 1024 bytes) to fill out the file. Thus, a file of size 18,000 bytes would have two 8K blocks and one 2K fragment (which would not be filled completely).
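The arithmetic behind such a layout can be checked with a short program; the sizes below are the ones used in the example above:

    #include <stdio.h>

    /* Full-size blocks plus one final fragment rounded up to a multiple of
     * the fragment size, as in the 4.2BSD scheme sketched above. */
    int main(void)
    {
        long size = 18000, block = 8192, frag = 1024;

        long full_blocks = size / block;                  /* 2 */
        long tail        = size - full_blocks * block;    /* 1616 bytes left over */
        long tail_frags  = (tail + frag - 1) / frag;      /* rounded up: 2 fragments */

        printf("%ld full %ldK blocks + one %ldK fragment\n",
               full_blocks, block / 1024, (tail_frags * frag) / 1024);
        return 0;
    }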

The block and fragment sizes are set during file-system creation according to the intended use of the file system: if many small files are expected, the fragment size should be small; if repeated transfers of large files are expected, the basic block size should be large. Implementation details force a maximum block-to-fragment ratio of 8:1, and a minimum block size of 4K, so typical choices are 4096:512 for the former case and 8192:1024 for the latter.

Suppose data are written to a file in transfer sizes of 1K bytes, and the block and fragment sizes of the file system are 4K and 512 bytes. The file system will allocate a 1K fragment to contain the data from the first transfer. The next transfer will cause a new 2K fragment to be allocated. The data from the original fragment must be copied into this new fragment, followed by the second 1K transfer. The allocation routines do attempt to find the required space on the disk immediately following the existing fragment so that no copying is necessary, but, if they cannot do so, up to seven copies may be required before the fragment becomes a block. Provisions have been made for programs to discover the block size for a file so that transfers of that size can be made, to avoid fragment recopying.

Inodes

Associated with each file in UNIX is a little table (on disk) called an i-node. An inode is a record that describes the attributes of a file, including the layout of its data on disk. Inodes exist in a static form on disk, and the kernel reads them into main memory and manipulates them. Disk inodes consist of the following fields:

• File owner identifier - File ownership is divided between an individual owner
and a group owner and defines the set of users who have access rights to a file.
The superuser has access rights to all files in the system.
• File type - Files may be of type regular, directory, character or block special, or
pipe.
• File access permissions - The system protects files according to three classes:
the owner of the file, the group owner of the file, and other users; each class has
access rights to read, write and execute the file, which can be set individually.
Although a directory is a file, it cannot be executed; execute permission for a
directory gives the right to search the directory for a file name.
• File access times - Giving the times the file was last modified and last
accessed.

In addition, the inode contains 15 pointers to the disk blocks containing the data contents of the file. The first 12 of these pointers (as shown in figure 3) point to direct blocks; that is, they contain addresses of blocks that contain data of the file. Thus, the data for small files (no more than 12 blocks) can be referenced immediately, because a copy of the inode is kept in main memory while a file is open. If the block size is 4K, then up to 48K of data may be accessed directly from the inode.

Figure 3 : Direct and indirect blocks of an inode

The next three pointers in the inode point to indirect blocks. If the file is large enough to
use indirect blocks, the indirect blocks are each of the major block size; the fragment
size applies to only data blocks. The first indirect block pointer is the address of a single
indirect block. The single indirect block is an index block, containing not data, but rather
the addresses of blocks that do contain data. Then, there is a double-indirect-block
pointer, the address of a block that contains the addresses of blocks that contain
pointers to the actual data blocks.
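A possible C rendering of such an inode record is sketched below; the field names and integer widths are assumptions for illustration, not the exact on-disk format of any particular UNIX:

    #include <stdio.h>
    #include <stdint.h>
    #include <time.h>

    #define NDIRECT 12

    /* Illustrative layout of a disk inode with 12 direct and 3 indirect
     * block pointers, as described above. */
    struct disk_inode {
        uint16_t mode;              /* file type and access permissions */
        uint16_t uid, gid;          /* individual owner and group owner */
        uint32_t size;              /* file size in bytes */
        time_t   atime, mtime;      /* last access and last modification times */
        uint32_t direct[NDIRECT];   /* addresses of the first 12 data blocks */
        uint32_t single_indirect;   /* block holding data-block addresses */
        uint32_t double_indirect;   /* block holding single-indirect-block addresses */
        uint32_t triple_indirect;   /* present but unused for files under 2^32 bytes */
    };

    int main(void)
    {
        printf("inode record is %zu bytes on this machine\n", sizeof(struct disk_inode));
        return 0;
    }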

The last pointer would contain the address of a triple indirect block; however, there is no need for it. The minimum block size for a file system in 4.2BSD is 4K, so files with as many as 2^32 bytes will use only double, not triple, indirection. That is, as each block pointer takes 4 bytes, we have 49,152 (4K x 12) bytes accessible in direct blocks, 4,194,304 bytes accessible by single indirection, and 4,294,967,296 bytes reachable through double indirection, for a total of 4,299,210,752 bytes, which is larger than 2^32 bytes.
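The same arithmetic, reproduced as a short program (assuming a 4K block and 4-byte block pointers, so 1024 pointers per indirect block):

    #include <stdio.h>

    int main(void)
    {
        long long block = 4096, ptrs = block / 4;      /* 1024 pointers per indirect block */

        long long direct = 12 * block;                 /* 49,152 bytes */
        long long single = ptrs * block;               /* 4,194,304 bytes */
        long long dbl    = ptrs * ptrs * block;        /* 4,294,967,296 bytes */

        printf("direct: %lld\nsingle indirect: %lld\ndouble indirect: %lld\ntotal: %lld\n",
               direct, single, dbl, direct + single + dbl);
        return 0;
    }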

The number 2^32 is significant because the file offset in the file structure in main memory is kept in a 32-bit word. Files therefore cannot be larger than 2^32 bytes. Since file pointers are signed integers (for seeking backward and forward in a file), the actual maximum file size is 2^31 - 1 bytes. Two gigabytes is large enough for most purposes.

Directory Structure

Before a file can be read, it must be opened. When a file is opened, the operating
system uses the path name supplied by the user to locate the disk blocks, so that it can
read and write the file later. Mapping path names onto i-nodes (or the equivalent) brings
us to the subject of how directory systems are organized. These vary from quite simple
to reasonably sophisticated.

Now let us consider some examples of systems with hierarchical directory trees. Figure 4 shows an MS-DOS directory entry. It is 32 bytes long and contains the file name and the first block number, among other items. The first block number can be used as an index into the FAT to find the second block number, and so on. In this way all the blocks of a given file can be found. Except for the root directory, which is of fixed size (112 entries for a 360K disk), MS-DOS directories are files and may contain an arbitrary number of entries.

Figure 4 : The MS-DOS directory entry
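A toy walk of such a chain is sketched below; the FAT contents and the end-of-file marker value are invented for the example:

    #include <stdio.h>

    #define EOF_MARK -1

    int main(void)
    {
        /* fat[i] holds the number of the block that follows block i */
        int fat[] = { 2, EOF_MARK, 5, EOF_MARK, EOF_MARK, 1 };
        int first_block = 0;                      /* taken from the directory entry */

        for (int b = first_block; b != EOF_MARK; b = fat[b])
            printf("block %d\n", b);              /* visits 0 -> 2 -> 5 -> 1 */
        return 0;
    }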

The directory structure used in UNIX is extremely simple, as shown in figure 5. Each entry contains just a file name and its i-node number. All the information about the type, size, times, ownership, and disk blocks is contained in the i-node (see figure 3). All directories in UNIX are files, and may contain arbitrarily many of these entries.

Figure 5 : A Unix directory entry
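A hedged sketch of such an entry and of a linear directory search follows; the 14-byte name field and the sample i-node numbers are illustrative assumptions:

    #include <stdio.h>
    #include <string.h>

    /* Classic-style directory entry: just an i-node number and a short name. */
    struct dir_entry {
        unsigned short inode;     /* i-node number; 0 means the slot is unused */
        char           name[14];
    };

    int main(void)
    {
        struct dir_entry dir[] = {
            { 26, "." }, { 6, ".." }, { 64, "mbox" }, { 92, "src" }
        };

        const char *wanted = "mbox";
        for (size_t i = 0; i < sizeof dir / sizeof dir[0]; i++)
            if (dir[i].inode != 0 && strcmp(dir[i].name, wanted) == 0)
                printf("%s -> i-node %u\n", wanted, dir[i].inode);
        return 0;
    }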

When a file is opened, the file system must take the file name supplied and locate its disk blocks. Let us consider how the path name /usr/ast/mbox is looked up. We will use UNIX as an example, but the algorithm is basically the same for all hierarchical directory systems. First the file system locates the root directory. In UNIX its i-node is located at a fixed place on the disk.

Then it looks up the first component of the path, usr, in the root directory to find its i-node. From this i-node, the system locates the directory for /usr and looks up the next component, ast, in it. When it has found the entry for ast, it has the i-node for the directory /usr/ast. From this i-node it can find the directory itself and look up mbox. The i-node for this file is then read into memory and kept there until the file is closed. The lookup process is illustrated in figure 6.

Figure 6 : The steps in looking up /usr/ast/mbox
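The loop below is a skeleton of that lookup, with a stub in place of actually reading directory blocks; the root i-node number and the returned values are assumptions for the sketch:

    #include <stdio.h>
    #include <string.h>

    /* Stub standing in for reading a directory's data blocks and searching
     * its entries for a name. */
    static int lookup_in_dir(int dir_inode, const char *name)
    {
        printf("searching directory i-node %d for \"%s\"\n", dir_inode, name);
        return dir_inode + 1;   /* fabricated i-node number for the sketch */
    }

    static int namei(const char *pathname)
    {
        char path[256];
        strncpy(path, pathname, sizeof path - 1);
        path[sizeof path - 1] = '\0';

        int inode = 2;          /* assumed i-node of the root directory (kept at a fixed place) */
        for (char *comp = strtok(path, "/"); comp; comp = strtok(NULL, "/"))
            inode = lookup_in_dir(inode, comp);   /* descend one level per component */
        return inode;
    }

    int main(void)
    {
        printf("final i-node: %d\n", namei("/usr/ast/mbox"));
        return 0;
    }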

Relative path names are looked up the same way as absolute ones, only starting from the working directory instead of starting from the root directory. Every directory has entries for . and .., which are put there when the directory is created. The . entry holds the i-node number of the directory itself, and the .. entry holds the i-node number of the parent directory, so a lookup of .. simply finds the parent directory and searches it for the next path component. No special mechanism is needed to handle these names. As far as the directory system is concerned, they are just ordinary ASCII strings.

User to User Communication

INTRODUCTION

UNIX provides several commands for easy communication between users. Each of these commands has its own advantages and is appropriate in different situations. You should learn to use these commands and actually put them into practice.


Unlike the other commands which you can learn by solitary effort, the communication
features are best mastered by working with a partner to whom you can send practice
messages. Of course this is not an essential requirement, and you could, if necessary,
learn even the communication commands all by yourself.

OBJECTIVES

This unit will take you further down the road to exploring UNIX. By now you have had a
feel of what UNIX is like. In this unit you will learn about some more of the strong points
of UNIX. By the end of this unit you should be able to:

• Communicate on-line with other users on your machine using

1. write
2. wall

• Communicate off-line with other users with the help of mail and news

