
File-System

FILE CONCEPT
The file system provides the mechanism for online storage of and access to both data and programs of the OS and all the users of the computer system.

The file system consists of two distinct parts:
a collection of files, each storing related data
a directory structure, which organizes and provides information about all the files in the system.

A file is a named collection of related information that is recorded on secondary storage.
A file is a sequence of bits, bytes, lines or records.
Files represent programs and data.
Data files may be numeric, alphabetic, alphanumeric or binary.
Files may be free form, such as text files, or may be rigidly formatted.
Many different types of information may be stored in a file: source programs, object programs, executable programs, numeric data, text, etc.
A file has a certain defined structure, which depends on its type.
Text file: a sequence of characters organized into lines.
Source file: a sequence of subroutines and functions, each of which is further organized as declarations followed by executable statements.
Object file: a sequence of bytes organized into blocks understandable by the system's linker.
Executable file: a series of code sections that the loader can bring into memory and execute.

File Attributes
A file is referred to by its name. A name is usually a string of characters.
A file's attributes vary from one OS to another but typically consist of these:
Name: the symbolic file name is the only information kept in human-readable form.
Identifier: a number that identifies the file within the file system; it is the non-human-readable name for the file.
Type: this information is needed for systems that support different types of files.

Location: this information is a pointer to a device and to the location of the file on that device.
Size: the current size of the file.
Protection: access-control information determines who can do reading, writing, executing, etc.
Time, date and user identification: this information may be kept for creation, last modification and last use.

The information about all files is kept in the directory structure, which resides on secondary storage.
A directory entry consists of the file's name and its unique identifier.
The identifier in turn locates the other file attributes.
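
The attributes listed above can be inspected on a real system. The following is a minimal sketch, assuming a POSIX-like system where the identifier corresponds to the inode number; the helper name `describe` is hypothetical and not part of the slides.

```python
import os
import stat
import tempfile
import time

def describe(path):
    """Collect the attributes the slides list, using os.stat()."""
    info = os.stat(path)
    if stat.S_ISDIR(info.st_mode):
        file_type = "directory"
    elif stat.S_ISREG(info.st_mode):
        file_type = "regular file"
    else:
        file_type = "other"
    return {
        "name": os.path.basename(path),                 # human-readable name
        "identifier": info.st_ino,                      # inode number (assumed identifier)
        "type": file_type,
        "size": info.st_size,                           # current size in bytes
        "protection": oct(stat.S_IMODE(info.st_mode)),  # permission bits
        "owner_uid": info.st_uid,                       # user identification
        "last_modified": time.ctime(info.st_mtime),
        "last_accessed": time.ctime(info.st_atime),
    }

# Usage: create a small temporary file and print its attributes.
with tempfile.TemporaryDirectory() as directory:
    sample = os.path.join(directory, "sample.txt")
    with open(sample, "w") as f:
        f.write("hello")
    print(describe(sample))
```

Note that the location attribute (the pointer to the blocks on the device) is not exposed through this interface; it stays inside the file system.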

File Operations
A file is an abstract data type.
The OS can provide system calls to create, write, read, reposition, delete and truncate files.
Creating a file: First, space in the file system must be found for the file. Second, an entry for the new file must be made in the directory.

Writing a file: To write a file, specify both the name of the file and the information to be written to the file. The system must keep a write pointer to the location in the file where the next write is to take place.
Reading a file: To read from a file, the directory is searched for the associated entry, and the system needs to keep a read pointer to the location in the file where the next read is to take place.

Because a process is usually either reading from or writing to a file, the current operation location can be kept as a per-process current-file-position pointer.
Repositioning within a file: The directory is searched for the appropriate entry and the current-file-position pointer is repositioned to a given value. This operation is also known as a file seek.
Deleting a file: To delete a file, search the directory for the named file. When found, release all file space and erase the directory entry.

Truncating a file: The user may want to erase the contents of a file but keep its attributes. This function allows all attributes to remain unchanged except for the file length.
The OS keeps a small table, called the open-file table, containing information about all open files.
The open() call can also accept access-mode information: create, read-only, read-write, append-only, etc. This mode is checked against the file's permissions.
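
The basic operations described above can be tried through the thin system-call wrappers in Python's `os` module. This is a minimal sketch, assuming a POSIX-like system; the file name `demo.txt` and the function name are hypothetical.

```python
import os
import tempfile

def demo_file_operations(directory):
    """Walk through create, write, reposition, read, truncate and delete."""
    path = os.path.join(directory, "demo.txt")  # hypothetical file name

    # Create and open: O_CREAT adds the directory entry, O_RDWR is the access mode.
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        os.write(fd, b"hello, file system")         # the write advances the position
        after_write = os.lseek(fd, 0, os.SEEK_CUR)  # query the current file position

        os.lseek(fd, 7, os.SEEK_SET)                # reposition (file seek)
        data = os.read(fd, 4)                       # the read starts at position 7

        os.ftruncate(fd, 5)                         # truncate: only the length changes
        size_after_truncate = os.fstat(fd).st_size
    finally:
        os.close(fd)

    os.unlink(path)                                 # delete: erase the directory entry
    return after_write, data, size_after_truncate, os.path.exists(path)

with tempfile.TemporaryDirectory() as directory:
    print(demo_file_operations(directory))  # (18, b'file', 5, False)
```

A single position is shared by reads and writes on the same descriptor, which matches the per-process current-file-position pointer described above.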

The OS uses two levels of internal tables: a per-process table and a system-wide table.
The per-process table tracks all files that a process has open. Stored in this table is information regarding the use of the file by the process.

Each entry in the per-process table points to an entry in the system-wide open-file table.
The system-wide table contains process-independent information.
Once a file has been opened by one process, the system-wide table includes an entry for the file.
The open-file table also has an open count associated with each file to indicate how many processes have the file open.

File pointer: The system must keep track of the last read/write location as a current-file-position pointer.
File-open count: As files are closed, the OS must reuse its open-file table entries or it could run out of space in the table. The file-open counter tracks the number of opens and closes and reaches zero on the last close.

Disk location of the file: The information needed to locate the file on disk is kept in memory so that the system does not have to read it from disk for each operation.
Access rights: Each process opens a file in an access mode. This information is stored in the per-process table so the OS can allow or deny subsequent I/O requests.
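
The two-level bookkeeping can be modelled in a few lines. The sketch below is a toy simulation, not how any particular kernel stores its tables; the class and field names are hypothetical, and it assumes each process opens a given file at most once.

```python
class OpenFileTables:
    """Toy model of the per-process and system-wide open-file tables."""

    def __init__(self):
        # System-wide table: process-independent data, one entry per open file.
        self.system_wide = {}
        # Per-process tables: pid -> {file name -> position and access mode}.
        self.per_process = {}

    def open(self, pid, name, mode="r"):
        entry = self.system_wide.setdefault(name, {"open_count": 0})
        entry["open_count"] += 1
        self.per_process.setdefault(pid, {})[name] = {"position": 0, "mode": mode}

    def write(self, pid, name, nbytes):
        handle = self.per_process[pid][name]
        # The access mode recorded at open time decides whether I/O is allowed.
        if "w" not in handle["mode"]:
            raise PermissionError(f"process {pid} opened {name} read-only")
        handle["position"] += nbytes  # per-process file pointer
        return handle["position"]

    def close(self, pid, name):
        del self.per_process[pid][name]
        entry = self.system_wide[name]
        entry["open_count"] -= 1
        if entry["open_count"] == 0:
            # Last close: the system-wide entry can be reused.
            del self.system_wide[name]

tables = OpenFileTables()
tables.open(1, "notes.txt", "w")
tables.open(2, "notes.txt", "r")
print(tables.system_wide["notes.txt"]["open_count"])  # 2
```

Keeping the file pointer and access mode per process, while the open count lives in the shared entry, is what lets two processes use the same file at different positions.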

File locks allow one process to lock a file and prevent other processes from gaining access to it.
File locks are useful for files that are shared by several processes.
A shared lock can be acquired by several processes concurrently.
An exclusive lock can be held by only one process at a time.
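
Shared and exclusive locks can be observed with `fcntl.flock`, which is available on Unix-like systems only and provides advisory locks. In this sketch two independent opens of the same file stand in for two processes; the function names are hypothetical.

```python
import fcntl
import os
import tempfile

def try_lock(fd, kind):
    """Try to take a lock without blocking; return True on success."""
    try:
        fcntl.flock(fd, kind | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False

def demo_locks(path):
    # flock() locks belong to the open file description, so two separate
    # opens compete with each other just as two processes would.
    a = os.open(path, os.O_RDWR)
    b = os.open(path, os.O_RDWR)
    try:
        results = {}
        results["a_shared"] = try_lock(a, fcntl.LOCK_SH)
        results["b_shared"] = try_lock(b, fcntl.LOCK_SH)     # shared locks coexist
        results["b_exclusive"] = try_lock(b, fcntl.LOCK_EX)  # refused while a holds a lock
        fcntl.flock(a, fcntl.LOCK_UN)                        # a releases its lock
        results["b_exclusive_after_unlock"] = try_lock(b, fcntl.LOCK_EX)
        return results
    finally:
        os.close(a)
        os.close(b)

with tempfile.NamedTemporaryFile() as handle:
    print(demo_locks(handle.name))
```

Because these locks are advisory, a process that never asks for the lock can still read or write the file.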

File types
A common technique for implementing file types is to include the type as part of the file name.
The name is split into two parts, a name and an extension, separated by a period character.
The system uses the extension to indicate the type of the file and the type of operations that can be performed on that file.
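
Splitting a name at the last period is what `os.path.splitext` does. The sketch below pairs it with a small extension table; the table contents are hypothetical and chosen only for illustration.

```python
import os

# Hypothetical extension table, for illustration only.
OPERATIONS_BY_EXTENSION = {
    ".txt": ("text", ["view", "edit", "print"]),
    ".c": ("source code", ["edit", "compile"]),
    ".o": ("object", ["link"]),
    ".exe": ("executable", ["run"]),
}

def classify(filename):
    """Return (name, extension, type, allowed operations) for a file name."""
    name, extension = os.path.splitext(filename)
    file_type, operations = OPERATIONS_BY_EXTENSION.get(
        extension.lower(), ("unknown", [])
    )
    return name, extension, file_type, operations

print(classify("report.txt"))  # ('report', '.txt', 'text', ['view', 'edit', 'print'])
print(classify("Makefile"))    # ('Makefile', '', 'unknown', [])
```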

File Structure
File types also can be used to indicate the internal structure of the file.
Some OSs impose a minimal number of file structures.
Mac OS also supports a minimal number of file structures.
It expects files to contain two parts: a resource fork and a data fork.
The resource fork contains information of interest to the user.
The data fork contains program code or data (the traditional file contents).

Internal file structure


Files store information.
When it is used, this information must be accessed
and read into computer memory.

Acyclic graph directories


A tree structure prohibits the sharing of files and directories.
An acyclic graph (i.e., a graph with no cycles) allows directories to share subdirectories and files.
The same file or subdirectory may be in two different directories.

With a shared file, only one actual file exists, so any changes made by one person are immediately visible to the other.
Sharing is particularly important for subdirectories; a new file created by one person will automatically appear in all the shared subdirectories.

A common way to implement shared files and subdirectories is to create a new directory entry called a link.
A link is effectively a pointer to another file or subdirectory. For example, a link may be implemented as an absolute or a relative path name.
When a reference to a file is made, we search the directory.
If the directory entry is marked as a link, then the name of the real file is included in the link information.
We resolve the link by using that path name to locate the real file.
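
A link of this kind corresponds closely to a symbolic link on Unix-like systems. The following sketch assumes such a system (creating symbolic links may need extra privileges elsewhere); the file names are hypothetical.

```python
import os
import tempfile

def demo_link(directory):
    real = os.path.join(directory, "real.txt")   # hypothetical names
    link = os.path.join(directory, "alias.txt")
    with open(real, "w") as f:
        f.write("shared contents")

    os.symlink(real, link)          # the link stores the path name of the real file

    is_link = os.path.islink(link)  # the directory entry is marked as a link
    target = os.readlink(link)      # the path name kept in the link information
    with open(link) as f:           # opening the link resolves it to the real file
        contents = f.read()
    return is_link, target == real, contents

with tempfile.TemporaryDirectory() as directory:
    print(demo_link(directory))  # (True, True, 'shared contents')
```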

Another common approach for implementing shared files is simply to duplicate all information about them in both sharing directories.
With this approach, both entries are identical and equal. A link, by contrast, is clearly different from the original directory entry, so the two are not equal.
A major problem with duplicate directory entries is maintaining consistency when a file is modified.

An acyclic-graph directory structure is more flexible than a simple tree structure, but it is also more complex.
A file may now have multiple absolute path names. Consequently, distinct file names may refer to the same file.
If we are trying to traverse the entire file system (to find a file or to copy all files to backup storage), this problem becomes significant, since we do not want to traverse shared structures more than once.
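
One way to avoid visiting a shared file twice during a traversal is to remember which underlying files have already been seen. This sketch assumes a POSIX-like system, where the pair (device, inode number) identifies a file regardless of its name; the function name is hypothetical.

```python
import os
import tempfile

def unique_files(root):
    """Return one path per underlying file, skipping additional names for it."""
    seen = set()
    unique = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()                      # visit subdirectories in a fixed order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            key = (info.st_dev, info.st_ino)  # identity of the file, not of the name
            if key not in seen:
                seen.add(key)
                unique.append(path)
    return unique

with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "a"))
    os.mkdir(os.path.join(root, "b"))
    original = os.path.join(root, "a", "report")
    with open(original, "w") as f:
        f.write("data")
    os.link(original, os.path.join(root, "b", "report_alias"))  # second name, same file
    print(len(unique_files(root)))  # 1
```

A backup tool built on this idea would copy the file once and record the extra names separately.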

Another problem involves deletion. When can the space allocated to a shared file be deallocated and reused?
One possibility is to remove the file whenever anyone deletes it, but this action may leave dangling pointers to the now-nonexistent file.
Worse, if the remaining file pointers contain actual disk addresses, and the space is subsequently reused for other files, these dangling pointers may point into the middle of other files.

File Concept
Contiguous logical address space
Types:
Data: numeric, character, binary
Program

File Structure
None: sequence of words, bytes
Simple record structure: lines, fixed length, variable length
Complex structures: formatted document, relocatable load file
Can simulate the last two with the first method by inserting appropriate control characters
Who decides: the operating system or the program

File Attributes
Name: only information kept in human-readable form
Identifier: unique tag (number) that identifies the file within the file system
Type: needed for systems that support different types
Location: pointer to file location on device
Size: current file size
Protection: controls who can do reading, writing, executing
Time, date, and user identification: data for protection, security, and usage monitoring
Information about files is kept in the directory structure, which is maintained on the disk

File Operations
File is an abstract data type
Create
Write
Read
Reposition within file
Delete
Truncate
Open(Fi): search the directory structure on disk for entry Fi, and move the content of the entry to memory
Close(Fi): move the content of entry Fi in memory to the directory structure on disk

Open Files
Several pieces of data are needed to manage open files:
File pointer: pointer to the last read/write location, per process that has the file open
File-open count: counter of the number of times a file is open, to allow removal of data from the open-file table when the last process closes it
Disk location of the file: cache of data access information
Access rights: per-process access mode information

Open File Locking


Provided by some operating systems and file systems
Mediates access to a file
Mandatory or advisory:
Mandatory: access is denied depending on locks held and requested
Advisory: processes can find the status of locks and decide what to do

File Types: Name, Extension

Access Methods

Sequential Access
read next
write next
reset
no read after last write (rewrite)

Direct Access (n = relative block number)
read n
write n
position to n
read next
write next
rewrite n

Sequential-access File

Simulation of Sequential Access on Direct-access File
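
The figure for this slide is not preserved in the text, so the following is a sketch of the usual approach rather than a copy of it: keep a current position `cp`, and turn "read next" into "read cp" followed by "cp = cp + 1". The class names are hypothetical, and the direct-access file is modelled as an in-memory list of blocks.

```python
class DirectAccessFile:
    """File of fixed-size blocks addressed by relative block number n."""

    def __init__(self, blocks):
        self.blocks = list(blocks)

    def read(self, n):
        return self.blocks[n]

    def write(self, n, data):
        self.blocks[n] = data

class SequentialView:
    """Sequential operations built only from read n / write n."""

    def __init__(self, direct_file):
        self.file = direct_file
        self.cp = 0                  # current position (relative block number)

    def reset(self):
        self.cp = 0

    def read_next(self):
        data = self.file.read(self.cp)   # read cp
        self.cp += 1                     # cp = cp + 1
        return data

    def write_next(self, data):
        self.file.write(self.cp, data)   # write cp
        self.cp += 1                     # cp = cp + 1

view = SequentialView(DirectAccessFile(["b0", "b1", "b2"]))
print(view.read_next(), view.read_next())  # b0 b1
view.write_next("new")                     # overwrites block 2
view.reset()
print(view.read_next())                    # b0
```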

Example of Index and Relative Files

Directory Structure
A collection of nodes containing information about all files

[Figure: a directory whose entries point to the files F1, F2, F3, F4, ..., Fn]

Both the directory structure and the files reside on disk


Backups of these two structures are kept on tapes

Disk Structure
A disk can be subdivided into partitions
Disks or partitions can be RAID protected against failure
A disk or partition can be used raw, without a file system, or formatted with a file system
Partitions are also known as minidisks or slices
An entity containing a file system is known as a volume
Each volume containing a file system also tracks that file system's info in a device directory or volume table of contents
As well as general-purpose file systems, there are many special-purpose file systems, frequently all within the same operating system or computer

A Typical File-system Organization

Operations Performed on Directory


Search for a file
Create a file
Delete a file
List a directory
Rename a file
Traverse the file system

Organize the Directory (Logically) to Obtain
Efficiency: locating a file quickly
Naming: convenient to users
Two users can have the same name for different files
The same file can have several different names
Grouping: logical grouping of files by properties (e.g., all Java programs, all games, ...)

Single-Level Directory
A single directory for all users

Naming problem
Grouping problem

Two-Level Directory
Separate directory for each user

Path name
Can have the same file name for different users
Efficient searching
No grouping capability

Tree-Structured Directories

Tree-Structured Directories (Cont.)


Efficient searching
Grouping Capability
Current directory (working directory)

cd /spell/mail/prog

type list

Tree-Structured Directories (Cont)


Absolute or relative path name
Creating a new file is done in current directory
Delete a file

rm <file-name>
Creating a new subdirectory is done in current directory

mkdir <dir-name>
Example: if in current directory /mail
mkdir count
[Figure: directory mail containing prog, copy, prt, exp and the new subdirectory count]
Deleting "mail" means deleting the entire subtree rooted by "mail"

Acyclic-Graph Directories
Have shared subdirectories and files

Acyclic-Graph Directories (Cont.)


Two different names (aliasing)
If dict deletes list, the result is a dangling pointer
Solutions:
Backpointers, so we can delete all pointers (variable-size records are a problem)
Backpointers using a daisy-chain organization
Entry-hold-count solution
New directory entry type:
Link: another name (pointer) to an existing file
Resolve the link: follow the pointer to locate the file
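
The entry-hold-count idea resembles the link count kept for hard links on Unix-like file systems: the file's space is released only when the count of directory entries reaches zero. This is a minimal sketch assuming such a system; the file names are hypothetical.

```python
import os
import tempfile

def demo_hold_count(directory):
    first = os.path.join(directory, "list")         # hypothetical names
    second = os.path.join(directory, "list_alias")
    with open(first, "w") as f:
        f.write("data")

    os.link(first, second)                     # second directory entry, same file
    count_with_two = os.stat(first).st_nlink   # the count is now 2

    os.unlink(first)                           # delete one name; the count drops
    count_with_one = os.stat(second).st_nlink
    with open(second) as f:                    # the contents are still reachable
        survived = f.read()
    return count_with_two, count_with_one, survived

with tempfile.TemporaryDirectory() as directory:
    print(demo_hold_count(directory))  # (2, 1, 'data')
```

Because no entry ever points at released space, this scheme avoids the dangling pointer described above.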

General Graph Directory

General Graph Directory (Cont.)


How do we guarantee no cycles?
Allow only links to files, not subdirectories
Garbage collection
Every time a new link is added, use a cycle-detection algorithm to determine whether it is OK
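
The last option can be sketched as a reachability check: a new entry from `parent` to `child` closes a cycle exactly when `parent` is already reachable from `child`. The dictionary-of-sets graph representation and the function names below are assumptions made for illustration.

```python
def reachable(graph, start, goal):
    """Depth-first search: can goal be reached from start by following entries?"""
    stack, visited = [start], set()
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node in visited:
            continue
        visited.add(node)
        stack.extend(graph.get(node, ()))
    return False

def add_link(graph, parent, child):
    """Add the entry parent -> child only if the graph stays acyclic."""
    if reachable(graph, child, parent):
        return False                          # the link would create a cycle
    graph.setdefault(parent, set()).add(child)
    return True

directories = {"root": {"a"}, "a": {"b"}}
print(add_link(directories, "b", "root"))  # False: root -> a -> b -> root
print(add_link(directories, "root", "b"))  # True: sharing b is still acyclic
```

The check costs a traversal per new link, which is one reason the simpler rule of allowing links only to files is attractive.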

File System Mounting


A file system must be mounted before it can be accessed
An unmounted file system (i.e., Fig. 11-11(b)) is mounted at a mount point
[Figure: (a) Existing (b) Unmounted Partition]

Mount Point

File Sharing
File sharing is desirable for users who want to
collaborate and to reduce the effort required to
achieve a computing goal.
Multiple users: When an OS accommodates
multiple users, the issues of file sharing, file
naming and file protection become preeminent.
System mediates file sharing.
The system can either allow a user to access the
files of other users by default or require that a user
specifically grant access to the files.

File Sharing
Sharing of files on multi-user systems is desirable
Sharing may be done through a protection scheme
On distributed systems, files may be shared across a network
Network File System (NFS) is a common distributed file-sharing method

File Sharing: Multiple Users
User IDs identify users, allowing permissions and protections to be per-user
Group IDs allow users to be in groups, permitting group access rights

File Sharing: Remote File Systems
Uses networking to allow file system access between systems
Manually via programs like FTP
Automatically, seamlessly using distributed file systems
Semi-automatically via the World Wide Web
Client-server model allows clients to mount remote file systems from servers
A server can serve multiple clients
Client and user-on-client identification is insecure or complicated
NFS is the standard UNIX client-server file sharing protocol
CIFS is the standard Windows protocol
Standard operating system file calls are translated into remote calls
Distributed Information Systems (distributed naming services) such as LDAP, DNS, NIS, Active Directory implement unified access to information needed for remote computing

File Sharing: Failure Modes
Remote file systems add new failure modes, due to network failure or server failure
Recovery from failure can involve state information about the status of each remote request
Stateless protocols such as NFS include all information in each request, allowing easy recovery but less security

Protection
When information is stored in a computer system, it should be kept safe from physical damage (reliability) and improper access (protection).
Reliability is provided by duplicate copies of files.
Protection can be provided in many ways, such as physically removing the floppy disks and locking them up.

Types of Access
Complete protection of files can be provided by prohibiting access; systems that do not permit access to the files of other users do not need protection.
At the other extreme, free access can be provided with no protection.
Both these approaches are extreme.
Hence controlled access is required.
Protection mechanisms provide controlled access by limiting the types of file access that can be made.
Several different types of operations may be controlled:
i. Read
ii. Write
iii. Execute
iv. Append
v. Delete
vi. List
Other operations, such as renaming and copying, may also be controlled.

Access Control
The most common approach to the protection problem is to make access dependent on the identity of the user.
The most general scheme to implement identity-dependent access is to associate with each file and directory an access-control list (ACL) specifying user names and the types of access allowed for each user.
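
An access-control list can be pictured as a per-file mapping from user names to the set of access types each user is allowed. The sketch below is a toy model; the user names, file names and function name are hypothetical.

```python
# Hypothetical ACLs: file name -> {user name -> allowed access types}.
ACCESS_CONTROL_LISTS = {
    "game": {
        "alice": {"read", "write", "execute"},
        "bob": {"read"},
    },
}

def is_allowed(acls, filename, user, access):
    """Grant a request only if the file's ACL lists this access type for the user."""
    return access in acls.get(filename, {}).get(user, set())

print(is_allowed(ACCESS_CONTROL_LISTS, "game", "alice", "write"))  # True
print(is_allowed(ACCESS_CONTROL_LISTS, "game", "bob", "write"))    # False
print(is_allowed(ACCESS_CONTROL_LISTS, "game", "carol", "read"))   # False
```

Every user who needs access must appear in the list of every file concerned, which is why such lists grow long.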

The main problem with access lists is their length.
To condense the access list, three classifications of users are recognized in connection with each file:
a) Owner: the user who created the file
b) Group: a set of users who are sharing the file and need similar access
c) Universe: all other users in the system

Access Lists and Groups


Mode of access: read, write, execute
Three classes of users:
                    RWX
a) owner access   7 = 111
b) group access   6 = 110
c) public access  1 = 001
Ask the manager to create a group (unique name), say G, and add some users to the group.
For a particular file (say game) or subdirectory, define an appropriate access:
chmod 761 game    (7 = owner, 6 = group, 1 = public)
Attach a group to a file:
chgrp G game

With this more limited protection classification, only three fields are needed to define protection.
Each field is a collection of bits, and each bit either allows or prevents the access associated with it.
A separate field is kept for the file owner, for the file's group, and for all other users.
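
The three fields can be decoded with a few bit operations. This sketch takes the mode 761 used in the chmod example and splits it into the owner, group and public fields; the function names are hypothetical.

```python
def decode_mode(mode):
    """Split a 9-bit mode such as 0o761 into owner/group/public permissions."""
    result = {}
    for shift, who in ((6, "owner"), (3, "group"), (0, "public")):
        bits = (mode >> shift) & 0b111        # isolate one three-bit field
        result[who] = "".join(
            letter if bits & mask else "-"    # a set bit allows that access
            for letter, mask in (("r", 4), ("w", 2), ("x", 1))
        )
    return result

def may_access(mode, who, letter):
    """Check one kind of access ('r', 'w' or 'x') for one class of user."""
    return letter in decode_mode(mode)[who]

print(decode_mode(0o761))               # {'owner': 'rwx', 'group': 'rw-', 'public': '--x'}
print(may_access(0o761, "group", "x"))  # False
```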

Other Protection Approaches


Another approach to the protection problem is to associate a password with each file.
If the passwords are chosen randomly and changed often, this scheme may be effective in limiting access to a file.
The use of passwords has certain disadvantages:
The number of passwords that a user needs to remember may become large, making the scheme impractical.
If only one password is used for all the files, then once it is discovered, all files are accessible.

End of Chapter 10
