(File Organization)
Lecture Notes #1
(Disk Organization & Performance)
A guiding scholar should be a sheep, not a bird.
A sheep gives its lamb milk; a bird gives its chick regurgitated food.
Definitions-1
A file structure is a combination of representations for data in files and of operations for accessing the data.
Sequential Search
Simple index search (before 1960)
Tree structures
(BST: 1960s, AVL: 1963, B-tree and B+-tree: 1970s)
Simple Hashing (before 1980)
Dynamic Hashing (after 1980)
Based on
usage characteristics of data
data type (basic vs. multi-dim data)
physical characteristics of machine
Definitions-2
A physical file is a particular collection of bytes stored on disk.
A logical file is the view of a physical file from the standpoint of the application
program.
There may be thousands of physical files on a disk, but a program can have
only a limited number of logical files, for example 20.
The OS makes the connection between a logical file and a physical file.
Example:
int fd = open (filename, flags[,mode]);
Read/Write: the OS first makes the connection; the program then reads and writes
through the logical file descriptor.
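The open/read/write sequence can be sketched in Python, whose `os.open`, `os.write`, `os.read`, and `os.close` wrap the corresponding system calls. The file name `demo.dat` and the helper `write_then_read` are illustrative assumptions, not part of the notes.

```python
import os
import tempfile


def write_then_read(payload: bytes) -> bytes:
    """Write payload through a file descriptor, then read it back."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "demo.dat")  # hypothetical file name
        # The OS connects the logical file (fd) to the physical file (path).
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        try:
            os.write(fd, payload)          # write through the descriptor
            os.lseek(fd, 0, os.SEEK_SET)   # rewind to the start of the file
            return os.read(fd, len(payload))
        finally:
            os.close(fd)                   # release the logical file


if __name__ == "__main__":
    print(write_then_read(b"hello, file"))
```

After `os.open` returns, the program never uses the path again; every later operation names only the descriptor.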
Physical devices as files: the program accesses a file without knowing
whether it comes from a disk, a tape, another computer,
the keyboard (stdin), or the screen (stdout)...
% list.exe > myoutput
% prog1 | prog2
% list | sort
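The same idea can be sketched inside a program: a filter written against generic streams does not care where its input comes from or where its output goes. The function name `sort_lines` is an illustrative assumption; it plays the role of `sort` in the pipeline above.

```python
import io
import sys


def sort_lines(src, dst) -> None:
    """Copy the lines of src to dst in sorted order.

    src and dst may be disk files, pipes, the keyboard, the screen,
    or in-memory streams; the code is the same in every case.
    """
    dst.writelines(sorted(src.readlines()))


if __name__ == "__main__":
    # In-memory streams stand in for a file or a pipe.
    source = io.StringIO("pear\napple\nfig\n")
    sort_lines(source, sys.stdout)
```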
MAIN GOAL
Increase reliability while increasing the speed at the lowest cost
structures and the algorithms are used to predict the efficiency of file
operations.
Now we will study physical characteristics of the hardware...
Storage Hierarchy
[Figure: storage hierarchy pyramid. Arrow labels: cost per unit increases, capacity decreases, reliability increases.]
Levels and annotations shown in the figure:
Machine instructions
Cache: 1M, 15 cycles
Secondary storage (non-volatile media): 80G, 10 msec; about 100,000 times slower (like 10 sec vs. 10 days)
CD-RW, DVD-RW, floppy disk (non-volatile media)
Magnetic tape storage: serial access
DVD (Digital Video Disk): data is stored optically and read by a laser (with a smaller wavelength and a different laser
type than a CD).
Magnetic Tapes:
Cheap storage for archival purposes
Sequential access only
Characteristics of 9-track tape
Tape density: bits per inch (bpi) per track = bytes per inch (Bpi) for the tape (6250 ~ 30,000)
Tape speed: inches per second (ips) (30 ~ 200)
Gap size: 0.3 ~ 0.75 inch
[Figure: layout of a 9-track tape, showing frames, tracks, data blocks, and the gaps between blocks. A gap is the space left for the read/write mechanism to stop and start.]
Tape length
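How much tape is needed to store 1 million 100-byte records on a tape with a density of 6250 Bpi and a gap size of 0.3 inch?
s = n × (b + g), where n = number of blocks, b = length of a data block, g = length of a gap
Blocking factor (bf) = # of records / block
bf = 1: b = 100 / 6250 Bpi = 0.016 in, s = 1 million × (0.3 + 0.016) in = 316,000 in ≈ 26,333 feet
bf = 50: b = (100 × 50) / 6250 Bpi = 0.8 in, s = (1 million / 50) × (0.3 + 0.8) in = 22,000 in ≈ 1833 feet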
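The tape length formula s = n × (b + g) can be checked with a short sketch. The function and parameter names are illustrative assumptions; the numbers are the ones used in the example above.

```python
import math


def tape_length_inches(num_records: int, record_bytes: int,
                       blocking_factor: int, density_bpi: float,
                       gap_inches: float) -> float:
    """Return s = n * (b + g) in inches."""
    n = math.ceil(num_records / blocking_factor)        # number of blocks
    b = blocking_factor * record_bytes / density_bpi    # data block length
    return n * (b + gap_inches)


if __name__ == "__main__":
    for bf in (1, 50):
        inches = tape_length_inches(1_000_000, 100, bf, 6250, 0.3)
        print(f"bf={bf}: {inches:.0f} in = {inches / 12:.0f} feet")
```

A larger blocking factor shortens the tape because fewer gaps are written for the same amount of data.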
Effective recording density (erd)
= (# of bytes per data block) / (length required to store a data block, including its gap)
Effective transfer rate = erd × tape speed (ips)
If we have a 200 ips tape, what are the nominal and effective
transfer rates?
Nominal transfer rate = 6250 Bpi × 200 ips = 1250 KBps
bf = 1: erd = 100 / 0.316 ≈ 316.46 Bpi, effective transfer rate = 316.46 Bpi × 200 ips ≈ 63.3 KBps
bf = 50: erd = 5000 / 1.1 ≈ 4545.45 Bpi, effective transfer rate = 4545.45 Bpi × 200 ips ≈ 909 KBps
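The nominal and effective rates can be reproduced with a small sketch. The function names are illustrative assumptions, and the 0.3 inch gap is the one used in the tape length example.

```python
def effective_recording_density(record_bytes: int, blocking_factor: int,
                                density_bpi: float, gap_inches: float) -> float:
    """Bytes per inch once the gap after each block is counted."""
    block_bytes = record_bytes * blocking_factor
    block_inches = block_bytes / density_bpi
    return block_bytes / (block_inches + gap_inches)


def transfer_rate_bytes_per_sec(density_bpi: float, speed_ips: float) -> float:
    """Bytes passing the head per second at the given density and speed."""
    return density_bpi * speed_ips


if __name__ == "__main__":
    print("nominal:", transfer_rate_bytes_per_sec(6250, 200) / 1000, "KBps")
    for bf in (1, 50):
        erd = effective_recording_density(100, bf, 6250, 0.3)
        rate = transfer_rate_bytes_per_sec(erd, 200) / 1000
        print(f"bf={bf}: erd={erd:.2f} Bpi, effective rate={rate:.1f} KBps")
```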
full rotation,
tt (transfer time) = (# of bytes transferred / # of bytes on a track) × rotation time
Transfer rate: the speed at which bytes pass by the disk head to be transferred
to/from memory.
TOTAL ACCESS TIME = ts + tr + tt (seek time + rotational delay + transfer time)
transfer_rate unit: B/msec (sample value = 100 MB/sec)
Example1
Disk speed: 10,000 rotations per minute (one rotation takes 6 msec)
Bytes / sector = 512
Sectors / track = 170
Tracks / cylinder = 16
Ave. seek time = 8 msec
Ave. rotational delay = 3 msec (half a rotation)
Transfer rate = (512 × 170) bytes / 6 msec ≈ 14,500 bytes / msec
Transfer time for a single sector? = (6 / 170) msec ≈ 0.035 msec
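The access time formula can be applied to the disk of Example1 with a minimal sketch. The function names are illustrative assumptions; the parameters are the ones listed above.

```python
def rotation_time_ms(rpm: float) -> float:
    """Time of one full rotation in milliseconds."""
    return 60_000 / rpm


def transfer_time_ms(num_bytes: int, bytes_per_track: int, rpm: float) -> float:
    """tt = (bytes transferred / bytes on a track) * rotation time."""
    return num_bytes / bytes_per_track * rotation_time_ms(rpm)


def total_access_time_ms(seek_ms: float, rot_delay_ms: float,
                         num_bytes: int, bytes_per_track: int,
                         rpm: float) -> float:
    """Total access time = ts + tr + tt."""
    return seek_ms + rot_delay_ms + transfer_time_ms(num_bytes, bytes_per_track, rpm)


if __name__ == "__main__":
    track_bytes = 512 * 170
    print("rotation time (msec):", rotation_time_ms(10_000))
    print("one sector, tt (msec):", transfer_time_ms(512, track_bytes, 10_000))
    print("one sector, total (msec):",
          total_access_time_ms(8, 3, 512, track_bytes, 10_000))
```

For a single sector the transfer time is negligible next to the seek time and rotational delay, which is why the notes go on to consider larger transfer units.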
[Figure: disk block maps for files a, b, and c stored in consecutive blocks; "-" marks an unallocated block.]
A disk may have many small chunks of unallocated blocks but no large
chunks. It may then be impossible to allocate contiguous space for a large file, even though
the disk has plenty of free space in total. This is called external fragmentation.
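External fragmentation can be illustrated with a toy block map in which each character is one block and "-" marks a free block. The map string and the function names are illustrative assumptions.

```python
def free_runs(block_map: str) -> list[int]:
    """Lengths of the maximal runs of consecutive free ('-') blocks."""
    runs, current = [], 0
    for block in block_map:
        if block == "-":
            current += 1
        else:
            if current:
                runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def can_allocate_contiguously(block_map: str, num_blocks: int) -> bool:
    """True if some free run is long enough to hold the whole file."""
    return any(run >= num_blocks for run in free_runs(block_map))


if __name__ == "__main__":
    disk = "aa--bb-cc--d-"  # hypothetical disk: 6 free blocks in total
    print("free blocks:", disk.count("-"))
    print("free runs:", free_runs(disk))
    print("room for a 4-block file:", can_allocate_contiguously(disk, 4))
```

The sample disk has six free blocks, yet a four-block file cannot be placed contiguously because the longest free run has only two blocks.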
Block-level interface (block, page)
A block is a sequence of bytes.
Advantages:
The OS hides hardware details (such as different sector sizes and different addressing schemes) behind the block-level interface. The OS
maintains the mapping between blocks and sectors.
Blocking increases throughput (the rate of successful data transfer).
The block size is at least one sector and is determined by the OS. The OS views the disk as a series of blocks.
Block size:
Block contention increases with larger blocks. Thus OLTP and web search
applications, which mostly perform random access, prefer small blocks.
Decision support and data warehouse applications, which mostly perform sequential access,
prefer large blocks.
In terms of                  Preferred size   Application
Block contention             Small            OLTP
Random row access speed      Small            OLTP
Sequential row access speed  Large            Decision support, data warehouse
File-level interface
The client views a file as a sequence of bytes. (No notion of block,
Extent-size: 8 blocks
Example: a file of 34,000 records of 256 bytes each (34,000 × 256 B = 8704 KB)
Track size: 170 × 512 bytes, so the file occupies 100 tracks (not contiguous)
Cluster size = 4096 B; this is 4096 / (170 × 512) of a track, so the file occupies 2125 clusters
Track-based sequential access = (8 msec + 3 msec + 6 msec) × 100 = 1.7 sec
Cluster-based access = (8 msec + 3 msec + (4096 / (170 × 512)) × 6 msec) × 2125 = 23.975 sec
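The two totals can be recomputed with a short sketch, assuming, as the example does, that every track or cluster access pays the average seek time and rotational delay again. The names are illustrative assumptions.

```python
SEEK_MS = 8.0                # average seek time
ROT_DELAY_MS = 3.0           # average rotational delay
ROTATION_MS = 6.0            # one full rotation at 10,000 rpm
TRACK_BYTES = 170 * 512
FILE_BYTES = 34_000 * 256


def read_time_s(num_accesses: int, bytes_per_access: int) -> float:
    """Total time when each access costs seek + rotational delay + transfer."""
    transfer_ms = bytes_per_access / TRACK_BYTES * ROTATION_MS
    return num_accesses * (SEEK_MS + ROT_DELAY_MS + transfer_ms) / 1000


track_based_s = read_time_s(FILE_BYTES // TRACK_BYTES, TRACK_BYTES)
cluster_based_s = read_time_s(FILE_BYTES // 4096, 4096)

if __name__ == "__main__":
    print(f"track-based:   {track_based_s:.3f} sec")
    print(f"cluster-based: {cluster_based_s:.3f} sec")
```

The file is the same size in both cases; the difference comes entirely from paying the seek and rotational delay 2125 times instead of 100 times.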
Request   Arrival time   Complete time (by FIFO algorithm)   Complete time (service order)
1000      0              7.85                                7.85  (1.)
3000      0              20.7                                20.7  (2.)
7000      0              37.55                               37.55 (3.)
2000      20             56.6                                77.5  (6.)
8000      30             77.45                               48.4  (4.)
5000      40             92.3                                63.25 (5.)
A Journey of a Byte
Write (append) the 1-byte value in ch from the
program to textfile
[Figure: path of the byte from the program to the physical file: I/O processor / DMA controller, disk controller, disk.]
1) Move mode / locate mode: locate mode eliminates the overhead (OH) of copying data between buffers.
2) Scatter input / gather output (vectored I/O): eliminates the two-step process of separating a block's overhead from its useful data.
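Vectored I/O can be sketched with Python's `os.writev` and `os.readv`, which are available on Unix-like systems only. One call writes a header buffer and a data buffer as a single block (gather output), and one call reads the block back into two separate buffers (scatter input). The helper name and the sample header are illustrative assumptions.

```python
import os
import tempfile


def gather_then_scatter(header: bytes, data: bytes) -> tuple[bytes, bytes]:
    """Write header+data with one call, then read them into two buffers."""
    fd, path = tempfile.mkstemp()
    try:
        # Gather output: two separate buffers, one write request.
        os.writev(fd, [header, data])
        os.lseek(fd, 0, os.SEEK_SET)

        # Scatter input: one read request fills both buffers in order.
        header_buf = bytearray(len(header))
        data_buf = bytearray(len(data))
        os.readv(fd, [header_buf, data_buf])
        return bytes(header_buf), bytes(data_buf)
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    print(gather_then_scatter(b"HDR:", b"useful data"))
```

Without scatter input, the program would read the whole block into one buffer and then copy the header and the data apart in a second step.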
Disk Controller
The disk controller controls the disk while hiding the details of disk access.
The disk controller is an interface between the computer and the disk drive. It passes read/write requests to the disk and transfers data to and from it,
controls the disk arm, provides reliability by applying checksums to sectors, and remaps bad sectors.
The disk controller moves the head to the correct position (correct track, correct sector) for reading
and writing.
ATA (Advanced Technology Attachment) = IDE (Integrated Drive Electronics)
Ex: 133 MB/s with ATA/133, 150 MB/s with SATA (Serial ATA); ATAPI
A PC has 2 IDE ports; each port can access at most 2 disks, one as master and the other as slave.
SCSI (Small Computer System Interface): a system bus standard that coordinates many types of
devices on a single bus. Provides a basis for RAID disks.
Ex: max. 16 devices in Ultra 320 SCSI.
                 IDE          SCSI
Cost             Cheap        Expensive
# of devices     2 per port   16
Maintenance      Easy         Hard
Usage            At home      Business
Speed            133 MB/s     320 MB/s
Disk Cache
Disk cache: a kind of buffering. A block of memory set aside to
contain blocks of data from the disk. The disk cache is bundled with the disk
drive.
Improves performance.
When data is requested from secondary storage, the file manager looks
into the disk cache to see if it contains the requested data.
Compare the following access times:
Transfer a sector = ts + 1/2 tr + sector rotation time
Transfer a track = ts + tr
(here tr is the time of one full rotation)
Disk Striping
Two 20 GB drives can be faster than a single 40 GB drive, because they can work in parallel.
Using two 20 GB drives is efficient only if both can be kept busy.
To increase efficiency, we have to balance the workload among the multiple drives.
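One common way to balance the workload is round-robin striping, in which consecutive logical blocks go to consecutive drives. The sketch below assumes that scheme; the notes do not specify a particular mapping, and the function name is illustrative.

```python
def locate_block(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Map a logical block to (disk index, block index on that disk)."""
    return logical_block % num_disks, logical_block // num_disks


if __name__ == "__main__":
    # With two drives, a sequential scan alternates between them,
    # so both drives can be kept busy at the same time.
    for block in range(6):
        disk, offset = locate_block(block, 2)
        print(f"logical block {block} -> disk {disk}, block {offset}")
```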
2 problems:
1) There are 4 disk accesses
for a single sector write.
2) More vulnerable to non-recoverable
multi-disk failure
RAID
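The four accesses for a single sector write can be sketched under the assumption of a parity-based RAID level (such as RAID 4 or RAID 5); the notes do not name the level here. The class `ParityStripe` and its access counter are illustrative, and the "disks" are in-memory byte strings.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


class ParityStripe:
    """One stripe: several data sectors plus one parity sector."""

    def __init__(self, data_sectors: list[bytes]) -> None:
        self.data = list(data_sectors)
        self.parity = bytes(len(data_sectors[0]))
        for sector in self.data:
            self.parity = xor_bytes(self.parity, sector)
        self.accesses = 0  # simulated disk accesses

    def small_write(self, index: int, new_data: bytes) -> None:
        old_data = self.data[index]      # access 1: read old data
        old_parity = self.parity         # access 2: read old parity
        self.data[index] = new_data      # access 3: write new data
        # access 4: write new parity = old parity XOR old data XOR new data
        self.parity = xor_bytes(xor_bytes(old_parity, old_data), new_data)
        self.accesses += 4


if __name__ == "__main__":
    stripe = ParityStripe([b"\x01\x02", b"\x10\x20", b"\xff\x00"])
    stripe.small_write(1, b"\xaa\xbb")
    print("disk accesses for one sector write:", stripe.accesses)
```

Because the parity is the XOR of all data sectors, any single lost sector can be rebuilt from the others, but losing two disks of the same stripe cannot be recovered.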
Buffer Management
Working with large chunks of data in MM so that
Read data in memory multiple times (caching)
Coordination of the buffers uses replacement techniques such as Least Recently Used (LRU), FIFO, and the clock-replacement
algorithm.
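Least Recently Used replacement, one of the techniques named above, can be sketched with an ordered dictionary. The class `LRUBufferPool` and the `read_block` callback that stands in for a disk read are illustrative assumptions.

```python
from collections import OrderedDict
from typing import Callable


class LRUBufferPool:
    """Keeps at most `capacity` blocks in memory, evicting the LRU block."""

    def __init__(self, capacity: int, read_block: Callable[[int], bytes]) -> None:
        self.capacity = capacity
        self.read_block = read_block          # stand-in for a disk read
        self.buffers: OrderedDict[int, bytes] = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, block_no: int) -> bytes:
        if block_no in self.buffers:
            self.hits += 1
            self.buffers.move_to_end(block_no)    # now most recently used
            return self.buffers[block_no]
        self.misses += 1
        data = self.read_block(block_no)          # go to the disk
        self.buffers[block_no] = data
        if len(self.buffers) > self.capacity:
            self.buffers.popitem(last=False)      # evict least recently used
        return data


if __name__ == "__main__":
    pool = LRUBufferPool(2, lambda n: f"block-{n}".encode())
    for block in (1, 2, 1, 3, 2):
        pool.get(block)
    print("hits:", pool.hits, "misses:", pool.misses)
```

Reading a block that is already buffered costs no disk access, which is the caching benefit the notes describe.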
How many buffers do we need?
1: Even if we (the program) transmit data in only one direction, a single buffer causes
problems such as I/O-bound processing. The CPU wants to fill the buffer at the
same time that I/O is being performed, and overlapping I/O with CPU work is possible ONLY
with at least 2 buffers (Fig. 3.22).
2: At least 2 (one for input, the other for output): similar problems still occur.
Solution: apply a multiple buffering strategy. Trade-off: as the cost of memory decreases, using
many buffers becomes possible, but the more buffers there are, the more complex the management
that is required.
transfer the buffer to the disk with 1
access when either