Академический Документы
Профессиональный Документы
Культура Документы
Data, Indices,
Catalogs, Log
2
Logical layer
Query manager
3
Physical layer
Logical layer: SQL manager
Data, Indices,
Catalogs, Log
4
Abstraction levels
View 1 … View n
The views describe
the users' point of view
logical
indipendence The logical schema defines
Logical schema the logical structure
physical
The physical schema
indipendence describes how the data
Physical schema
is actually stored on disk
Data
5
Users of a DBMS
6
Storage manager
7
Performance of a memory
data size
access time = latency +
transfer speed
8
Main memory
9
Secondary memory
10
Tertiary memory
11
Implications for DBMSs
12
Hard disk spindle
head platter
The set of
corresponding tracks
on all platters is called
a cylinder
sector
arm sector of
a track
track
Every block can be
Every track is
identified by 3
divided in blocks
coordinates:
composed of a fixed
Cylinder, Head, Sector
number of sectors
block
13
Capacity evolution
Density evolution
The heads
16
An example: Seagate Barracuda 7200.12
17
Performance
2.2 15.5
Seek Time (Ts)
<0.1 <0.1
Settle Time
20
Rotational Latency (60/Spindle Speed)* 0.5 * 1000
22
Pages
23
Example
Page size P = 4 KB
Tt = 4/(21.56*1024) = 0.18 ms
Page size P = 64 KB
Tt = 64 /(21.56*1024) = 2.9 ms
24
The physical DB
25
The physical DB(cont.)
26
The storage model of DB2
27
Using extents
28
Tablespace
29
Tablespace types
3 different types of tablespace exist:
SMS (System Managed Space)
storage is managed by the OS
DMS (Database Managed Space)
storage is managed by the user
Automatic Storage
is managed by the DBMS
30
Comparison among types (i)
Creation
SMS: CREATE TABLESPACE … MANAGED BY SYSTEM
DMS: CREATE TABLESPACE … MANAGED BY DATABASE
AS: CREATE TABLESPACE … [MANAGED BY AUTOMATIC STORAGE]
Defining containers
SMS: directory name
DMS: device or file (fixed size)
AS: automatic, containers exist in every path associated
to the DB
31
Comparison among types (ii)
Initial allocation
SMS: OS-based (fragmentation likely)
DMS: at creation time (fragmentation unlikely when
container=device)
AS: system/user: at creation time
temporary: whenever needed
Modifying containers
SMS: not allowed
DMS: creating/removing containers
(automatic balance)
AS: automatic
32
Comparison among types (iii)
33
Comparison among types (iv)
Maximum size
SMS: n x maximum file size
DMS: 512 GB (64 TB for large tablespace)
AS: file system size
Object separability (e.g., tables and indices)
SMS: no, everything in the same tablespace
DMS: objects can be stored into different tablespaces
AS: objects can be stored into different tablespaces
34
Which is the best type?
AS
Large tables
Simplified management of container enlargement
Storing different objects (e.g., indices, tables)
into different tablespaces (performance)
DMS
Large tables
Control over where data are stored
Control over storage status
Storing different objects (e.g., indices, tables)
into different tablespaces (performance)
SMS
Small tables
Control over where data are stored
Control over storage status
35
Tablespace attributes
36
Why don’t we simply use the file system?
37
Organizing data in a file
File Reference schema (simplified)
File Header
Record 0 Field 1 Field 2 Field 3 … Field k
… Page 0
Record 1 Field 1 Field 2 Field 3 Field k
…
Page 1
39
Fixed-length records
40
Variable-length records
0 4 8 12 13 23 28 40 41
Record Header
42
Organizing records into pages
44
Organizing pages into slots
46
Reading and writing pages
Reading a single tuple requires bringing the corresponding page
into main memory, in a DBMS-managed area called buffer pool
Every buffer in the pool could host the copy of a disk page
Managing the buffer pool is fundamental for performance,
and a DBMS module is devoted to this, the Buffer Manager (BM)
The BM is called also when writing to disk,
because we should update a modified page on disk
The BM plays a fundamental role, as we shall see, in
transaction management, in order to guarantee the DB
integrity when faults are present
47
The Buffer Manager
48
Interface of the Buffer Manager
The interface offered by the BM to other DBMS modules
includes four basic methods:
getAndPinPage: requests the page to the BM an puts a “pin” on it, indicating that it is used
unPinPage: releases the page, clearing a pin
setDirty: indicates that the page has been modified, it is now “dirty”
flushPage: forces the to be written on disk, making it “clean”
pincount
dirty
49
Buffer replacement policies
For operating systems, a common policy for choosing the page
to be replaced is the LRU (Least Recently Used), thus the page
chosen is the one which has been unused for the most time
In DBMSs LRU is often a bad choice, since for some queries
the “access pattern” to data is known, and it could thus be used
to operate more accurate choices, also able to greatly improve
performance
The hit ratio, or the ratio of requests needing no I/O operation,
is a synthetic indicator of the quality of a replacement policy
Example: we will see that join algorithms exist scanning tuples of
a relation for N times.
In this case, the best policy would be a MRU (Most Recently Used),
thus replacing the most recently used page!
… and this is another reason why DBMSs do not use (all)
services provided by OSs …
50
File organization
The way records are organized within the file affects both
efficiency of data access and storage occupation
In the following, we will see some basic organizations, namely:
Heap file
Sequential file
… and we will evaluate them according to some typical
operations
For the sake of simplicity, we will:
Consider fixed-length records
Evaluate “costs” as number of I/O operations, assuming that
every page request results in a single I/O operation
In order to evaluate costs we however need some basic
information…
51
SQL catalogs statistics
52
Cost model
53
Heap file
header
record 0 H. Fonda LA male 1-1-11
record 1
record 2 Basinger Chicago female 3-3-33
record 3
record 4 Baldwin NYC male 2-2-22
54
Page management: linked list
55
Page management: directory
56
The DB2 solution
57
Heap file: operations and costs
58
Sequential file
60