
Logical I/O

Julian Dyke
Independent Consultant

Web Version

1 2005 Julian Dyke juliandyke.com


Agenda

Introduction
Logical I/Os
Buffer Cache Behaviour
Statistics
Conclusion



Logical I/Os
Logical I/Os are read operations

Buffers are cached in shared memory

Most logical I/Os can be satisfied from cache

The remainder will result in physical I/Os

Logical I/Os include


current reads
consistent reads



Current Reads
Current reads
Current version of block
Can be updated
Can be dirty
Includes all changes
Only one current version of block in buffer cache
Only one current version of block across all instances
Can be used to construct consistent versions



Consistent Reads
Consistent reads
Potentially historic version of block
Consistent to a specific System Change Number (SCN)
Cannot be updated
Cannot be dirty
Can be used to construct consistent versions
Can have multiple versions of same block in buffer cache
Can be
single block (sequential reads)
multi block (scattered reads)
Can be traced using events 10200 / 10201
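
How a consistent version is built from the current version can be sketched roughly as follows. This is a toy model and not Oracle's implementation: `Block`, `UndoRecord` and `consistent_version` are hypothetical names, a block is reduced to a dict of row values, and each undo record is assumed to hold the values a change overwrote plus the SCN the block had before that change. The SCNs 150, 132 and 128 are borrowed from the deck's consistent read animation; the row values are invented.

```python
from dataclasses import dataclass, field


@dataclass
class UndoRecord:
    prev_scn: int       # SCN of the block before this change was applied
    old_values: dict    # row values that the change overwrote


@dataclass
class Block:
    scn: int
    rows: dict
    undo: list = field(default_factory=list)  # newest change first


def consistent_version(current: Block, target_scn: int) -> Block:
    """Return a copy of `current` rolled back until its SCN <= target_scn."""
    clone = Block(current.scn, dict(current.rows), list(current.undo))
    while clone.scn > target_scn and clone.undo:
        record = clone.undo.pop(0)            # most recent change first
        clone.rows.update(record.old_values)  # put the old values back
        clone.scn = record.prev_scn
    return clone


current = Block(
    scn=150,
    rows={"c2": 30},
    undo=[UndoRecord(132, {"c2": 20}), UndoRecord(128, {"c2": 10})],
)
cr = consistent_version(current, 128)
```

The current block is left untouched, which mirrors the slides: the current version can be updated, while the consistent copy is a separate, read-only version of the same block.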



Logical I/O statistics
session logical reads statistic
Total number of logical reads in session
Unreliable at system level
At session level
session logical reads = db block gets + consistent gets

db block gets statistic


Number of current reads

consistent gets statistic


Number of consistent reads



Buffer Pools
There are up to eight buffer pools
DEFAULT
KEEP, RECYCLE (Oracle 8.0 and above)
2K, 4K, 8K, 16K and 32K (Oracle 9.0 and above)

32K not available on all platforms

Cannot have non-standard block size same as DEFAULT block size



Buffer Pool Headers
One for each buffer pool (usable or unusable)

Externalized in
V$BUFFER_POOL
V$BUFFER_POOL_STATISTICS

Based on X$KCBWBPD

Created in shared pool permanent memory when instance is started

Contain one or more working sets



X$KCBWBPD
Externalises buffer pool header

Column          Type           Description
ADDR            RAW(4)
INDX            NUMBER
INST_ID         NUMBER
BP_NAME         VARCHAR2(20)   Buffer Pool Name
BP_ID           NUMBER         Buffer Pool ID
BP_BLKSZ        NUMBER         Block Size
BP_GRANSZ       NUMBER         Granule Size
BP_BUFPERGRAN   NUMBER         Buffers per Granule
BP_LO_SID       NUMBER         Minimum Working Set ID
BP_HI_SID       NUMBER         Maximum Working Set ID
BP_SET_CT       NUMBER         Number of Working Sets
BP_SIZE         NUMBER         Number of Buffers
BP_STATE        NUMBER
BP_CURRGRANS    NUMBER
BP_TGTGRANS     NUMBER
BP_PREVGRANS    NUMBER

Hash Buckets
Hash value of each block calculated from
Data Block Address (DBA)
Block Class

Number of hash buckets dependent on number of buffers in cache, e.g.

# buffers         500    6000
# hash buckets     64    1024

Each hash bucket contains


Cache Buffers Chains latch
Pointer to array of double linked lists
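
The lookup path described above can be sketched as a small hash table of chains. Everything here is illustrative: `BufferCacheIndex` and its methods are hypothetical names, the hash function is an arbitrary stand-in (the slides only say the hash value is calculated from the DBA and block class), and latching is left out entirely.

```python
from collections import defaultdict


class BufferCacheIndex:
    """Toy hash-bucket index: bucket number -> chain of buffer headers."""

    def __init__(self, n_buckets: int):
        self.n_buckets = n_buckets
        self.chains = defaultdict(list)

    def bucket_for(self, dba: int, block_class: int) -> int:
        # Assumption: any stable hash of (DBA, class) will do for illustration.
        return hash((dba, block_class)) % self.n_buckets

    def add(self, dba: int, block_class: int, header: str) -> None:
        chain = self.chains[self.bucket_for(dba, block_class)]
        chain.append((dba, block_class, header))

    def find(self, dba: int, block_class: int) -> list:
        # Only the one chain the block can hash to needs to be walked.
        chain = self.chains[self.bucket_for(dba, block_class)]
        return [h for d, c, h in chain if (d, c) == (dba, block_class)]


index = BufferCacheIndex(n_buckets=64)
index.add(dba=0x01000022, block_class=1, header="current")
index.add(dba=0x01000022, block_class=1, header="CR clone")
index.add(dba=0x01000023, block_class=1, header="current")
```

Two headers for the same DBA land on the same chain, which is how one current version and several consistent versions of a block can coexist in the cache.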



Hash Buckets
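[Diagram: array of hash buckets (# hash chains); each bucket has a cache buffers chains latch and points to a double linked chain of buffer headers (BH)]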



Buffer Headers
Each buffer header describes contents of one buffer

All buffers accessed via buffer header

Buffer header contains pointers to


Buffer
Cache Buffers Chains latch

Buffer header includes double linked lists for


Cache Buffers Chain list
Replacement list
Users list
Waiters list



X$BH
Externalises buffer headers

Column       Type      Description
ADDR         RAW(4)
INDX         NUMBER
INST_ID      NUMBER
HLADDR       RAW(4)    Hash List Address
BLSIZ        NUMBER    Block Size
NXT_HASH     RAW(4)    Hash List
PRV_HASH     RAW(4)    Hash List
NXT_REPL     RAW(4)    Replacement List
PRV_REPL     RAW(4)    Replacement List
TS#          NUMBER    Tablespace#
FILE#        NUMBER    Absolute File Number
DBARFIL      NUMBER    Relative File Number
DBABLK       NUMBER    Block Number
OBJ          NUMBER    Object ID
BA           RAW(4)    Buffer Address
CR_SCN_BAS   NUMBER



Working Sets
Introduced in Oracle 8.1.5

Each buffer pool contains one or more working sets

Working set header


created in shared pool permanent memory
associated with one DBWn process
protected by cache buffers lru chain latch

Each working set maintains separate set of LRU lists



LRU Lists
In Oracle 9.2 each working set maintains 4 LRU lists

LRU - replacement list - normal blocks

LRU-W - write list - dirty blocks

LRU-XO - object list - buffers involved in


DROP
TRUNCATE

LRU-XR - range list - buffers involved in


ALTER TABLESPACE BEGIN BACKUP
ALTER TABLESPACE END BACKUP
ALTER TABLESPACE OFFLINE
ALTER TABLESPACE READ ONLY



Main and Auxiliary Lists
Each LRU contains
main list
auxiliary list

Auxiliary list includes


dirty buffers identified by DBWn processes
buffers being written

Buffers are moved from main to auxiliary list by DBWn processes to avoid unnecessary scans

Processes scan auxiliary lists first for free buffers

Buffers also allocated to auxiliary list


at startup
after FLUSH_CACHE



Working Set Lists
[Diagram: the working set header points to four lists (Replacement, Write, Object, Range); each has a MAIN and an AUX list of buffer headers, and the Replacement list has a hot end and a cold end]



Replacement List
In Oracle 8.1.5 and above a mid-point insertion algorithm is used
Buffer cache has a hot end and a cold end
Buffers are inserted at mid-point
Mid-point is head of cold end
Starts at hot end - moves down cache
Maximum mid-point determined by _db_percent_hot_default
Default value is 50%
[Diagram: replacement list running from the hot end to the cold end, with the head of the hot end and the head of the cold end (the mid-point) marked]
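
A minimal sketch of mid-point insertion, assuming the replacement list can be modelled as two queues joined at the mid-point. `ReplacementList`, `insert` and `promote` are hypothetical names; only the 50% ceiling on the hot end comes from the slides (`_db_percent_hot_default`), and the block numbers are arbitrary.

```python
from collections import deque


class ReplacementList:
    """Toy replacement list; index 0 of each deque is the head of that end."""

    def __init__(self, n_buffers: int, percent_hot: int = 50):
        self.hot = deque()
        self.cold = deque()
        self.max_hot = n_buffers * percent_hot // 100

    def insert(self, block: int) -> None:
        # New blocks enter at the mid-point, i.e. the head of the cold end.
        self.cold.appendleft(block)

    def promote(self, block: int) -> None:
        # Move a block from the cold end to the head of the hot end.
        self.cold.remove(block)
        self.hot.appendleft(block)
        if len(self.hot) > self.max_hot:
            # Hot end is full: its tail slides across the mid-point.
            self.cold.appendleft(self.hot.pop())

    def order(self) -> list:
        # Head of hot end first, tail of cold end last.
        return list(self.hot) + list(self.cold)


lru = ReplacementList(n_buffers=4)
for block in (42, 71, 92, 34):
    lru.insert(block)
for block in (42, 71, 92):
    lru.promote(block)
```

After the third promotion the hot end would exceed its two-buffer ceiling, so block 42 drops back to the head of the cold end and the list reads 92, 71, 42, 34 from hot to cold.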



X$KCBWDS
Externalises working set header

Column       Type      Description
ADDR         RAW(4)
INDX         NUMBER
INST_ID      NUMBER
SET_ID       NUMBER    Working Set ID
DBWR_NUM     NUMBER    Database Writer Number
BLK_SIZE     NUMBER
NXT_REPL     RAW(4)    MAIN Replacement List
PRV_REPL     RAW(4)    MAIN Replacement List
NXT_REPLAX   RAW(4)    AUX Replacement List
PRV_REPLAX   RAW(4)    AUX Replacement List
CNUM_REPL    RAW(4)    Number of buffers on MAIN Replacement List
ANUM_REPL    RAW(4)    Number of buffers on AUX Replacement List
COLD_HD      RAW(4)    Insertion Point
HBMAX        NUMBER    Maximum number of Hot Buffers
HBUFS        NUMBER    Number of Hot Buffers
NXT_WRITE    RAW(4)

Touch Count
Each buffer header maintains
touch count
timestamp

Touch count represents number of 3 second intervals in which buffer has been accessed since
buffer last read into cache
touch count last reset

Each time buffer is accessed


if timestamp more than 3 seconds ago
increment touch count
set timestamp to current time
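
The update rule above can be written out directly. In this sketch `BufferHeader` and `access` are hypothetical names, time is a plain number of seconds, and the starting values (touch count 0, timestamp 0) are chosen only to make the example easy to follow.

```python
from dataclasses import dataclass

TOUCH_TIME = 3  # seconds; the interval quoted on the slide


@dataclass
class BufferHeader:
    block: int
    touch_count: int = 0     # assumed starting value, for illustration
    timestamp: float = 0.0   # time of the last counted access


def access(header: BufferHeader, now: float) -> None:
    """Record one access to the buffer at time `now` (seconds)."""
    if now - header.timestamp > TOUCH_TIME:
        header.touch_count += 1
        header.timestamp = now


bh = BufferHeader(block=87)
for t in (1, 2, 4, 5, 6, 8):
    access(bh, t)
```

Six accesses produce a touch count of only 2, because accesses at t = 1, 2, 5 and 6 fall within 3 seconds of the last counted one. A burst of activity on one buffer therefore counts once.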



Touch Count
When buffer reaches tail of cold end
If touch count >= 2 then
buffer is moved to hot end
Otherwise used as next free buffer

Hot criteria determined by


_db_aging_hot_criteria
default value is 2 touches

Time interval determined by


_db_aging_touch_time
default value is 3 seconds
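
The decision taken at the tail of the cold end can be sketched as a loop. `next_free_buffer` is a hypothetical name, buffers are reduced to `[block, touch_count]` pairs, and resetting the count to zero on promotion is an assumption drawn from the deck's single-block read animation. The model also ignores the hot-end size limit and pinned or dirty buffers.

```python
from collections import deque

HOT_CRITERIA = 2  # default of _db_aging_hot_criteria per the slides


def next_free_buffer(hot: deque, cold: deque):
    """Scan from the tail of the cold end; return the first reusable buffer.

    Buffers are [block, touch_count] lists; index 0 is the head of each end.
    """
    while cold:
        buf = cold.pop()             # buffer at the tail of the cold end
        if buf[1] >= HOT_CRITERIA:
            buf[1] = 0               # assumption: count is reset on promotion
            hot.appendleft(buf)      # move to the head of the hot end
        else:
            return buf               # not hot enough: reuse this buffer
    return None                      # toy model only: nothing left to reuse


hot = deque([[92, 0]])
cold = deque([[71, 1], [42, 0], [34, 3]])
victim = next_free_buffer(hot, cold)
```

Block 34 has been touched three times, so it is moved to the hot end; block 42, next at the tail, has a touch count below 2 and becomes the free buffer.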



Single versus Multi-Block Reads
Single block reads
Used with current reads
Can be used with consistent reads
Waits recorded by db file sequential read

Multi block reads


Frequently used with consistent reads
Maximum number of physical blocks read specified by
DB_FILE_MULTIBLOCK_READ_COUNT
Waits recorded by db file scattered read
Blocks moved to cold end of buffer cache
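
As a rough illustration of how DB_FILE_MULTIBLOCK_READ_COUNT caps each multi-block read, the sketch below splits a run of contiguous blocks into read requests. `multiblock_reads` is a hypothetical helper; it deliberately ignores extent boundaries, blocks already in the cache and operating system I/O limits, all of which can shorten a real read.

```python
def multiblock_reads(first_block: int, n_blocks: int, read_count: int) -> list:
    """Split a run of contiguous blocks into (start_block, length) requests."""
    reads = []
    block = first_block
    remaining = n_blocks
    while remaining > 0:
        length = min(read_count, remaining)  # never exceed the read count
        reads.append((block, length))
        block += length
        remaining -= length
    return reads


reads = multiblock_reads(first_block=1, n_blocks=18, read_count=4)
```

With a read count of 4, eighteen blocks need five requests, the last one covering only two blocks.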



Single-Block Reads

[Animation: Read Block 87. The replacement list is drawn from the head of the hot end to the head of the cold end, with each buffer showing its block number and touch count. The overlaid step captions cannot be recovered in order; they cover getting the first available buffer from the cold end, inserting the block at the head of the cold end, updating touch counts, moving a buffer to the head of the hot end and setting its touch count to zero.]

Consistent Reads
[Animation: Read Block 27 - SCN 128. The replacement list is drawn from the head of the hot end to the head of the cold end, with each buffer showing its block number and System Change Number, and current and consistent blocks distinguished. The overlaid step captions cover getting the first available buffer from the cold end, reading the current version of block 27 into the buffer, applying undo to roll the block back (SCNs 150, 132 and 128 appear) and inserting the consistent version at the head of the cold end. The exact caption order is not recoverable.]
Multi-Block Reads
DB_FILE_MULTIBLOCK_READ_COUNT = 4

[Animation: blocks 1 to 8 are read four at a time. The replacement list is drawn from the head of the hot end to the head of the cold end. The overlaid step captions cover getting the next four available buffers from the cold end, reading the next four blocks into the buffers and placing the buffers at the cold end. The exact caption order is not recoverable.]
Dirty Blocks
When blocks are updated they are marked dirty

Changes immediately written to redo buffer

Changes written back to disk asynchronously by DBWn process

DBWn process
scans from cold end of MAIN replacement list
moves dirty blocks to auxiliary list
writes dirty blocks back to disk

Written blocks remain on auxiliary list until re-used
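
One DBWn pass over the MAIN replacement list might be sketched like this. It is a simplification under stated assumptions: `dbwn_pass` and its `batch` limit are hypothetical, buffers are plain dicts, and clearing the dirty flag stands in for the asynchronous disk write.

```python
from collections import deque


def dbwn_pass(main: deque, aux: deque, batch: int) -> list:
    """Scan MAIN from its cold tail and move up to `batch` dirty buffers to AUX.

    Buffers are dicts with 'block' and 'dirty' keys; returns the blocks written.
    """
    written = []
    for buf in list(reversed(main)):   # tail of the cold end first
        if len(written) == batch:
            break
        if buf["dirty"]:
            main.remove(buf)
            aux.append(buf)            # stays on AUX until re-used
            buf["dirty"] = False       # stands in for the write to disk
            written.append(buf["block"])
    return written


main = deque([
    {"block": 11, "dirty": False},
    {"block": 42, "dirty": True},
    {"block": 33, "dirty": False},
    {"block": 87, "dirty": True},
])
aux = deque()
written = dbwn_pass(main, aux, batch=8)
```

Afterwards the clean buffers 11 and 33 remain on MAIN, while 87 and 42 sit on AUX, where a process looking for a free buffer would find them first.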



Buffer Pinning
In Oracle 8.0 and above, Oracle uses pinning to reduce number of logical I/Os

If buffer will be accessed again by the statement, it is pinned in the buffer cache

Frequently used with index scans

Only appears to be used with consistent gets
not observed with current gets

If pinning was not implemented, number of logical I/Os would significantly increase



Buffer Pinning Statistics
buffer is not pinned count statistic
Number of pin-able buffers not pinned by this session when visited
Equivalent to number of logical I/Os (for that part of statement)

buffer is pinned count statistic
Number of buffers already pinned by this session when visited

Number of buffers visited =
buffer is not pinned count + buffer is pinned count



Consistent Gets Statistics
consistent gets - examination statistic
Number of consistent gets that could be immediately performed without pinning the buffer
Generally apply to indexes
Require one latch get
Included in consistent gets statistic

no work - consistent read gets statistic
Number of consistent gets that could be performed without requiring rollback or cleanout
Generally apply to tables
Require two latch gets
Included in consistent gets statistic



Full Table Scan
SELECT SUM(c2) FROM t1;

0      SELECT STATEMENT
1  0     TABLE ACCESS (FULL) OF 'T1'

Statistic                          Value
session logical reads                 23
consistent gets                       23
no work - consistent read gets        18
buffer is not pinned count            18
table scans (short tables)             1
table scans rows gotten               72
table scans blocks gotten             18

Values are the highest shown as the counters increment during the animation.

[Animation: Read Block 1 to 21 against Table T1, which is laid out as segment header, data blocks, empty blocks up to the high water mark, and unused blocks beyond it]

Full Table Scan - Summary
In Oracle 9.2
segment header initially read 3 times
segment header read again every 10 extents

All blocks are read up to high water mark

For longer tables blocks can be prefetched

Algorithm differs for Automatic Segment Space Managed tablespaces



Unique Scan
SELECT c2 FROM t1 WHERE c1 = 42;

0      SELECT STATEMENT
1  0     TABLE ACCESS (BY INDEX ROWID) OF 'T1'
2  1       INDEX (UNIQUE SCAN) OF 'I1'

Statistic                          Value
session logical reads                  3
consistent gets                        3
consistent gets - examination          3
buffer is not pinned count             2
index fetch by key                     1
table fetch by rowid                   1
rows fetched by callback               1

Values are the highest shown as the counters increment during the animation.

[Animation: the branch block and a leaf block of Index I1 are read, followed by one data block of Table T1]
Index Organised Table
SELECT c2 FROM t1 WHERE c1 = 42;

0      SELECT STATEMENT
1  0     INDEX (UNIQUE SCAN) OF 'I1'

Statistic                          Value
session logical reads                  2
consistent gets                        2
consistent gets - examination          2
index fetch by key                     1

Values are the highest shown as the counters increment during the animation.

[Animation: the branch block and a leaf block of Index I1 are read; no separate table block is visited]

Single Table Hash Cluster
SELECT c2 FROM t1 WHERE c1 = 42;

0      SELECT STATEMENT
1  0     TABLE ACCESS (HASH) OF 'T1'

Statistic                          Value
session logical reads                  1
consistent gets                        1
no work - consistent read gets         1
cluster key scans                      1
cluster key scan block gets            1
buffer is not pinned count             1

[Animation: a single data block of Table T1 is read]

Clustering Factor
Measures relationship between index entries and corresponding data blocks

Used by CBO to calculate cost of using index

Good clustering factor approaches number of blocks in table
Bad clustering factor approaches number of rows in table

CBO will favour indexes with a better clustering factor

[Diagram: index entries mapped to table blocks, shown once for a bad clustering factor and once for a good clustering factor]
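
One common way to picture the figure is to walk the index entries in key order and count how often the table block changes. The deck does not spell out the calculation, so treat this counting rule and the `clustering_factor` helper as an illustrative assumption; the block numbers are invented.

```python
def clustering_factor(blocks_in_index_order: list) -> int:
    """Count table-block changes while walking index entries in key order."""
    factor = 0
    previous = None
    for block in blocks_in_index_order:
        if block != previous:
            factor += 1
            previous = block
    return factor


# Nine rows stored in three table blocks, listed in index key order.
well_clustered = [1, 1, 1, 2, 2, 2, 3, 3, 3]
scattered = [1, 2, 3, 1, 2, 3, 1, 2, 3]
good = clustering_factor(well_clustered)
bad = clustering_factor(scattered)
```

The well clustered case gives 3, the number of blocks, and the scattered case gives 9, the number of rows, which matches the two limits named on the slide.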



Range Scan - Bad Clustering Factor
SELECT c2 FROM t1 WHERE c3 = 42;

0      SELECT STATEMENT
1  0     TABLE ACCESS (BY INDEX ROWID) OF 'T1'
2  1       INDEX (RANGE SCAN) OF 'I2'

Statistic                          Value
session logical reads                  8
consistent gets                        8
consistent gets - examination          1
no work - consistent read gets         6
index scans kdiixs1                    1
buffer is not pinned count             7
buffer is pinned count                 5
table fetch by rowid                   6

Final values, as listed on the Clustering Factor - Summary slide.

[Animation: the branch block and a leaf block of Index I2 are read and the leaf block is pinned; the rows are then fetched from data blocks of Table T1]
Range Scan - Good Clustering Factor
SELECT c2 FROM t1 WHERE c4 = 42;

0      SELECT STATEMENT
1  0     TABLE ACCESS (BY INDEX ROWID) OF 'T1'
2  1       INDEX (RANGE SCAN) OF 'I3'

Statistic                          Value
session logical reads                  4
consistent gets                        4
consistent gets - examination          1
no work - consistent read gets         2
index scans kdiixs1                    1
buffer is not pinned count             3
buffer is pinned count                 9
table fetch by rowid                   6

Final values, as listed on the Clustering Factor - Summary slide.

[Animation: the branch block and a leaf block of Index I3 are read and the leaf block is pinned; the rows are then fetched from data blocks of Table T1, and the data block is also shown as pinned]
Clustering Factor - Summary
                                   Bad          Good
                                   Clustering   Clustering
Statistic                          Factor       Factor
session logical reads                  8            4
consistent gets                        8            4
consistent gets - examination          1            1
no work - consistent read gets         6            2
index scans kdiixs1                    1            1
buffer is not pinned count             7            3
buffer is pinned count                 5            9
table fetch by rowid                   6            6

Better (lower) clustering factor
Reduces number of logical I/Os required
Increases number of buffers that can be pinned
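
The identity from the Buffer Pinning Statistics slide (buffers visited = buffer is not pinned count + buffer is pinned count) can be checked against the figures in this summary. The dictionary layout and the `buffers_visited` helper are just a convenient arrangement for the example; the numbers are the ones in the table.

```python
# Figures copied from the summary table above.
summary = {
    "bad":  {"not_pinned": 7, "pinned": 5, "logical_reads": 8},
    "good": {"not_pinned": 3, "pinned": 9, "logical_reads": 4},
}


def buffers_visited(stats: dict) -> int:
    """buffers visited = buffer is not pinned count + buffer is pinned count"""
    return stats["not_pinned"] + stats["pinned"]


visited = {name: buffers_visited(stats) for name, stats in summary.items()}
pinned_share = {
    name: stats["pinned"] / buffers_visited(stats)
    for name, stats in summary.items()
}
```

Both queries visit 12 buffers. With the good clustering factor, 9 of those 12 visits find the buffer already pinned, against 5 of 12 with the bad one, and that difference accounts for the halving of logical reads.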



Row Prefetching
For queries returning more than one row specify maximum number of rows per round trip

If prefetch size too small


Increased number of round trips
Degrades performance

If prefetch size too large


Increased number of packets
May degrade performance



Row Prefetching
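Applies to

Interface   Mechanism
OCI         OCI_ATTR_PREFETCH_ROWS
Pro*C       Host Array
JDBC        setRowPrefetch()
PL/SQL      BULK COLLECT
SQL*Plus    SET ARRAYSIZE

OCI default prefetch value is 1 (returns 2 rows per fetch)

/* stmt is an allocated statement handle and err an error handle;
   prefetchRows is a ub4 holding the number of rows to prefetch */
res = OCIAttrSet
(
    (dvoid *)stmt,                  /* handle whose attribute is set */
    (ub4)OCI_HTYPE_STMT,            /* handle type: statement */
    (dvoid *)&prefetchRows,         /* pointer to the attribute value */
    (ub4)0,                         /* size, not needed for a ub4 */
    (ub4)OCI_ATTR_PREFETCH_ROWS,    /* attribute to set */
    (OCIError *)err
);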



Row Prefetching
Example - full table scan
1000 row table
31 blocks (+ segment header)

Prefetch Size   Consistent Gets
        1            1003
        2             518
        3             337
        4             276
        5             227
       10             130
       20              82
       50              53
      100              43
      250              37
      500              35
     1000              34

[Chart: Consistent Gets plotted against Prefetch Size]
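
The shape of these figures can be reproduced with a small model: every fetch call has to get again each block it touches, on top of the three initial segment header reads mentioned in the full table scan summary. The model rests on assumptions the deck does not state: 33 rows in every block except the last (chosen because it happens to reproduce the table), prefetch size treated as rows per fetch call, and the hypothetical helper name `consistent_gets`.

```python
ROWS = 1000
ROWS_PER_BLOCK = 33        # assumption: 30 full blocks plus 10 rows in the 31st
SEGMENT_HEADER_GETS = 3    # "segment header initially read 3 times"


def consistent_gets(prefetch_size: int) -> int:
    """Estimate consistent gets for a full scan fetched in batches of rows."""
    gets = SEGMENT_HEADER_GETS
    for first_row in range(0, ROWS, prefetch_size):
        last_row = min(first_row + prefetch_size, ROWS) - 1
        # Each fetch call gets every block between its first and last row.
        first_block = first_row // ROWS_PER_BLOCK
        last_block = last_row // ROWS_PER_BLOCK
        gets += last_block - first_block + 1
    return gets


sizes = (1, 2, 3, 4, 5, 10, 20, 50, 100, 250, 500, 1000)
modelled = {size: consistent_gets(size) for size in sizes}
```

Under these assumptions the model matches the table for every prefetch size listed. The extra gets at small prefetch sizes come from fetch calls that resume in the middle of a block and have to visit it again.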



Thank you for your interest

For more information and to provide feedback please contact me

My e-mail address is: info@juliandyke.com

My website address is: www.juliandyke.com

