
Oracle I/O Topics

Oracle and Storage


Objectives for this unit
At the end of this module the student will understand the following tasks and concepts:
 Understand how a disk drive works
 Understand disk drive performance
 Understand RAID
 Understand disk controllers
 Understand disk configuration options
 Understand the read and write mechanisms of the SAN system
 Understand the read and write mechanisms of Oracle
 Understand how these mechanisms affect performance
Physical Parts in a Disk Drive
 Disk Platters – tracks and sectors
 Disk Heads – read/write heads that perform reads and writes to the tracks on the platters
Disk Drive Components
[Diagram, built up over several slides: disk platter, track, sector, read/write head, armature, and cylinder]
How a Disk Works
 RPM – disks spin at a fixed rotational speed, such as 15,000 rpm
 Seek Time – the time it takes for the head to move to the target track
 Rotational Latency – the time it takes for the platter to rotate (spin) to the sector holding the needed data (see the sketch below)
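Average rotational latency is simply the time for half a revolution at the quoted spindle speed. A minimal Python sketch of that relationship (the function name is mine, not from any vendor tool):

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency = time for half a revolution, in ms."""
    ms_per_revolution = 60_000.0 / rpm   # 60,000 ms in a minute
    return ms_per_revolution / 2.0

print(avg_rotational_latency_ms(15_000))  # 2.0 ms for a 15,000 rpm drive
```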
Disk Drive Seek and Rotation
 Step 1: Seek to the desired track (time = seek time)
 Step 2: Wait for the sector to rotate under the head (time = rotational latency)
Disk Performance Specifications
 Typical 18 GB SCSI-3 drive
 Rotational speed – 15,000 RPM
 Avg. seek for random I/Os – 3.9 ms read, 4.5 ms write
 Transfer rate – 40 MB/sec
 Track-to-track seek for sequential I/Os – 0.5 ms read, 0.7 ms write
 Rotational latency – 2.0 ms
Calculating Max Random Seeks/Sec
 Maximum random seeks/sec = 1 / (sec per I/O)
 sec per I/O = (3.9 ms seek + 2.0 ms rotational latency) x (1 sec / 1000 ms) = 0.0059 sec
 1 / 0.0059 sec = 169.5 seeks/sec (worked through in the sketch below)
 What about queuing?
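A small helper that reproduces the arithmetic above; the function and parameter names are illustrative, not from Oracle or any vendor API:

```python
def max_iops(seek_ms: float, rotational_latency_ms: float,
             utilization: float = 1.0) -> float:
    """IOPS a single drive can sustain: utilization / per-I/O service time."""
    service_time_sec = (seek_ms + rotational_latency_ms) / 1000.0
    return utilization / service_time_sec

print(max_iops(3.9, 2.0))         # ~169.5 random seeks/sec at 100% busy
print(max_iops(3.9, 2.0, 0.75))   # ~127 -- close to the 125 IOPS/disk rule of thumb
```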
I/O Request Queuing Graph
[Graph: queue length (0 to 20) vs. utilization (5% to 95%) – queue length stays low at moderate utilization, then climbs steeply as utilization approaches 95%]
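The knee in that curve is what elementary queueing theory predicts. As an assumption on my part (the slide does not state which model was plotted), a single-server M/M/1 queue gives an average of U / (1 - U) requests in the system:

```python
# Assumed M/M/1 behaviour: average requests in the system = U / (1 - U)
for utilization in (0.50, 0.70, 0.80, 0.90, 0.95):
    queue_length = utilization / (1.0 - utilization)
    print(f"{utilization:4.0%} busy -> average queue length {queue_length:5.1f}")
# 80% -> 4.0, 90% -> 9.0, 95% -> 19.0: latency climbs steeply past the knee
```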
Maximum Utilization for Best Performance
 Maximum seeks per second = 169.5
 Knee of the curve is at about 80% utilization
 Configure for 125 I/Os per second per disk for random I/Os
 This is about 75% of maximum capacity
Sequential vs. Random I/Os
 Sequential I/O is much faster
 Seek time drops from 3.9 ms (average random seek) to 0.7 ms (track to track)
 The same calculation yields 370 I/Os per sec
 or 277 I/Os per sec at 75% utilization (see below)
 300+ I/Os per sec is common for sequential access
 As I/Os increase, so does latency
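Using the same hypothetical max_iops helper from the random-I/O sketch above, the sequential figures fall out directly:

```python
print(max_iops(0.7, 2.0))         # ~370 I/Os per sec (0.7 ms track-to-track + 2.0 ms latency)
print(max_iops(0.7, 2.0, 0.75))   # ~278 I/Os per sec at 75% utilization (the slide rounds to 277)
```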
Disk Layout for Sequential I/Os
 Isolate sequential I/Os
 Sequential disk I/Os achieve maximum performance
 255+ I/Os per second per disk
 Sequential access in Oracle:
 Redo Log
 Archive Log files
Disk Layout for Random I/Os – OLTP
 OLTP (online transaction processing)
 I/Os are fairly random
 Determine the frequency of access
 Spread out the data
 Balance the I/O load across the available disks
Disk Layout for Random I/Os – DSS
 DSS (decision support system)
 I/Os are made up of many sequential accesses
 I/Os may be isolated to certain tables
 Analyze the access patterns
 Use range partitioning
I/O Balancing
 Disk Arrays Provide Optimal Distribution
 Use Disk Striping (through RAID)
 Balance I/Os among logical volumes
 Don’t exceed disk throughput limits
 Pushing too hard increases latencies
 Use all of the disks available
 Disk Drives can do 125 IOPS random
 10 Disks can do 1,250 IOPS random
 100 Disks can do 12,500 IOPS random
Oracle Data Files and Tablespaces
 Use files and tablespaces in order to spread data across multiple disk arrays (LUNs)
 Spreading all your user database data across all the data disks usually gives the best I/O performance
 Separate the log disks from the data disks
Create Database Disk View
C:\ – C:\oracle\data\redo1.dbf  (redo log)
E:\ – E:\oracle\data\ts1a.dbf   (Data_TS)
F:\ – F:\oracle\data\ts1b.dbf   (Data_TS)
G:\ – G:\oracle\data\ts1c.dbf   (Data_TS)
H:\ – H:\oracle\data\redo2.dbf  (redo log)
The Data_TS tablespace is spread across the E:, F:, and G: drives; the redo logs sit on their own drives (C: and H:).
RAID Storage Subsystems
 RAID (Redundant Array of Inexpensive
Disks)
 RAID can provide Fault Tolerance
 Hardware Fault Tolerance is more efficient
and faster than software fault tolerance
 RAID Levels
 RAID Performance
Fault tolerant hints
 OS and binaries are hardest to replace
 Transaction log and transaction log backups
are essential for recovery
 Database recovery can be time consuming
 RAID will only protect you against disk failures
 RAID is not a substitute for backups
RAID-0
 Striping across disks
 No fault tolerance
 Use all of the disk space
 Best performance
 Lose one disk and lose the data

        Disk 1   Disk 2   Disk 3   Disk 4
          1        2        3        4
          5        6        7        8
          9       10       11       12
         13       14       15       16
RAID-1
 Mirroring – provides a copy or mirror
 Good fault tolerance
 Disk Space Available = Disk Drives / 2
 RAID-1 does not have stripes

        Disk 1   Disk 1 (Mirror)
RAID-0+1 (a.k.a. RAID 10)
 Mirroring with striping
 Good fault tolerance
 Disk Space Available = Disk Drives / 2

[Diagram: the stripe pieces 1a–16a are laid out across one set of disks exactly as in RAID-0, and each piece has a mirror copy 1b–16b on a second set of disks]
RAID-1 and RAID-10 I/O Characteristics
 Physical Reads = Logical Reads
 Physical Writes = Logical Writes x 2
 Good read performance – can simultaneously read from both disks in the mirrored pair
RAID-5
 Striping
 Distributed parity
 Disk space = Disk Drives - 1

        Disk 1   Disk 2   Disk 3   Disk 4
          1        2        3      Parity
          4        5      Parity     6
          7      Parity     8        9
        Parity   10       11       12
RAID-5 Write
 Step 1: The existing data and parity blocks are read
 Step 2: The new parity is calculated (the controller's XOR engine calculates it)
 Step 3: The new data and new parity are written
[The distributed-parity layout from the previous slide is repeated on each of these step slides]
RAID-5 I/O Characteristics
 Reading simply reads off of the disk that contains the data
 Writing (the sketch below walks through it):
 Reads the old data stripe
 Reads the old parity stripe
 XORs the old data, old parity, and new data to calculate the new parity
 Writes out the new data and the new parity
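A minimal sketch of that read-modify-write sequence using byte-wise XOR; purely illustrative, since real controllers do this in hardware on whole stripe units:

```python
def raid5_small_write(old_data: bytes, old_parity: bytes, new_data: bytes):
    """Return (new_data, new_parity) for a RAID-5 small write.

    Two reads (old data, old parity) plus two writes (new data, new parity):
    the four I/Os behind the RAID-5 write penalty.
    """
    new_parity = bytes(d ^ p ^ n for d, p, n in zip(old_data, old_parity, new_data))
    return new_data, new_parity

# Sanity check on a 3-disk stripe: parity stays consistent after updating one block.
d0, d1 = b"\x0f\x0f", b"\xf0\x01"
parity = bytes(a ^ b for a, b in zip(d0, d1))
d0_new, parity_new = raid5_small_write(d0, parity, b"\xaa\x55")
assert parity_new == bytes(a ^ b for a, b in zip(d0_new, d1))
```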
RAID Cost Summary
[Chart: protection vs. cost – RAID-0 has the lowest cost and no protection, RAID-5 sits in between, RAID-1 offers the highest protection at the highest cost]
RAID Performance Summary
[Chart: protection vs. performance – RAID-0 is fastest but unprotected, RAID-1 combines good performance with the best protection, RAID-5 is the slowest of the three]
RAID Summary
 RAID-0
 No Protection
 Best Performance
 Least Cost
 RAID-1
 Best Protection
 Good Performance
 Most Expensive
 RAID-5
 Good Protection
 Worst Performance
 Least expensive fault-tolerant option
Fault Tolerance Tips
 Apply fault tolerance to selected components
 RAID-1 on OS and transaction log
 RAID-5 on data drives if mostly reads or read-only
 RAID-1 or RAID-10 on data drives if heavy update activity (lots of writes occur)
 RAID-5 on backup disks (RAID-1 or RAID-10 if backup is too slow)
Controller Options
 Read and Write Caches
 Read ahead
 Stripe Size
Read and Write Caches
 Memory cache on the controller
 For performance
 Cache Modes
 Read
 Write
Read Ahead Cache
 Used when sequential access is detected
 Reads more data than was requested and puts it in the cache
 If that data is requested later, the response is immediate
Read-ahead – Controller Read Cache Option
 May or may not be useful
 Oracle does its own read-ahead
 Excessive read-ahead by the controller can cause unnecessary I/Os
 Higher performance can be achieved by reading from cache, but it won't usually help Oracle performance
 500 GB database
 7 GB Oracle buffer cache
 1 GB of SAN controller cache
 What do you think?
Controller Write Cache
 The write is acknowledged as soon as it reaches the cache
 Write performance is increased
 If the cache fills, write performance degrades
 Hides the RAID-5 latency penalty
 The back end must keep up or the cache will fill
 The controller gives priority to writing out the data in the write cache over performing reads from disk
 Therefore, read performance sometimes suffers for the benefit of writes
 This is true for data files, NOT for log files
Oracle Writes to Data Files
 How an Oracle write occurs (sketched conceptually below):
 The block to be modified is read from disk into the database buffer cache
 The data is modified in the buffer cache
 The Oracle user needing that write can now continue with other work – the write is considered finished
 Since memory is limited, modified blocks must eventually be written to disk by the Database Writer (DBWR) process
 When the write to disk is made, the controller write cache then comes into play – the user never notices
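A conceptual sketch of that write-behind behaviour (a toy model only; the class and method names are mine and are not Oracle internals):

```python
class BufferCache:
    """Toy model: users dirty blocks in memory and continue working;
    a background writer flushes the dirty blocks to disk later."""

    def __init__(self):
        self.blocks = {}      # block_id -> current contents
        self.dirty = set()    # blocks modified but not yet written to disk

    def modify(self, block_id, data):
        self.blocks[block_id] = data
        self.dirty.add(block_id)
        # The user session returns immediately: it never waits on the data-file write

    def writer_flush(self, write_to_disk):
        # Stands in for the Database Writer: the only path that touches the data files
        for block_id in sorted(self.dirty):
            write_to_disk(block_id, self.blocks[block_id])
        self.dirty.clear()

cache = BufferCache()
cache.modify(42, b"new row data")                    # the user's work is "done" here
cache.writer_flush(lambda b, d: print("wrote", b))   # deferred physical write
```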
Significance of Oracle Writes
 Since Oracle writes occur in the buffer cache, no user waits on data file writes to disk!
 Thus read performance for data files is more important than write performance
 For log files, write performance is more important
When to Use the Controller Write Cache
 The controller write cache is not useful for data files and may hurt read performance
 The controller write cache is useful for the redo log – the log writer (LGWR) process
 Write caching can be configured on a per-LUN basis
Stripe Size
 How much data is written to each disk before moving on to the next (the stripe depth)
 Important for Oracle
 Seek time is part of total transfer time
 Configure the stripe size large enough to avoid causing extra I/Os
 You only want one seek per request
 You want to go to only one disk drive
 The default of 64K is usually pretty good
 If you want to try different stripe sizes, increase rather than decrease the stripe size
Stripe Size
 The data is distributed across all of the disk drives in 64K pieces (the address mapping is sketched below)

        Disk 1   Disk 2   Disk 3   Disk 4
          1        2        3        4
          5        6        7        8
          9       10       11       12
         13       14       15       16
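A sketch of how a logical byte offset lands on one of those 64K pieces; the constants and names are illustrative, assuming a simple RAID-0-style rotation across four disks:

```python
STRIPE_SIZE = 64 * 1024    # 64K stripe unit (the usual default)
NUM_DISKS = 4

def locate(logical_offset: int):
    """Return (disk, offset_on_disk) for a byte offset in the striped volume."""
    chunk = logical_offset // STRIPE_SIZE    # which 64K piece
    disk = chunk % NUM_DISKS                 # pieces rotate across the disks
    offset_on_disk = (chunk // NUM_DISKS) * STRIPE_SIZE + logical_offset % STRIPE_SIZE
    return disk, offset_on_disk

# An 8K Oracle block stays on a single disk as long as it doesn't cross a 64K boundary.
print(locate(0))           # (0, 0)
print(locate(70 * 1024))   # (1, 6144): second disk, 6K into its first piece
```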
Review of Oracle Storage Concepts
 Oracle performance is extremely dependent on the performance of the I/O subsystem
 Redo log performance limits transaction processing
 Read performance of data files limits query response time
 Write performance of data files slows down the DBWR processes
Redo Log Performance
 Redo log entries must be written whenever a
change is made to the data files or whenever
a commit is issued
 Redo log entries are written into the redo log
buffer
 The LGWR process takes redo log entries and
writes them out to the redo log
 If the LGWR cannot keep up, transactions wait
Read Performance
 Most critical
 Logical reads (from the buffer cache) take CPU and bus resources
 Each physical read takes time and resources
 Query processing time is directly related to I/O latency
 Each individual I/O operation takes a finite time
 Many operations are serial in nature
Read Performance Example
 An index lookup might take 50 individual reads from the data files
 If each I/O takes 10 ms, the total index read time is 500 ms
 What if I/O latency were 50 ms?
 The total index read time would be 2.5 seconds
 That is totally unacceptable (the arithmetic is repeated below)
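The same arithmetic in two lines; the 50-read figure is just the slide's example, not a general rule:

```python
reads_per_lookup = 50
for latency_ms in (10, 50):
    print(f"{latency_ms} ms per I/O -> {reads_per_lookup * latency_ms / 1000:.1f} s per index lookup")
# 10 ms -> 0.5 s, 50 ms -> 2.5 s
```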
Oracle I/O Performance Issues on SAN
 All I/O operations go through one 100 MB/sec link
 or a 200 MB/sec link for FC2
 Other systems in the SAN might be putting load on the storage array
 SAN software may be adding additional overhead
 Performance monitoring of I/O operations on each system only gives you part of the story
Fibre Channel Bandwidth
 Depending on the type of I/O operations being done, you may be disk bound or Fibre Channel bandwidth bound
 Other systems' load may be taking up your bandwidth
 Switches help, but shared storage in a single storage unit can be a bottleneck
 Use multiple paths to the storage if necessary
SAN I/O Performance
 The Fibre Channel bus is 100 MB/sec (FC1)
 Theoretical maximum of 351 GB/hr
 The Storage Processor can do 50 MB/sec
 Theoretical maximum of 175 GB/hr
 Add in software overhead (which varies by OS)
 More like 25 MB/sec = 87 GB/hr
SAN I/O Performance (cont.)
 Network
 Gigabit: 125 MB/sec = 439 GB/hr
 100baseT: 12.5 MB/sec = 44 GB/hr
 10baseT: 1.25 MB/sec = 4.4 GB/hr
 Tape can do 4 MB/sec
 Theoretical maximum of 14 GB/hr (the conversions are checked in the sketch below)
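The GB/hr figures on these two slides are straight unit conversions; a quick check, assuming 1 GB = 1024 MB (which reproduces the slide numbers):

```python
def gb_per_hour(mb_per_sec: float) -> float:
    return mb_per_sec * 3600 / 1024    # MB/sec -> GB/hr

rates = [("Fibre Channel (FC1)", 100), ("Storage Processor", 50),
         ("Software path", 25), ("Gigabit", 125),
         ("100baseT", 12.5), ("10baseT", 1.25), ("Tape", 4)]
for name, mb_s in rates:
    print(f"{name:20s} {mb_s:6.2f} MB/sec = {gb_per_hour(mb_s):6.1f} GB/hr")
# 100 MB/sec -> ~351 GB/hr, 50 -> ~175, 125 -> ~439, 4 -> ~14, matching the slides
```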
Comparison: Transfer Speed
[Bar chart, GB/hr from 0 to 500, comparing the throughputs from the previous two slides: Gigabit, Fibre Channel, Storage Processor, the software path, 100baseT, Tape, and 10baseT]
SAN Performance
 A 20-drive SAN might deliver 2,500 IOPS at 10 ms latency
 The same 20 drives are capable of 5,000 IOPS, but at 100 ms latency
 Latency is much more important than throughput
 The performance of the SAN at the lowest level comes down to the performance of the disk drives and the SAN configuration
What About RAID?
 RAID overhead can be significant
 RAID 0 offers no protection and no overhead
 RAID 1 or 0+1 offers good protection but with a 2x write overhead
 RAID 5 offers moderate protection but with a 4x write overhead
 RAID overhead can significantly affect performance (see the sketch below)
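A back-of-the-envelope sketch of how those write penalties turn front-end I/O into back-end disk I/O; the 70/30 read/write mix is an assumed example, not a figure from the slides:

```python
WRITE_PENALTY = {"RAID-0": 1, "RAID-1/10": 2, "RAID-5": 4}

def backend_iops(frontend_iops: float, write_fraction: float, raid_level: str) -> float:
    """Disk-level IOPS generated by a host workload under a given RAID level."""
    reads = frontend_iops * (1 - write_fraction)
    writes = frontend_iops * write_fraction
    return reads + writes * WRITE_PENALTY[raid_level]

for level in WRITE_PENALTY:
    print(level, backend_iops(1000, 0.30, level))
# 1,000 host IOPS at 30% writes: RAID-0 -> 1000, RAID-1/10 -> 1300, RAID-5 -> 1900 disk I/Os
```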
Storage Consolidation Might Cause Subsystem Overload
 Since several or many systems may share the same storage, the processor managing that storage might become overloaded
 The CPUs in the storage unit are finite and can only process a finite amount of data
 Monitoring your storage CPUs is useful
 RAID overhead = more CPU processing
SAN Software Could Cause Additional Latencies
 Many SAN systems require additional software that might increase latencies
 The more layers the I/Os have to go through, the more overhead is involved
 Oracle is very sensitive to I/O latency
I/O Monitoring Might Be
Inaccurate or Wrong
 Each system can only see the I/Os
that it generates
 The load on the storage system is the
cumulative I/O operations of all the
servers that access it
 Additional monitoring is necessary
Review
 Since most transactional databases use an 8K block size,
the optimum stripe depth for a striped RAID group is 8K.
What is wrong with this statement?
 What is the maximum I/Os per second that you should try to
push through a single 15,000 rpm disk?
 True or False: Slow data writes to disk are a significant
cause of Oracle user waits
 What is the difference between seek time and rotational
latency?
 What types of Oracle files are sequentially accessed?
 Under what conditions would RAID-5 be undesirable? Under
what conditions would RAID-5 be acceptable?
 True or False: Controller write cache is important in tuning
Oracle data write performance.
Summary
 Sizing is very important – the number of disks
 Choose the right RAID level
 Choose the right stripe size
 Don't always count on the SAN features helping
 A poorly designed SAN will perform poorly
 The performance of the disk drives determines the performance of the I/O subsystem
