Oracle 11G
Worked with Oracle databases for over two decades (starting with version 4)
Work history includes time at both Oracle Education and Oracle Consulting
Academic Background:
Data Modeling
Database Benchmarking
Articles for:
Oracle Magazine,
Oracle Informant
PC Week (eWeek)
Articles for:
www.linux.com
www.orafaq.com
Books by Author
Coming in 2008
Agenda
Partitioning Benefits
Partitioning History
Partitioning Options
Partitioning Advisor (if you're licensed)
Typical Data Warehousing Environment
TPC-H Data Warehouse Benchmark
Results TPC-H with Various Partition Strategies
What about OLTP Environments and the TPC-C/E
Lessons Learned (and their relevance/application)
Questions & Answers
Availability
More granular online/offline options
More granular rebuild/reorganization options
More granular object level backup/restore options
Capacity Management
Enables a Tiered Storage Architecture approach
More granular storage cost management decision points
Performance
Partition Pruning
Partition-Wise Joins
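As a rough illustration of partition pruning, the sketch below models a range-partitioned table and works out which partitions a range predicate can touch. The bounds and the names p1 to p4 are hypothetical, and this is a toy model of the idea rather than how the Oracle optimizer is implemented.

```python
from bisect import bisect_right

# Hypothetical range partitions. Each upper bound is exclusive, in the
# spirit of VALUES LESS THAN on a range-partitioned table.
UPPER_BOUNDS = [100, 200, 300, 400]
PARTITION_NAMES = ["p1", "p2", "p3", "p4"]


def partition_for(key):
    """Return the index of the partition that stores `key`."""
    idx = bisect_right(UPPER_BOUNDS, key)
    if idx == len(UPPER_BOUNDS):
        raise ValueError(f"key {key} is above the highest partition bound")
    return idx


def prune(low, high):
    """Names of the partitions that can hold keys in the closed range [low, high]."""
    first = partition_for(low)
    # Clamp so a predicate reaching past the last bound still ends at the last partition.
    last = min(bisect_right(UPPER_BOUNDS, high), len(UPPER_BOUNDS) - 1)
    return PARTITION_NAMES[first:last + 1]


if __name__ == "__main__":
    # A predicate such as "key BETWEEN 150 AND 250" only needs two of the four partitions.
    print(prune(150, 250))
```

Every partition left out of the returned list is one the query never has to read, which is where the performance benefit comes from.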
Why to Partition
Manageability 40%
Availability 20%
Capacity Management 20%
Performance 20%
Different Flavors:
Local Prefixed/Non-Prefixed
Global
All of these affect the explain plan
Very exciting new options
TPC-H
Typical Environments
               | OLTP               | ODS                   | OLAP          | DM/DW
Business Focus | Operational        | Operational, Tactical | Tactical      | Tactical, Strategic
End User Tools | Client Server, Web | Client Server, Web    | Client Server | Client Server, Web
DB Technology  | Relational         | Relational            | Cubic         | Relational
Trans Count    | Large              | Medium                | Small         | Small
Trans Size     | Small              | Medium                | Medium        | Large
Trans Time     | Short              | Medium                | Long          | Long
Size in Gigs   | 10 - 200           | 50 - 400              | 50 - 400      | 400 - 4000
Normalization  | 3NF                | 3NF                   | N/A           | 0NF
Data Modeling  | Traditional ER     | Traditional ER        | N/A           | Dimensional
TPC-H Benchmark
Industry Standard Data Warehouse Benchmark
URL: www.tpc.org/tpch
Spec: http://tpc.org/tpch/spec/tpch2.7.0.pdf
8 Tables
22 Queries (answer complex business questions)
Database scaling:
Factor = 1, 10, 30, 100, 300, 1000, 3000, 10000, 30000, 100000
Size GB = 1, 10, 30, 100, 300, 1000, 3000, 10000, 30000, 100000
[Figure: TPC-H tables with row counts as a function of scale factor (SF): SF * 6,000,000; SF * 800,000; SF * 10,000; 25; SF * 150,000; SF * 1,500,000. Callouts: Partitions, Sub-Partitions.]
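To make the scaling arithmetic concrete, here is a minimal sketch that computes approximate TPC-H row counts for a given scale factor. The slide shows only the multipliers; the table names, and the PART and REGION entries, are taken from the TPC-H specification rather than from the slide, so treat that mapping as an assumption. The function name table_rows is hypothetical.

```python
# Approximate rows per table at scale factor 1 (names per the TPC-H spec,
# an assumption here since the slide shows only the multipliers).
SCALED_ROWS = {
    "LINEITEM": 6_000_000,   # nominal figure; the generated count varies slightly
    "ORDERS": 1_500_000,
    "PARTSUPP": 800_000,
    "PART": 200_000,
    "CUSTOMER": 150_000,
    "SUPPLIER": 10_000,
}

# These two tables have a fixed size and do not grow with the scale factor.
FIXED_ROWS = {"NATION": 25, "REGION": 5}


def table_rows(scale_factor):
    """Return a dict of table name -> approximate row count at `scale_factor`."""
    rows = {name: base * scale_factor for name, base in SCALED_ROWS.items()}
    rows.update(FIXED_ROWS)
    return rows


if __name__ == "__main__":
    for name, count in table_rows(10).items():
        print(f"{name:<9} {count:>12,}")
```

The largest tables dominate the total at every scale factor, which makes them the natural first candidates when deciding what to partition and sub-partition.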
Disclosure Reports
http://tpc.org/tpch/results/tpch_perf_results.asp
------------------------------------------------------------------------------------------
| Id  | Operation        | Name       | Rows  | Bytes | TempSpc | Cost (%CPU) | Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |            | 42533 | 5648K |         |   641K  (1) | 01:57:41 |
|   1 | SORT GROUP BY    |            | 42533 | 5648K |    105M |   641K  (1) | 01:57:41 |
|*  2 | HASH JOIN        |            |  715K |   92M |         |   631K  (1) | 01:55:51 |
|   3 |                  | H_NATION   |    25 |   725 |         |         (0) | 00:00:01 |
|*  4 | HASH JOIN        |            |  715K |   72M |         |   631K  (1) | 01:55:51 |
|   5 |                  | H_SUPPLIER |  100K |  781K |         |    646  (1) | 00:00:08 |
|*  6 | HASH JOIN        |            |  720K |   68M |     68M |   631K  (1) | 01:55:44 |
|*  7 | HASH JOIN        |            |  751K |   59M |    232M |   589K  (1) | 01:48:10 |
|*  8 | HASH JOIN        |            | 3004K |  197M |   4984K |   485K  (1) | 01:28:57 |
|*  9 |                  |            |  100K | 3808K |         |  11805  (1) | 00:02:10 |
|  10 |                  |            |   60M | 1716M |         |   342K  (1) | 01:02:52 |
|  11 |                  |            |   15M |  200M |         |  72122  (1) | 00:13:14 |
------------------------------------------------------------------------------------------
Method of Attack
Since many data warehouses are used for data mining, we can't always know every query likely to run, so we need an aggregate measure of success
Thus we'll compare the benchmark's weighted performance scores for the TPC-H using various partitioning schemes (all within spec, of course)
The goal is to find the best overall partitioning
Then we'll examine some specific explain plans
Is that It?
No, just six very obvious high-level scenarios
Your selections and actual mileage will vary
Experimentation usually yields the best results
Always trust empirical results over conjecture
So an improved response time beats a better-looking explain plan
Remember, DWs usually have unpredictable queries
So don't tune for just a few queries; look for the best overall and/or more generic performance solution
TPC-C Benchmark
Historical Industry Standard OLTP Benchmark
URL: www.tpc.org/tpcc
Spec: http://tpc.org/tpcc/spec/tpcc_current.pdf
Probably the most used & widely quoted benchmark
But suffers from overly simplistic design & code logic
Generally considered unreliable with modern RDBMS
But still a decent rough sounding board for many
Being replaced by the newer TPC-E (later slides)
[Figure labels: # Terminals/Warehouse (i.e. concurrent users); Clustered; Partitions; Sub-Partitions]
TPC-E Benchmark
Emerging Industry Standard OLTP Benchmark
URL: www.tpc.org/tpche
Spec: http://tpc.org/tpce/spec/TPCE-v1.5.1.pdf
Very new and still evolving but highly promising
Not too many published TPC-E results as of yet
Design not compromised by RDBMS features
Much more realistic (i.e. real world) in nature
Nowhere near as easy as the old TPC-C test
[Chart values: 582, 579, 578]
Single block reads 32 ms!
DISTRICT table needs to be clustered
Architecture Findings
               | OLTP                           | DM/DW
Business Focus | Operational                    | Tactical, Strategic
End User Tools | Client Server, Web             | Client Server, Web
DB Technology  | Relational                     | Relational
Trans Count    | Large                          | Small
Trans Size     | Small                          | Large
Trans Time     | Short                          | Long
Size in Gigs   | 10 - 200                       | 400 - 4000
Normalization  | 3NF                            | 0NF
Data Modeling  | Traditional ER                 | Dimensional
Finding        | Mostly Partition Elimination   | Mostly Partition-Wise Join
Design Goal    | Design to eliminate per object | Design to parallelize across objects
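The partition-wise join idea can be sketched in a few lines: when both tables are partitioned the same way on the join key, each pair of matching partitions can be joined independently, and therefore in parallel. The code below is a toy model under that assumption. The function names and sample rows are hypothetical, and the loop runs serially where a database would hand each pair to a separate parallel worker.

```python
from collections import defaultdict


def hash_partition(rows, key, n_parts):
    """Split rows into n_parts buckets by hashing the join key."""
    parts = [[] for _ in range(n_parts)]
    for row in rows:
        parts[hash(row[key]) % n_parts].append(row)
    return parts


def hash_join(left, right, key):
    """Plain in-memory hash join: build on left, probe with right."""
    index = defaultdict(list)
    for l_row in left:
        index[l_row[key]].append(l_row)
    return [{**l_row, **r_row} for r_row in right for l_row in index.get(r_row[key], [])]


def partition_wise_join(left, right, key, n_parts=4):
    """Join two equipartitioned tables one partition pair at a time."""
    left_parts = hash_partition(left, key, n_parts)
    right_parts = hash_partition(right, key, n_parts)
    joined = []
    # Matching keys always land in the same partition number on both sides,
    # so each pair is an independent unit of work.
    for l_part, r_part in zip(left_parts, right_parts):
        joined.extend(hash_join(l_part, r_part, key))
    return joined


if __name__ == "__main__":
    customers = [{"cust": c, "name": f"c{c}"} for c in range(5)]
    orders = [{"order_id": i, "cust": i % 7} for i in range(20)]
    print(len(partition_wise_join(customers, orders, "cust", n_parts=3)))
```

Each pair also builds a smaller hash table than one large join would, which is part of why the approach suits the large transactions typical of DM/DW workloads.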
Thank you
Presenter: Bert Scalzo
E-mail: Bert.Scalzo@Quest.com