com)
Server Technology
System Management & Performance
Oracle Denmark APS
July 2000
Structure of paper
The rest of the paper uses the following outline:
1. Basic Method for Query Tuning
2. Initial Examples
3. Defining Throwaway
4. Oracle Version Dependencies
5. Detailed Description of Throwaway in Row Sources
6. Secondary Throwaway
7. Tuning SQL by Eliminating Throwaway, the Method
8. Examples on Applying the Method
9. Using SQL Trace and tkprof
10. Tracing Issues
11. Miscellaneous Issues
If you are already familiar with the quantification of row sources found in the tkprof
output, you can skip directly to section 3, continue with sections 6, 7 and 8, and use
sections 4, 5, 9, 10 and 11 as reference.
2. Initial Examples
Let's demonstrate the concept of throwaway by looking at two simple examples:
Query 1:
Select CLASS_ID
From CLASSES
Where TYPE = 'S/EC'
Output from tkprof:
Rows     Execution Plan
-------  ---------------------------------------------
   2386  TABLE ACCESS (BY INDEX ROWID) OF 'CLASSES'
   2387    INDEX (RANGE SCAN) OF 'CLASSES_TYPE' (NON-UNIQUE)
Execution Plan
----------------------------------------------
TABLE ACCESS (BY INDEX ROWID) OF 'CLASSES'
  INDEX (RANGE SCAN) OF 'CLASSES_TYPE' (NON-UNIQUE)
The experienced SQL tuner will of course immediately spot the probable cause: the index used is
a single-column index, and by adding an extra column - DAYS - to the index the predicate
selectivity in the query will be matched by the selectivity in the index.
Now a trickier example:
Query 3:
Select *
From TABLE2
Where COL1 = 102
and COL2 = 23
(Index: I_COL1_COL2 on TABLE2(COL1, COL2))
A simple execution plan (by SQL*Plus AUTOTRACE or manual explain plan) would show:
TABLE ACCESS (BY INDEX ROWID) OF TABLE2
INDEX ACCESS (RANGE SCAN) OF I_COL1_COL2 (NON-UNIQUE)
This would lead most SQL tuners to the conclusion that this query is performing as well as
it can (all predicates can be found in the leading columns of a concatenated index).
The tkprof output in this case could show:
Rows     Execution Plan
-------  --------------------------------------------------
     71  TABLE ACCESS (BY INDEX ROWID) OF TABLE2
   3401    INDEX ACCESS (RANGE SCAN) OF I_COL1_COL2 (NON-UNIQUE)
As in the previous example, this shows that work is done that gets thrown away
before output.
This should only happen if not all predicates are used for the index retrieval.
Now the problem is to determine why only a part of the predicates is used for index retrieval.
A likely candidate is an implicit type conversion on the predicate
corresponding to the non-leading column in the index - COL2 (the leading column must be in use, otherwise the index would not have been used at all).
This example shows that by using the Rows column from the execution plan in the tkprof
output, more information about the performance of a particular query can be obtained than by
using the execution plan alone.
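The comparison of row counts made in the two examples can be sketched in a few lines of Python. This is only an illustration: the function name primary_throwaway is hypothetical, and the figures are the ones quoted in the tkprof listings for Query 1 and Query 3. The difference of a single row in Query 1 is consistent with the extra row that index range scans report per lookup (described later in the paper), so only Query 3 shows real throwaway.

```python
def primary_throwaway(input_rows, output_rows):
    """Rows delivered to a row source that do not appear in its output."""
    return input_rows - output_rows

# Row counts quoted in the tkprof listings: (index rows, table rows).
examples = {"Query 1": (2387, 2386), "Query 3": (3401, 71)}

results = {}
for name, (index_rows, table_rows) in examples.items():
    thrown = primary_throwaway(index_rows, table_rows)
    # Store the absolute throwaway and its share of the input rows.
    results[name] = (thrown, round(thrown / index_rows, 3))
    print(name, results[name])
```

For Query 3 roughly 98% of the rows fetched through the index never reach the output, which is the signal that not all predicates took part in the index lookup.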
3. Defining Throwaway
3.1 Formal definition of throwaway
After the initial examples it is time to formalize the definition of the term throwaway.
There are two types of throwaway:
Primary throwaway is the number of rows from the input to a row source that cannot
be found in the output from that particular row source.
Secondary throwaway is the number of rows generated as output from a row source
which are then thrown away in a following row source with primary throwaway.
3.2 Primary vs. secondary throwaway
To illustrate the difference between primary throwaway and secondary throwaway, consider the
following graphical representation of an execution plan:
[Figure: Row Source 2 delivers #R2 rows to Row Source 1, which delivers #R1 rows.]
Row Source 2 generates #R2 rows, which are then processed in Row Source 1,
resulting in a final set of #R1 rows.
Primary throwaway in Row Source 1 can be described as follows: this row source has to process all the
input rows (#R2) but only delivers #R1 rows of output. It is the row source itself that, in
processing the input rows, throws away the ones not making it to the output.
Row Source 2 will in this case generate #R2 rows as output, which represents a
certain amount of work.
The secondary throwaway is then the part of this work already done which is thrown away in
a following row source (Row Source 1 in this case).
The distinction between these two types of throwaway is very important when analysing
complex queries (for example joins).
To evaluate the total work thrown away by an operation, the implied secondary throwaway
should be included in the evaluation. This is further described in 6. Secondary
Throwaway.
Execution Plan
----------------------------------------------
TABLE ACCESS (BY INDEX ROWID) OF 'CLASSES'
  INDEX (RANGE SCAN) OF 'CLASSES_TYPE' (NON-UNIQUE)
Throwaway in a table access (full) operation can only be caused by predicates that do
not use an index, and the only solution is to use an index for all the predicates that
result in (a significant part of) the throwaway.
Limitation:
Not all predicates can be indexed.
As with the table access (full) row source, throwaway can only be caused by non-indexed
predicates, and the cure is to add the non-indexed predicates to the index already in
use.
Limitation:
Not all predicates can be indexed.
In the case where not all leading columns in the index are used for the index lookup (case
1 above) the solution could be to rearrange the sequence of the columns in the index to
match the predicates used.
In the case with a function disabling the full usage of predicates for index lookup (case
2 above) the only solution is to change the statement so that the function is not
performed on the indexed column side of the predicate.
In the case with a range scan on a leading column (case 3 above) the solution could be
to rearrange the sequence of the indexed columns to move the column being range
scanned to be the last column in the index.
Limitations:
Rearranging the column sequence in concatenated indexes might result in unwanted
behaviour for other statements previously using the same index.
Not all function calls can be avoided.
With range scans on several columns in the index it can prove impossible to find a
sequence that gives no throwaway.
Nothing can be done by changing the statement or by adding (or removing) indexes
except if a predicate limiting the number of duplicate rows can be squeezed into the
query to be evaluated before the sort(unique) operation.
Sometimes it is possible to change the data model, or to use the existing data model
in another manner, to avoid the possible duplicate rows and thereby avoid the
sort(unique) operation.
Limitations:
The cures described might - if possible at all - be very complex to implement.
B-D
D - A (Outer joins: always 0)
C-A
Note that the output from a nested loops join operation is the number of rows
retrieved from the inner table and not the number of rows from the join
operation. For normal (inner) joins these two numbers will be identical.
In case of an outer join these two row counts will not be identical, and to
determine the number of rows output from the nested loops operation use the
number of rows from the outer table minus throwaway before join:
B - (B - D) = D (and remember that this is for an index unique scan on the inner
table).
Card(TABLE_O) - B
B-A
D-B-C
Calculating throwaway, scenario 2 (inner table access via non-unique scan on index):
Rows  Execution plan
----  -----------------
A     NESTED LOOPS
B       TABLE ACCESS (FULL) OF TABLE_O
C       TABLE ACCESS (BY INDEX ROWID) OF TABLE_I
D         INDEX ACCESS (RANGE SCAN) OF (NON-UNIQUE)
For Oracle 7.3 and 8.0 the throwaway can be calculated as:
Throwaway from outer table (before join):  B - (D - C)
Throwaway from outer table (during join):  If B > A then: at least B - A
                                           Otherwise indeterminable
                                           Always 0 for outer joins
Throwaway from inner table:                C - A
For Oracle8i the throwaway can be calculated as:
Throwaway from outer table (before join):  Card(TABLE_O) - B
Throwaway from outer table (during join):  If B > A then: at least B - A
                                           Otherwise indeterminable
                                           Always 0 for outer joins
Throwaway from inner table:                D - B - C
Calculating throwaway, scenario 3 (inner table access using full table scan):
Rows  Execution plan
----  -----------------
A     NESTED LOOPS
B       TABLE ACCESS (FULL) OF TABLE_O
C       TABLE ACCESS (FULL) OF TABLE_I
For Oracle 7.3 and 8.0 the throwaway can be calculated as:
Throwaway from outer table (before join):  B - (C / Card(TABLE_I))
Throwaway from outer table (during join):  If B > A then: at least B - A
                                           Otherwise indeterminable
                                           Always 0 for outer joins
Throwaway from inner table:                C - A
For Oracle8i the throwaway can be calculated as:
Throwaway from outer table (before join):  Card(TABLE_O) - B
Throwaway from outer table (during join):  If B > A then: at least B - A
                                           Otherwise indeterminable
                                           Always 0 for outer joins
Throwaway from inner table:                B * Card(TABLE_I) - A
In most practical cases of this particular scenario only the throwaway from the inner
table is important.
The formulas above take the following peculiarity in Oracle8i into account:
All index scans (both unique and range scans) report one extra row per lookup (in Oracle
7.3 and 8.0 this only happens for range scans).
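As an illustration, the scenario 2 calculation can be written as a small Python function. This is a sketch under stated assumptions: it takes the second group of formulas listed for scenario 2 to be the Oracle8i ones and reads them as throwaway before the join, during the join, and from the inner table. The function name nl_throwaway_8i and the sample row counts are hypothetical.

```python
def nl_throwaway_8i(a, b, c, d, card_outer, outer_join=False):
    """Throwaway estimates for a nested loops join, scenario 2 (assumed Oracle8i counts).

    a: rows from NESTED LOOPS, b: rows from the outer table (full scan),
    c: rows from the inner table, d: rows from the inner index range scan,
    card_outer: cardinality of the outer table.
    """
    before_join = card_outer - b
    if outer_join:
        during_join = 0            # always 0 for outer joins
    elif b > a:
        during_join = b - a        # a lower bound ("at least B - A")
    else:
        during_join = None         # indeterminable from the row counts
    # Each lookup reports one extra index row, hence the "- b" term.
    inner = d - b - c
    return before_join, during_join, inner

# Hypothetical row counts, for illustration only.
print(nl_throwaway_8i(a=40, b=100, c=40, d=500, card_outer=1000))
```

With these made-up figures the outer table scan discards 900 rows, the join discards at least 60 outer rows, and 360 index entries are fetched for the inner table without producing a row.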
Calculating throwaway, generic scenario:
Not all nested loops joins fall into the above categories (especially joins with
one or both row sources not being table accesses), so it is necessary to be able to
determine throwaway from a generic nested loops operation:
Rows  Execution plan
----  -----------------
A     NESTED LOOPS
B       Outer Row Source
C       Inner Row Source
If the join operation is part of a larger join with several join operations, refer to
section 6. Secondary Throwaway.
Throwaway from the outer table before the nested loops join operation can be cured
using the same principles as for curing throwaway from a table access (indexed or full
table scan).
The cure for throwaway from the inner table is the same as for throwaway from a table
access (indexed or full table scan).
If the join operation is throwing rows from the outer table (and thereby subjecting this
table access to secondary throwaway) this can only be counteracted by reversing the
join order.
Whether this reversal will indeed reduce the number of rows processed depends on the
predicates on both the inner and the outer table.
Limitations:
Change of join order cannot always remove (join based) throwaway from the outer
table.
Removal of throwaway from the inner table and the outer table (before the join) is
subject to the same limitations as throwaway from a table access.
C-B
If C > A then:
At least C - A
Otherwise indeterminable
Always 0 for outer joins
E-D
Card(TABLE_O) - C
If C > A then:
At least C - A
Otherwise indeterminable
Always 0 for outer joins
E-D
Only throwaway from one table can be cured and this is done by changing the join
operation to a nested loops operation with the row source being subjected to throwaway
as the inner table and at the same time ensure that all join and non-join predicates on
this table can be evaluated using an index.
Limitations:
Changing a sort/merge join operation to a nested loops operation with fully indexed
predicates on the inner table is not always possible - for example not all predicates are
indexable.
Furthermore, a change from a sort/merge join to a nested loops operation with indexed
inner table access will change the I/O pattern from multiblock reads to single block
access (by going from full table scans to indexed lookups), which can easily result in
more time spent waiting for I/O.
Card(TABLE_O) - B
If B > A then:
At least B - A
Otherwise indeterminable
Always 0 for outer joins
C-A
Cure and limitations: see 5.7 Throwaway in row source: SORT/MERGE join operation
Execution Plan
----------------------------------------------
FILTER
  TABLE ACCESS (FULL) OF 'COURSES'
  INDEX (RANGE SCAN) OF 'CLASS_CRS' (NON-UNIQUE)
The primary row source to the filter operation is the full table scan of COURSES (this is the
row source being filtered).
The filtering can be done using one or more subqueries. These subqueries are secondary input
row sources to the filter row source and are executed very much like the row retrieval from the
inner table in a nested loops join operation.
In the above example the index range scan on CLASS_CRS is executed for each row in the primary
row source to determine which rows should be filtered away.
The filter row source will always introduce throwaway, except when it does not filter anything
away.
Another source of throwaway from filter operations is the number of rows from the secondary
row sources (those used for filtering).
It can be disputed whether these secondary row sources are thrown away (they do not appear in
the output) or are just separate queries requiring work to be evaluated.
No matter what terms are used, these secondary row sources represent work done, and they have been
seen on several occasions to process a bigger number of rows than the primary query itself.
Therefore a separate tuning effort on the secondary row sources can be beneficial.
Calculating throwaway:
The amount of throwaway can be found by determining how much smaller the number of output
rows is than the number of rows from the primary input row source.
In the above example this results in a throwaway of 432 rows (1019 - 587).
Cure:
Limitations:
The filter operation is often caused by the data model not being able to satisfy specific
requirements, and to change the filter into a join or removing it often requires major
changes in the application.
6. Secondary Throwaway
In more complex statements it becomes less and less simple to quantify the amount of secondary
throwaway.
Consider the following example:
[Figure: tree of five operations, Operation #1 to Operation #5, each annotated with its row count Rows #1 to Rows #5. Operation #1 is at the top and consumes the output of Operation #2.]
If Operation #1 throws away rows from Operation #2, thereby subjecting not only Operation
#2 but also Operation #4 and Operation #5 to secondary throwaway, then it can be difficult to
determine precisely how many rows from Operation #4 and Operation #5 have been thrown
away.
As a pragmatic approach to quantifying this secondary throwaway, it is assumed that the
secondary throwaway is distributed evenly across all the row sources contributing to the row
source being subjected to primary throwaway, and that the fraction of secondary throwaway
equals the fraction of primary throwaway.
This leads to the following formula for calculating secondary throwaway:
Primary Throwaway (fraction of input) * sum(rows in contributing row sources)
Based on the above example, the secondary throwaway generated by the primary throwaway in
Operation #1 can then be quantified by:
Primary Throwaway: Rows #2 - Rows #1
Primary Throwaway fraction: (Rows #2 - Rows #1)/Rows #2
Sum(rows in contributing row sources): Rows #3 + Rows #4 + Rows #5
This leads to:
Secondary throwaway: (Rows #2 - Rows #1)/Rows #2 * (Rows #3 + Rows #4 + Rows #5)
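The formula can be expressed as a short Python function, shown here as a sketch only. The function name secondary_throwaway and the row counts in the usage lines are hypothetical; they are chosen so the arithmetic is easy to follow and are not taken from any trace.

```python
def secondary_throwaway(input_rows, output_rows, contributing_rows):
    """Estimate secondary throwaway under the even-distribution assumption.

    input_rows / output_rows: rows into and out of the row source with
    primary throwaway (Rows #2 and Rows #1 in the example).
    contributing_rows: row counts of the row sources that produced the input
    (Rows #3, Rows #4 and Rows #5 in the example).
    """
    fraction = (input_rows - output_rows) / input_rows
    return fraction * sum(contributing_rows)

# Hypothetical figures: Rows #1 = 25, Rows #2 = 100, Rows #3..#5 = 100, 400, 500.
estimate = secondary_throwaway(100, 25, [100, 400, 500])
print(estimate)
```

Here three quarters of the input is discarded, so three quarters of the 1000 contributing rows are counted as secondary throwaway.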
Execution plan
--------------
NESTED LOOPS
  NESTED LOOPS
    NESTED LOOPS
      TABLE ACCESS (BY INDEX ROWID) OF <A>
        INDEX ACCESS (RANGE SCAN) OF
      TABLE ACCESS (BY INDEX ROWID) OF <B>
        INDEX ACCESS (RANGE SCAN) OF
    TABLE ACCESS (BY INDEX ROWID) OF <C>
      INDEX ACCESS (RANGE SCAN) OF
  TABLE ACCESS (BY INDEX ROWID) OF <D>
    INDEX ACCESS (RANGE SCAN) OF
To ease the interpretation of what the execution plan shows, the following graphical representation
is helpful:
[Figure: the execution plan drawn as a tree, with each row source numbered and annotated with its row count]
Row source                 Rows
 1) Nested Loops #3          10
 2) Nested Loops #2       20560
 3) Nested Loops #1       10920
 4) Table <A>               110
 5) Index Range (on <A>)    111
 6) Table <B>             10920
 7) Index Range (on <B>)  11030
 8) Table <C>             20560
 9) Index Range (on <C>)  31480
10) Table <D>                10
11) Index Range (on <D>)  20570
First, let's determine the amount of throwaway in the row sources (the primary throwaway).
Assuming Oracle8i, the throwaway can be determined using the descriptions found in 5.6
Throwaway in row source: NESTED LOOPS join operation.
(TA: ThrowAway)
The only row source subjected to significant (primary) throwaway is marked in bold.
As it has now been determined that there is throwaway from a row source whose input is generated by
the combined work of several other row sources, these row sources are subjected to secondary
throwaway.
As the primary throwaway fraction is at least 20,550/20,560 = 0.9995, then according to the above
quantification of secondary throwaway at least the fraction 0.9995 of all the rows in the first two
join operations, the first three table accesses (tables <A>, <B> and <C>) and the corresponding
index accesses are thrown away.
According to the above formula the secondary throwaway can be calculated as:
primary throwaway fraction * sum(rows in contributing row sources)
= 0.9995 * sum(row sources 2, 3, 4, 5, 6, 7, 8 and 9)
= 0.9995 * (111 + 110 + 11,030 + 10,920 + 10,920 + 31,480 + 20,560 + 20,560) rows
= 105,638 rows
This means that the total throwaway introduced in the last join operation is (at least):
20,550 rows (primary throwaway) + 105,638 rows (secondary throwaway)
= 126,188 rows out of the total 126,281 rows processed in the query, which equals a
throwaway of 99.93% of all the rows processed.
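The arithmetic of this example can be checked with a few lines of Python. This is a sketch only: the variable names are illustrative, the row counts and the total of rows processed are the ones quoted in the text, and the fraction is rounded to four decimals to mirror the 0.9995 used in the calculation.

```python
# Row counts of the contributing row sources (2 to 9), as listed in the text.
contributing = [111, 110, 11030, 10920, 10920, 31480, 20560, 20560]

primary = 20560 - 10                     # rows thrown away in the last join
fraction = round(primary / 20560, 4)     # rounded as in the text
secondary = round(fraction * sum(contributing))

total_throwaway = primary + secondary
total_rows = 126281                      # total rows processed, from the text
share = round(100 * total_throwaway / total_rows, 2)

print(fraction, secondary, total_throwaway, share)
```

Using the unrounded fraction instead would shift the secondary figure by a row or two, which does not change the conclusion.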
Analyze query
Suggest changes to improve performance
Implement changes
Test improvement
Repeat until goal reached
By using identification and elimination of throwaway for analysis and suggested changes the
method can be expanded:
1. Calculate throwaway (primary and secondary) for relevant row sources in the execution plan
2. Identify row sources that introduce a significant amount of throwaway
3. Map row sources to the part of the statement they correspond to and determine the predicates
for relevant row sources
4. Suggest changes in execution plan to reduce secondary throwaway. This must be done by
moving the row source to be evaluated earlier in the execution plan
5. If a row source is introducing primary throwaway (or will be expected to do so after reducing
secondary throwaway) then analyze whether it is possible to reduce the amount of
throwaway
6. Identify the means to change the existing execution plan into the proposed one
7. Implement changes
8. Verify that expected execution plan is obtained
9. Test improvement
10. Repeat until goal reached or all throwaway has been eliminated
Re 1): The calculation of throwaway follows the formulas described for the different row
sources (for primary throwaway) and for complex statements (for secondary throwaway).
Re 2): A row source can be said to introduce a significant amount of throwaway if the total
amount of throwaway (primary + secondary throwaway) generated by that particular row source
is a significant part of all the rows processed within the statement.
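Step 2 can be pictured as a simple ratio test. The sketch below is an illustration only: the function name is_significant is hypothetical, and the 10% threshold is an assumption made here, since the paper does not give a numeric limit for what counts as significant.

```python
def is_significant(primary, secondary, total_rows_processed, threshold=0.10):
    """True if a row source's total throwaway is a large share of all rows.

    threshold is an assumed cut-off, not a value from the paper.
    """
    return (primary + secondary) / total_rows_processed >= threshold

# Figures in the style of the secondary throwaway example in section 6.
print(is_significant(20550, 105638, 126281))
# A row source throwing away only a handful of rows in the same statement.
print(is_significant(8, 0, 126281))
```

In practice the threshold would be chosen per statement; the point is only that primary and secondary throwaway are added before comparing against the total.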
Re 3): This part will not be described in sufficient detail to solve all possible scenarios in this
version of the paper.
The purpose of the step is to take all the row sources in the execution plan and identify
what part of the statement each row source corresponds to (that is, what part of the statement the row
source is solving).
It could be argued that only the relevant row sources need to be mapped to the statement, but in
practice this limitation will invariably lead to mistakes, especially when dealing with complex
statements.
It is normally straightforward to map row sources to the statement when the statement only
addresses each table once, but in the cases where the same table is referenced several times in
the statement (including views) it can be a challenge to identify exactly where a particular table
access row source belongs in the statement.
The interesting part of this mapping is determining which predicates are in action for which
row source. This might seem trivial, but consider the following simple join:
select cls.start_date, crs.short_name
from classes cls, courses crs
where cls.instr_id = crs.dev_id
and cls.status = 'AVAI'
and crs.cat_id = 34;
If the execution plan for the above statement looks like:
Execution plan
--------------
NESTED LOOPS
  TABLE ACCESS (BY INDEX ROWID) OF CLASSES
    INDEX ACCESS (RANGE SCAN) OF
  TABLE ACCESS (BY INDEX ROWID) OF COURSES
    INDEX ACCESS (RANGE SCAN) OF
Then the join order would be CLASSES - COURSES and the predicates for CLASSES and
COURSES would be:
CLASSES:
status = 'AVAI'
COURSES:
dev_id = <cls.instr_id>
cat_id = 34
If on the other hand the join order is COURSES - CLASSES, then the predicates for CLASSES
and COURSES would be:
COURSES:
cat_id = 34
CLASSES:
instr_id = <crs.dev_id>
status = 'AVAI'
This demonstrates that the predicates, and therefore the optimal indexes (for eliminating
primary throwaway), depend on the join order, and that focus on the predicates is important.
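The way a join predicate changes sides with the join order can be sketched in Python. This is a minimal illustration under assumptions: the function name predicates_by_join_order and the data layout are hypothetical, the status literal is written as a quoted string, and a join predicate is treated as usable only on the later of its two tables, as in a nested loops join.

```python
def predicates_by_join_order(join_order, filters, joins):
    """Map each table to the predicates available when it is accessed.

    filters: {table: [non-join predicate text]}
    joins:   [((table, column), (table, column))] equality join predicates
    """
    seen, result = [], {}
    for table in join_order:
        preds = []
        for (lt, lc), (rt, rc) in joins:
            # The join predicate acts as a lookup on the later table,
            # with the earlier table supplying the value.
            if lt == table and rt in seen:
                preds.append(f"{lc} = <{rt}.{rc}>")
            elif rt == table and lt in seen:
                preds.append(f"{rc} = <{lt}.{lc}>")
        preds.extend(filters.get(table, []))
        result[table] = preds
        seen.append(table)
    return result

filters = {"CLASSES": ["status = 'AVAI'"], "COURSES": ["cat_id = 34"]}
joins = [(("CLASSES", "instr_id"), ("COURSES", "dev_id"))]

print(predicates_by_join_order(["CLASSES", "COURSES"], filters, joins))
print(predicates_by_join_order(["COURSES", "CLASSES"], filters, joins))
```

The two printed mappings correspond to the two predicate listings given for the join orders CLASSES - COURSES and COURSES - CLASSES.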
Re 4): Secondary throwaway - which is often seen to play the biggest part in work wasted by
throwaway in complex joins - needs as described above changes in the execution plan so that
the row source introducing the (primary) throwaway is evaluated earlier in the execution plan.
Moving a row source to an earlier evaluation in the execution plan will in most cases also
change the predicates for that particular row source and potentially all other row sources
between the old and the new position in the execution plan.
Changing the predicates for a row source has a direct influence on the ability of the row source
to throw rows away.
A prerequisite for reducing secondary throwaway by moving the row source with the primary
throwaway is that this row source is still capable of throwing rows away after having been
moved to the new position in the execution plan.
A second prerequisite is that the row sources previously subjected to secondary
throwaway have predicates that, after the change in execution plan, can limit the number of rows
processed.
Another way of stating this second prerequisite is that there must be some predicates connecting
the row source introducing the primary throwaway and the row source(s) subjected to secondary
throwaway and that these predicates, by being reversed by the proposed change in the
execution plan, can actually limit the number of rows processed in the row source(s) originally
subjected to secondary throwaway.
Therefore it is necessary to evaluate what predicates are available after the proposed change in
execution plan for both the row source introducing the primary throwaway and the row
source(s) subjected to secondary throwaway.
This is a partial repetition of step 3, but using the new order of evaluation in the execution plan.
Re 5): As primary throwaway is local to one row source it is rather straight forward to verify if
there is a cure available and what means can be used to implement the cure.
The local nature of primary throwaway also makes it more likely that the cure can be
implemented without side-effects in other parts of the execution plan which has the pleasant
consequence of easing testing.
Re 6): The means to implement the wanted execution plan include:
Creation of indexes
Adding/changing hints (for specifying join execution plans the ORDERED,
USE_NL, USE_HASH and INDEX hints are often quite useful)
Changing optimizer mode
Changing other parameters influencing the optimizer
Changing statistics
Changing the where clause (for example: forcing join order by disabling index usage
under the Rule Based Optimizer)
Changing the statement to allow other access paths
Changing table definitions (for example by changing a NULL column to NOT
NULL in order to allow anti-join optimization)
Re 8): It is recommended that the expected execution plan is verified before testing the result, especially if there is a possibility of getting long response times.
For this purpose the AUTOTRACE TRACE EXPLAIN setting in SQL*Plus (which only
outputs the execution plan) is sufficient and rather easy to use.
Re 9): The actual test should be performed using SQL trace, as this will verify that the expected
elimination of throwaway has been successful.
Row sources:
TABLE ACCESS (BY INDEX ROWID) OF 'CLASSES'
INDEX (RANGE SCAN) OF 'CLASS_CRS'
Corresponds to table CLASSES in the from clause with (join and non-join) predicates:
crs_id = <courses.crs_id>
days = 15
Step 4,1 (Suggest changes in execution plan to reduce secondary throwaway)
The only possible solution to the secondary throwaway from the COURSES table access is to
move the row source introducing the primary throwaway (which causes the
secondary throwaway) to an earlier evaluation in the execution plan.
This means that the join order should be reversed:
CLASSES - COURSES
Let's analyze this join order for cardinalities and throwaway:
Join order: CLASSES - COURSES
Cardinalities after join operation:
CLASSES:
Predicates: days = 15
Cardinality: 4 (select count(*) from classes where days = 15)
COURSES:
When only looking at throwaway from the tables, this join order will result in a total of 8
retrieved rows (4 + 4), which compares favorably to the original query's 1022 rows (1018
+ 4).
Step 5,1 (Reduce primary throwaway)
The chosen join order (CLASSES - COURSES) will result in following predicates (as described
above):
CLASSES:
days = 15
Available indexes: None
Optimal index: (DAYS)
COURSES:
crs_id = <classes.crs_id>
Available indexes: (CRS_ID)
Optimal index: (CRS_ID)
To reduce primary throwaway the aim is to have fully indexed predicates where beneficial:
The optimal index on CLASSES is not available, and as this index - CLASSES(DAYS) - will be very selective (only retrieving 4 rows out of all the rows in CLASSES) it is
assumed that the creation of this particular index will be beneficial.
The optimal index on COURSES is available and no further considerations are needed.
Steps 4 and 5 then result in the following first suggestion for changes to reduce throwaway to
obtain a better performing query:
CLASSES:
Non-join predicates: None
Basic cardinality: 11163 (select count(*) from classes or obtained from table
statistics)
COURSES:
Non-join predicates: dev_id = 121688
Basic cardinality: 7 (select count(*) from courses where dev_id = 121688 or
obtained from table and column statistics)
With a total of 40 rows output from the query, CLASSES does not seem like a good candidate
for driving table (this would result in a throwaway of at least 11123 rows, which is
much worse than the total number of rows thrown away in the original query).
This leaves the join orders:
A) LOCATIONS - COURSES - CLASSES
B) COURSES - LOCATIONS - CLASSES
C) COURSES - CLASSES - LOCATIONS
Now, let's analyze the join orders for cardinalities and throwaway:
Join order A): LOCATIONS - COURSES - CLASSES
Cardinalities after join operations
LOCATIONS: 6 (determined under basic cardinalities)
COURSES:
CLASSES:
loc_id = <locations.loc_id>
crs_id = <courses.crs_id>
Cardinality after join: 40 (result of query)
loc_id = <locations.loc_id>
crs_id = <courses.crs_id>
Cardinality after join: 40 (result of query)
A) and B) are similar in the sense that the two leading tables are identical (but reversed) and are
not joined together, resulting in Cartesian products with an identical set of rows (42).
Join order C): COURSES - CLASSES - LOCATIONS
Cardinalities after join operations
COURSES:
7 (determined under basic cardinalities)
CLASSES:
crs_id = <courses.crs_id>
Cardinality:
Needs a test and cannot be determined by analysis, but
must be at least 40 as the join predicate to
LOCATIONS references the primary key in
LOCATIONS.
dev_id = 121688
Available indexes: (DEV_ID)
Optimal index: (DEV_ID)
CLASSES:
loc_id = <locations.loc_id>
crs_id = <courses.crs_id>
Available indexes: (CRS_ID), (LOC_ID)
Optimal index: (CRS_ID, LOC_ID)
To reduce primary throwaway the aim is to have fully indexed predicates where beneficial:
As the throwaway caused by the non-optimal index on LOCATIONS (8 rows as calculated
above) is small compared to the total number of rows processed in the statement, it does not
seem worthwhile to add an index on (NAME, CAPACITY).
The optimal index on COURSES is available and no further considerations are needed.
During the initial calculation of throwaway it was determined that the retrieval from CLASSES
using the LOC_ID index (named CLASS_NAME) resulted in 597 rows retrieved, which means
that this index alone will introduce a level of primary throwaway.
It is not easy to determine whether the simple index on CRS_ID will be enough to
eliminate primary throwaway from this last table access.
The first recommendation is therefore to perform a test without the optimal index and then verify
whether an acceptable result can be obtained without building new indexes, which normally is
the best solution as indexes are expensive (maintenance, space).
Steps 4 and 5 then result in the following first suggestion for changes to reduce throwaway to
obtain a better performing query:
New join order: LOCATIONS - COURSES - CLASSES
Join operations: No change (use nested loops)
New indexes: None
Step 6,1 (Identify means to implement suggested changes)
The above join order and usage of join operations can be obtained by rewriting (and hinting) the
query as follows:
select /*+ ORDERED USE_NL(cls crs) */
       cls.start_date, loc.capacity
from   locations loc,
       courses   crs,
       classes   cls
where  loc.name = 'Belmont Shores Ed Center'
and    loc.capacity between 20 and 30
and    loc.loc_id = cls.loc_id
and    cls.crs_id = crs.crs_id
and    crs.dev_id = 121688
Steps 7,1, 8,1 and 9,1 (Implement, verify execution plan and test improvement)
Which gives the result:
Rows   Execution Plan
-----  --------------------------------------------------
   40  NESTED LOOPS
   42    NESTED LOOPS
    6      TABLE ACCESS (BY INDEX ROWID) OF 'LOCATIONS'
   16        INDEX (RANGE SCAN) OF 'LOC_NAME' (NON-UNIQUE)
   42      TABLE ACCESS (BY INDEX ROWID) OF 'COURSES'
   48        INDEX (RANGE SCAN) OF 'CRS_DEV' (NON-UNIQUE)
   40    TABLE ACCESS (BY INDEX ROWID) OF 'CLASSES'
   82      AND-EQUAL
  383        INDEX (RANGE SCAN) OF 'CLASS_CRS' (NON-UNIQUE)
  363        INDEX (RANGE SCAN) OF 'CLASS_LOC' (NON-UNIQUE)
This is a much better result than the original, but there is still a significant amount of throwaway.
Therefore a new iteration is executed (described somewhat more compactly):
Step 1,2 and 2,2 (Calculate significant throwaway):
There is no throwaway to speak of (actually no throwaway exists) in the two join operations.
The only operation with throwaway is the AND-EQUAL operation on the two index range scan
operations on the CLASS_CRS and CLASS_LOC indexes, where in total 746 rows (383 + 363) are
input to the AND-EQUAL operation, but only 82 are output.
Step 3,2 (Map row sources to statement)
Row sources:
Row sources:
TABLE ACCESS (BY INDEX ROWID) OF 'EMPLOYEES'
INDEX (UNIQUE SCAN) OF 'EMP_PK'
Corresponds to table EMPLOYEES in the from clause with (join) predicates:
emp_id = <courses.dev_id>
mgr_id = <classes.instr_id>
Step 4,1 (Suggest changes in execution plan to reduce secondary throwaway)
To cure the secondary throwaway the join operation joining the EMPLOYEES table
(introducing the primary throwaway) should be moved to an earlier evaluation in the join order.
This would result in one of the following four join orders:
A) EMPLOYEES - CLASSES - COURSES
B) EMPLOYEES - COURSES - CLASSES
C) CLASSES - EMPLOYEES - COURSES
D) COURSES - EMPLOYEES - CLASSES
To find the optimal join order, the predicates and cardinalities for each table in the above join orders
are analyzed.
As there are several candidate join orders, it is useful to discard join orders where the
cardinality of the driving table is significantly larger than the number of rows from the total
query.
To do this, the basic cardinalities (cardinality after applying non-join predicates, if any) of the
three involved tables can be determined as:
EMPLOYEES:
Non-join predicates: None
Basic cardinality: 15132 (select count(*) .. or table statistics)
CLASSES:
Non-join predicates: None
Basic cardinality: 11163 (select count(*) .. or table statistics)
COURSES:
Non-join predicates: None
Basic cardinality: 1018 (select count(*) .. or table statistics)
With a total of 1 row output from the query, the tables EMPLOYEES and CLASSES seem like
the worst possible choices for driving table.
This leaves the join order:
D) COURSES - EMPLOYEES - CLASSES
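The elimination of candidate join orders can be sketched as follows. This is a simplification under a stated assumption: instead of defining what "significantly larger" means, it simply keeps the join orders whose driving table has the smallest basic cardinality among the candidates. The cardinalities and join orders are those listed above; the function name is hypothetical.

```python
# Basic cardinalities and candidate join orders as listed in the text.
basic_cardinality = {"EMPLOYEES": 15132, "CLASSES": 11163, "COURSES": 1018}
join_orders = {
    "A": ["EMPLOYEES", "CLASSES", "COURSES"],
    "B": ["EMPLOYEES", "COURSES", "CLASSES"],
    "C": ["CLASSES", "EMPLOYEES", "COURSES"],
    "D": ["COURSES", "EMPLOYEES", "CLASSES"],
}

def surviving_join_orders(orders, cardinality):
    """Keep the join orders whose driving (first) table has the smallest cardinality.

    Assumption: "smallest among the candidates" stands in for the paper's
    "not significantly larger than the number of rows from the query".
    """
    smallest = min(cardinality[tables[0]] for tables in orders.values())
    return [name for name, tables in orders.items()
            if cardinality[tables[0]] == smallest]

survivors = surviving_join_orders(join_orders, basic_cardinality)
print(survivors)  # ['D']
```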
It would now be tempting to continue to the next step (reducing primary throwaway), but for
completeness let's analyze this join order for cardinalities and throwaway:
Join order D): COURSES - EMPLOYEES - CLASSES
Cardinalities after join operations:
COURSES: 1018 (determined under basic cardinalities)
When only looking at throwaway from the tables, this join order will result in a total of 2037
retrieved rows (1018 + 1018 + 1), which compares favorably to the original query's
22327 rows (11163 + 11163 + 1).
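The comparison of retrieved rows can be checked with a few lines of arithmetic. The per-table row counts are the ones given in the text; the variable names are illustrative only.

```python
# Rows retrieved per table, as given in the text for each join order.
proposed_rows = [1018, 1018, 1]    # join order D: COURSES - EMPLOYEES - CLASSES
original_rows = [11163, 11163, 1]  # join order of the original query

proposed_total = sum(proposed_rows)
original_total = sum(original_rows)
rows_saved = original_total - proposed_total

print(proposed_total)  # 2037
print(original_total)  # 22327
print(rows_saved)      # 20290
```

The proposed join order retrieves less than a tenth of the rows the original query does.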
Step 5,1 (Reduce primary throwaway)
The chosen join order (COURSES - EMPLOYEES - CLASSES) will result in the following
predicates (as described above):
COURSES:
None
CLASSES:
instr_id = <employees.mgr_id>
crs_id = <courses.crs_id>
Available indexes: (CRS_ID)
Optimal index: (INSTR_ID, CRS_ID)
To reduce primary throwaway the aim is to have fully indexed predicates where beneficial:
The optimal index on EMPLOYEES is available and no further considerations are needed.
Regarding indexes on CLASSES, it must be assumed that the available index is insufficient.
This can be argued as follows: as all courses are retrieved, all 11163 classes must also be
retrieved using the available index - resulting in having to throw away 11162 rows to return the
one correct row from the query.
This assumption, however, relies on still using nested loops join operations and having 1018
lookups via an index resulting in only one row found.
This can very well be more resource consuming than executing a hash join operation between
the 1018 rows from joining COURSES and EMPLOYEES and the 11163 rows from CLASSES
(without any indexes on CLASSES).
So, again with the wish to avoid unnecessary index creation, it is recommended to test this latter
approach before creating an index on CLASSES(INSTR_ID, CRS_ID).
Steps 4 and 5 then result in the following first suggestion for changes to reduce throwaway and
obtain a better performing query:
Execution Plan
------------------------------------------------
SELECT STATEMENT   GOAL: FIRST_ROWS
  TABLE ACCESS   GOAL: ANALYZED (BY INDEX ROWID) OF 'EMPLOYEES'
    BITMAP CONVERSION (TO ROWIDS)
      BITMAP INDEX (FULL SCAN) OF 'EMP_JOB'
The first partial execution plan is generated directly from the STAT lines in the trace file thus
showing the actual execution plan when the statement was executed (having no name for the
index as this index does not exist anymore).
The second execution plan shows what the optimizer would have chosen with the current state
of the database and the row counts should not be taken seriously.
You run on Oracle 8.0.4
On several platforms there was a bug in this release where the STAT lines were written with the
wrong number for the corresponding cursor (#0 instead of #1). This only happened if you
traced from an SQL*Plus session.
If you cannot upgrade to a later version, the problem can be corrected by editing the trace file
and doing a global change of STAT #0 to STAT #1.
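The global change can also be scripted instead of done in an editor. The following is a minimal sketch, assuming the whole trace file fits in memory and that cursor #1 is the only cursor affected, as described above. The function name and the abbreviated sample lines are illustrative and not taken from a real trace file.

```python
import re

# Match "STAT #0" only at the start of a line and only when followed by a space,
# so that other cursor numbers (for example "STAT #01") are left untouched.
_STAT_CURSOR_0 = re.compile(r"^STAT #0(?= )", re.MULTILINE)

def fix_stat_cursor(trace_text):
    """Rewrite 'STAT #0' line prefixes to 'STAT #1' in the text of a trace file."""
    return _STAT_CURSOR_0.sub("STAT #1", trace_text)

# Abbreviated, illustrative trace lines (hypothetical content).
sample = (
    "FETCH #1 ...\n"
    "STAT #0 id=1 cnt=2386 ...\n"
    "STAT #0 id=2 cnt=2387 ...\n"
)
fixed = fix_stat_cursor(sample)
print(fixed)
```

To apply it to a file, read the trace file into a string, pass it through `fix_stat_cursor`, and write the result to a new file so the original trace is preserved before running tkprof on the corrected copy.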
In this ranking it should be noted that the focus is rows - not blocks (as more work is required to
handle a block during a full table scan than, for example, to handle a block during a rowid
retrieval).
Ideally this ranking should be included in the determination of how significant a throwaway is.
This would require a reasonably precise quantification of the above ranking (for example: getting a
row from an index is 3.5 times more expensive than retrieving a row from a table via a rowid).
For the time being no such quantification of this ranking has been made, and it is
doubtful whether it can be done at all without adding too much complexity.
Seen from a pragmatic point of view, however, experience has so far shown that it is a sufficiently
precise assumption that all rows require the same amount of work, regardless of which row source
is handling the row.
This, however, makes the prediction of the expected result of the query tuning imprecise.
11.3 PL/SQL functions called from row sources
The above pragmatic view of rows representing more or less the same amount of work should not
be adopted for row sources calling PL/SQL functions.
As the task switch between the SQL and the PL/SQL engines alone is rather expensive,
reducing the number of rows from row sources calling PL/SQL functions can reduce the CPU
consumption significantly - even if the result of this change is more rows thrown away from
normal row sources.
11.4 Optimizer mode
It should be noted that this paper does not deal with the optimizer.
The optimizer (regardless of whether it is the RBO or the CBO) just generates the execution plan
(with the small twist that the CBO has some more row sources to choose from): once the execution
plan has been determined, it does not matter for performance which optimizer mode was
used.
The idea of throwaway is purely a view on the result of applying the execution plan to the
involved database objects, and the goal of tuning by eliminating throwaway is achieved by
finding a better execution plan. In this, the choice of optimizer just becomes a means to generate
the wanted execution plan.
11.5 Accuracy of reported number of rows from row sources
During the preparation of this paper it was several times experienced that the numbers reported
for row source counters in the trace files were not always precise - in the sense that the rules for
how the counters are updated might change for a row source depending on where in the
execution plan the row source is used.
Some good examples of this are:
1. There seems to be a consistent extra row reported somewhere in the execution plan
above a full table scan (not always on the full table scan itself) when using Oracle8i.
2. The number of rows reported for hash joins in Oracle 7.3 and 8.0 resembles (!) the actual
number of rows from the hash join, but is higher, and the difference changes according to how
the tables are ordered in the hash join operation.
3. Using Oracle8i, the number of rows reported for a table access (by rowid) could be twice as
high as physically possible if the table access was used as the inner table of a nested loops
join - but not always, and this even happens if other parts of the execution plan change.
This can make the precise quantification of throwaway impossible, but in all cases seen during
the preparation of this paper and in real life it does not prevent the identification of where in a
query the work is wasted by throwaway.
In many cases an evaluation of the query can identify these discrepancies, and in some cases
even a determination of the correct numbers can be done.