Tuning SQL Queries

Version : 0.1
Date : 21.05.2010
Status : Draft
Author : Midhun M
Pages : 14

For internal use only


History
Version Date Author Remark
0.1 21.05.2010 Midhun M Draft

References
1. Oracle® Database Performance Tuning Guide 10g Release 2 (10.2)
2. http://www.orafaq.com/wiki/Oracle_database_Performance_Tuning_FAQ
3. http://www.akadia.com/services/ora_interpreting_explain_plan.html

Table of Contents

1 Introduction
2 Objective
3 Basic information about query tuning
  3.1 How does Oracle access data?
  3.2 Full table scan
  3.3 Index lookup
    3.3.1 Index Unique Scans
    3.3.2 Index Range Scans
    3.3.3 Index Full Scans
    3.3.4 Index Fast Full Scans
  3.4 EXPLAIN PLAN Usage
  3.5 SQL Hints
    3.5.1 Why use hints
  3.6 Managing Statistics for Optimal Query Performance
    3.6.1 How to Generate Oracle Statistics
4 Case Study
  4.1 About the view and base tables
    4.1.1 View: VW_VSP_NONZEBRA_REMAP
    4.1.2 Table: CHECKSHOT_SURVEY_
    4.1.3 Table: TIME_DEPTH_PAIR_
    4.1.4 Table: BOREHOLE_
  4.2 Execution plan before the modifications made
  4.3 Modifications made
  4.4 Execution plan after modifications
5 Best practices

1 Introduction
This document covers the basics of SQL query tuning and illustrates them with an example.
Information management systems spend much of their time retrieving and persisting data
through queries, so query performance matters: a poorly tuned query slows down the
system and makes data retrieval take a long time. While analyzing a view that was taking
a long time to return data, I found that the cause was poorly written SQL that needed
tuning. This document addresses tuning basics, their impact, some recommendations
and best practices.

2 Objective
• To provide basic information about query tuning.
• To provide some tips and information on query tuning with an example based on
VW_VSP_NONZEBRA_REMAP view.

3 Basic information about query tuning


3.1 How does Oracle access data?

At the physical level Oracle reads data in blocks. The smallest amount of data read is a
single Oracle block; the largest is constrained by operating system limits and by
multiblock I/O settings. Logically, Oracle locates the data to read using one of the
following methods:

• Full Table Scan (FTS)
• Index Lookup (unique & non-unique)
• Rowid

3.2 Full table scan

A full table scan happens when Oracle has to scan the entire table to return the result of
a query. Oracle reads all of the table's rows, block by block, until the high-water mark for
the table is reached. As a general rule, full table scans should be avoided unless the
SQL query requires a majority of the rows in the table.

The following conditions typically lead Oracle to choose a full table scan:
• No indexes exist for the table
• The query does not contain a WHERE clause
• An indexed column is placed inside a built-in function (BIF), which prevents a regular
index on that column from being used for the lookup
• The query uses the LIKE operator with a pattern that begins with '%'
• With the cost-based optimizer, the table contains a small number of rows
• optimizer_mode=all_rows is set in the initialization file; this mode optimizes for
throughput and so favors full scans more than first_rows does
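
The function and LIKE conditions in the list above can be illustrated with a small sketch. It assumes the SCOTT.EMP demo table used later in this document; the index name emp_ename_ix is hypothetical and was not part of the original material. Whether Oracle actually picks the index still depends on statistics and table size.

```sql
-- Hypothetical index on ENAME, created only for this illustration.
CREATE INDEX emp_ename_ix ON emp (ename);

-- Indexed column wrapped in a function: the regular index cannot be
-- used for a unique or range scan on the search value.
SELECT empno, ename FROM emp WHERE UPPER(ename) = 'SMITH';

-- Same column compared directly: the index is a candidate access path.
SELECT empno, ename FROM emp WHERE ename = 'SMITH';

-- Leading wildcard: the index cannot be range-scanned on the pattern.
SELECT empno, ename FROM emp WHERE ename LIKE '%ITH';

-- Trailing wildcard only: an index range scan is possible.
SELECT empno, ename FROM emp WHERE ename LIKE 'SMI%';
```

Comparing the EXPLAIN PLAN output of each pair is a quick way to confirm which form the optimizer can serve from the index.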

3.3 Index lookup

In this method, a row is retrieved by traversing the index, using the indexed column
values specified by the statement. An index scan retrieves data from an index based on
the value of one or more columns in the index. To perform an index scan, Oracle
searches the index for the indexed column values accessed by the statement. If the
statement accesses only columns of the index, then Oracle reads the indexed column
values directly from the index, rather than from the table.

There are four methods of index lookup:

• index unique scan
• index range scan
• index full scan
• index fast full scan

3.3.1 Index Unique Scans


This scan returns, at most, a single rowid. Oracle performs a unique scan if a
statement contains a UNIQUE or a PRIMARY KEY constraint that guarantees
that only a single row is accessed.
This access path is used when all columns of a unique (B-tree) index or an index
created as a result of a primary key constraint are specified with equality
conditions.

3.3.2 Index Range Scans


An index range scan is a common operation for accessing selective data. It can
be bounded (bounded on both sides) or unbounded (on one or both sides). Data
is returned in the ascending order of index columns. Multiple rows with identical
values are sorted in ascending order by rowid.
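
As a rough illustration of the two access paths described so far, the queries below assume the SCOTT demo schema, where EMPNO is the primary key of EMP. The literal values are arbitrary, and on a table this small the optimizer may still prefer a full table scan.

```sql
-- Equality on every column of the unique (primary key) index:
-- candidate for an INDEX UNIQUE SCAN, returning at most one rowid.
SELECT ename, sal FROM emp WHERE empno = 7369;

-- Range bounded on both sides: candidate for an INDEX RANGE SCAN.
SELECT ename, sal FROM emp WHERE empno BETWEEN 7300 AND 7600;

-- Range unbounded on one side: also an INDEX RANGE SCAN candidate.
SELECT ename, sal FROM emp WHERE empno > 7900;
```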

3.3.3 Index Full Scans


In certain circumstances the whole index is scanned rather than a range of it
(for example, when no constraining predicates are provided for a table). Full
index scans are only available with the CBO, because without statistics Oracle
cannot determine whether a full scan would be a good idea. Oracle chooses an
index full scan when statistics indicate that it will be more efficient than a
full table scan followed by a sort.

3.3.4 Index Fast Full Scans


Fast full index scans are an alternative to a full table scan when the index
contains all the columns that are needed for the query, and at least one column
in the index key has the NOT NULL constraint. A fast full scan accesses the data
in the index itself, without accessing the table.
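
The difference between the two whole-index scans can be sketched as follows. The example assumes SCOTT.EMP with a primary key index on EMPNO (a NOT NULL column); the index name pk_emp is an assumption, so substitute the real name if it differs.

```sql
-- The index already holds EMPNO in sorted order, so an INDEX FULL SCAN
-- can return the rows in order without a separate sort step.
SELECT empno FROM emp ORDER BY empno;

-- Only the indexed column is needed and order does not matter, so the
-- index can be read with multiblock I/O: INDEX FAST FULL SCAN candidate.
-- The INDEX_FFS hint asks for that access path explicitly.
SELECT /*+ INDEX_FFS(e pk_emp) */ COUNT(*) FROM emp e;
```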

3.4 EXPLAIN PLAN Usage

When a SQL statement is passed to the server, the Cost Based Optimizer (CBO) uses
database statistics to create an execution plan, which it uses to navigate through the
data. Once you have identified a problem query, the first thing to do is EXPLAIN the
statement and check the execution plan that the CBO has created. This will often
reveal that the query is not using the relevant indexes, or that indexes to support the
query are missing. Detailed interpretation of the execution plan is beyond the scope of
this document.

The EXPLAIN PLAN method doesn't require the query to be run, greatly reducing the
time it takes to get an execution plan for long-running queries.

First the query must be explained:

SQL> EXPLAIN PLAN FOR
  2  SELECT *
  3  FROM emp e, dept d
  4  WHERE e.deptno = d.deptno
  5  AND e.ename = 'SMITH';

Explained.

The EXPLAIN PLAN statement stores its output in the PLAN_TABLE. The execution plan
is then displayed using:

SQL> SELECT PLAN_TABLE_OUTPUT FROM TABLE(DBMS_XPLAN.DISPLAY());
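
When several plans end up in the PLAN_TABLE, it can help to tag each one. The sketch below reuses the query from the example above; the statement identifier 'emp_dept_q1' is a made-up label, and the three arguments to DBMS_XPLAN.DISPLAY are the plan table name, the statement identifier and the level of detail.

```sql
-- Tag the plan so it can be told apart from other rows in PLAN_TABLE.
EXPLAIN PLAN SET STATEMENT_ID = 'emp_dept_q1' FOR
SELECT *
FROM   emp e, dept d
WHERE  e.deptno = d.deptno
AND    e.ename = 'SMITH';

-- Display only that plan; 'TYPICAL' is the default level of detail.
SELECT plan_table_output
FROM   TABLE(DBMS_XPLAN.DISPLAY('PLAN_TABLE', 'emp_dept_q1', 'TYPICAL'));
```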

For more information about EXPLAIN PLAN and how to interpret its output, please
refer to the Oracle® Database Performance Tuning Guide 10g Release 2 (10.2) (reference 1).

3.5 SQL Hints

Hints let you influence the choices made by the optimizer. Using a hint causes Oracle to
use the cost-based optimizer.

The following syntax is used for hints:

select /*+ HINT */ name
from emp
where id = 1;

Where HINT is replaced by the hint text.


When the syntax of the hint text is incorrect, the hint text is ignored and will not be used.

Example:

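SQL> select /*+ index(emp ix_emp) */ * from scott.emp;

This directs Oracle to use the index ix_emp, provided the index exists and can be used
for the query. Note that the hint names the table without its schema.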
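
A common reason for a hint being ignored is a mismatch between the hint and the table alias. The following sketch, again on the SCOTT.EMP demo table, uses the FULL hint; the WHERE clause is only there to make the query realistic.

```sql
-- When the table has an alias, the hint must name the alias.
SELECT /*+ FULL(e) */ e.empno, e.ename
FROM   scott.emp e
WHERE  e.deptno = 10;

-- Here the hint names the table although an alias is in use, so it is
-- ignored without any error message.
SELECT /*+ FULL(emp) */ e.empno, e.ename
FROM   scott.emp e
WHERE  e.deptno = 10;
```

Because an invalid hint fails silently, it is worth checking the execution plan after adding one.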


3.5.1 Why use hints
It is a perfectly valid question to ask why hints should be used at all. Oracle
comes with an optimizer that promises to optimize a query's execution plan, and
when the optimizer is doing a good job no hints should be required.
Sometimes, however, the characteristics of the data in the database change
rapidly, so that the optimizer's statistics are out of date. In this case, a hint
can help.

3.6 Managing Statistics for Optimal Query Performance

The optimizer’s job is to take SQL statements and decide how to get the data that is
being asked for in the SQL statement and how to get it in the quickest way possible.

When a SQL statement is executed, the database must convert the query into an
execution plan and choose the best way to retrieve the data. These execution plans are
computed by the Oracle cost-based SQL optimizer commonly known as the CBO.

The execution plans chosen by the Oracle SQL optimizer are only as good as the
Oracle statistics. To choose the best execution plan for a SQL query, Oracle relies on
information about the tables and indexes in the query. The optimizer uses statistics on
the tables and on the indexes defined on those tables, so it is important to have
statistics on both.

If new objects are created, or the amount of data in the database changes, the statistics
will no longer represent the real state of the database, and the CBO's decision process
may be seriously impaired.

3.6.1 How to Generate Oracle Statistics


Oracle provides the dbms_stats package for generating the statistics it needs.
The two procedures in this package that you will mostly be interested in are
dbms_stats.gather_schema_stats and dbms_stats.gather_table_stats.

SQL> EXEC dbms_stats.gather_schema_stats('SCOTT', cascade=>TRUE);

This command generates statistics on all tables in the SCOTT schema. Since
we included the cascade parameter, statistics are also generated on the
indexes. This is important: you need statistics on indexes as well as on
tables in Oracle.
If you create a new table, or the amount of data in a table changes, it may
not be practical or desirable to regenerate statistics on the entire schema,
especially if the schema is large and the database is busy. Instead, use
dbms_stats.gather_table_stats to generate statistics for a single table,
and optionally for its indexes. Here is an example:

SQL> EXEC dbms_stats.gather_table_stats('SCOTT', 'EMP', cascade=>TRUE);

In this case we are generating statistics for the EMP table in the SCOTT schema.
Again we use the cascade parameter to ensure that all of the indexes get analyzed.
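
To check that the statistics were gathered, the data dictionary can be queried. This is a minimal sketch that assumes you are connected as the schema owner (SCOTT in the examples above); the selected columns are just a convenient subset.

```sql
-- Table-level statistics and the time they were last gathered.
SELECT table_name, num_rows, blocks, last_analyzed
FROM   user_tables
WHERE  table_name = 'EMP';

-- Index-level statistics for the same table (populated by cascade=>TRUE).
SELECT index_name, distinct_keys, last_analyzed
FROM   user_indexes
WHERE  table_name = 'EMP';
```

A NULL or old LAST_ANALYZED value suggests that the statistics are missing or stale.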

4 Case Study
4.1 About the view and base tables

4.1.1 View: VW_VSP_NONZEBRA_REMAP

View Name : VW_VSP_NONZEBRA_REMAP
Number of records : 31326
Base Tables : CHECKSHOT_SURVEY_, TIME_DEPTH_PAIR_ and BOREHOLE_
View creation script :

CREATE OR REPLACE VIEW VW_VSP_NONZEBRA_REMAP
( UBHI,
BOREHOLE_NAME,
ACQUISITION_DATE,
SOURCE,
CORRECTION_VELOCITY,
DATUM_ELEVATION,
INDEX_CORRECTION,
REMARKS,
MD_BRT,
TVD_BRT,
TVD_SS,
TWT_SS )
AS SELECT
hdr.ubhi,
hdr.name,
hdr.acquisition_date,
max(hdr.source) source,
max(hdr.correction_velocity) correction_velocity,
max(hdr.datum_elevation) datum_elevation,
max(hdr.index_correction) index_correction,
max(hdr.remarks) remarks,
tvdbrt.md md_brt,
max(tvdbrt.tvd_brt) tvd_brt,
max(tvdss.tvd_ss) tvd_ss,
max(twt.twt_ss) twt_ss
FROM
(select cs.id,
br.ubhi,
br.name,
cs.acquisition_date,
cs.correction_velocity,
cs.datum_elevation,
cs.index_correction,
cs.remarks,
cs.work_order_number,
br.id borehole_id,
cs.version,
cs.source
from borehole br,
checkshot_survey cs
WHERE

br.id = cs.borehole_id
and cs.name='Checkshot_Hdr-NonZebra_Remap'
order by cs.id,br.ubhi,cs.acquisition_date,cs.correction_velocity,
cs.datum_elevation,cs.index_correction, cs.remarks) hdr,
(select cs.id,
br.ubhi,
cs.acquisition_date,
tdp.depth md,
tdp.travel_time tvd_brt,
br.id borehole_id,
tdp.version
from borehole br,
checkshot_survey cs,
time_depth_pair tdp
where br.id=cs.borehole_id
and cs.name='Checkshot_MD-TVDBRT-NonZebra_Remap'
and cs.id=tdp.checkshot_id
and tdp.name='MeasuredDepth_TVDBRT-NonZebra_Remap'
order by cs.id,br.ubhi,cs.acquisition_date,tdp.depth,tdp.travel_time) tvdbrt,
(select cs.id,
br.ubhi,
cs.acquisition_date,
tdp.depth md,
tdp.travel_time tvd_ss,
br.id borehole_id,
tdp.version
from borehole br,
checkshot_survey cs,
time_depth_pair tdp
where br.id=cs.borehole_id
and cs.name='Checkshot_MD-TVDSS-NonZebra_Remap'
and cs.id=tdp.checkshot_id
and tdp.name='MeasuredDepth_TVDSS-NonZebra_Remap'
order by cs.id,br.ubhi,cs.acquisition_date,tdp.depth,tdp.travel_time) tvdss,
(select cs.id,
br.ubhi,
cs.acquisition_date,
tdp.depth md,
tdp.travel_time twt_ss,
br.id borehole_id,
tdp.version
from borehole br,
checkshot_survey cs,
time_depth_pair tdp
where br.id=cs.borehole_id
and cs.name='Checkshot_MD-TWTSS-NonZebra_Remap'
and cs.id=tdp.checkshot_id
and tdp.name='MeasuredDepth_TWTSS-NonZebra_Remap'
order by cs.id,br.ubhi,cs.acquisition_date,tdp.depth,tdp.travel_time) twt
WHERE hdr.ubhi = tvdbrt.ubhi and
hdr.acquisition_date = tvdbrt.acquisition_date and
hdr.work_order_number=tvdbrt.borehole_id||'-NonZebra_Remap' and
tvdbrt.ubhi=tvdss.ubhi and
tvdbrt.acquisition_date=tvdss.acquisition_date and
tvdbrt.md=tvdss.md and
tvdbrt.version=tvdss.version and
tvdbrt.ubhi=twt.ubhi and
tvdbrt.acquisition_date=twt.acquisition_date and

tvdbrt.md=twt.md and
tvdbrt.version=twt.version
GROUP BY hdr.ubhi,
hdr.name,
hdr.acquisition_date,
tvdbrt.md,
tvdbrt.version
ORDER BY hdr.ubhi,
hdr.name,
hdr.acquisition_date,
tvdbrt.md;

4.1.2 Table: CHECKSHOT_SURVEY_


Number of rows: 24

4.1.3 Table: TIME_DEPTH_PAIR_


Number of rows: 95145
INDEXES:
• TIME_DEPTH_PAIR_FI01 - Non-unique index on CHECKSHOT_ID
• TIME_DEPTH_PAIR_FI02 - Non-unique index on SEISMIC_ENERGY_SOURCE_ID
• TIME_DEPTH_PAIR_PK - Unique index on ID

4.1.4 Table: BOREHOLE_


Number of rows: 9

4.2 Execution plan before the modifications made

A SQL statement can be executed in many different ways, such as full table scans,
index scans, nested loops, and hash joins. The query optimizer determines the most
efficient way to execute a SQL statement after considering many factors related to the
objects referenced and the conditions specified in the query. This determination is an
important step in the processing of any SQL statement and can greatly affect execution
time.

An execution plan is essentially a step-by-step set of instructions for how the statement
is executed: the order in which tables are read, whether indexes are used, which join
methods are used to join tables, and so on.

SQL> select count(*) from VW_VSP_NONZEBRA_REMAP;

COUNT(*)
----------
31326

Elapsed: 00:56:09.48

Note: The query takes approximately 1 hour (56 minutes) to execute.

[Figure: execution plan for select count(*) from VW_VSP_NONZEBRA_REMAP;]

The execution plan shows that each of the three select queries in the view creation
script that read TIME_DEPTH_PAIR_ uses an INDEX RANGE SCAN
(TIME_DEPTH_PAIR_FI01), and that each of them returns 13431 rows.

• Since TIME_DEPTH_PAIR_ is the biggest table involved in the view, we should
focus on this table.
• An index range scan that returns 500 rows is hardly cause for alarm, but one
that returns many more rows, as here, is a problem.
• TIME_DEPTH_PAIR_ is a driving table in the view creation statement. If it
returns a lot of rows, this has a negative impact on all subsequent
operations.
• Full table scans are cheaper than index range scans when accessing a large
fraction of the blocks in a table. This is because full table scans can use larger
I/O calls, and making fewer large I/O calls is cheaper than making many smaller
calls.

Since we are accessing a large number of rows, a full table scan on TIME_DEPTH_PAIR_
is a better choice here than an INDEX RANGE SCAN.

This can be achieved by either:

• Using SQL hints (to force the query to choose a full table scan for the specified
table), or
• Dropping the TIME_DEPTH_PAIR_FI01 index, if it is not required.
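
The hint-based option could look roughly like the sketch below, which takes one of the three inline views from the view creation script and adds a FULL hint on the tdp alias. It is an illustration only: the case study in this document dropped the index instead, and the sketch assumes that time_depth_pair resolves directly to the TIME_DEPTH_PAIR_ table (for example through a synonym).

```sql
-- Inline view "tvdbrt" from the view script with a FULL hint added.
-- The hint names the alias tdp, not the table.
select /*+ FULL(tdp) */
       cs.id,
       br.ubhi,
       cs.acquisition_date,
       tdp.depth md,
       tdp.travel_time tvd_brt,
       br.id borehole_id,
       tdp.version
from   borehole br,
       checkshot_survey cs,
       time_depth_pair tdp
where  br.id = cs.borehole_id
and    cs.name = 'Checkshot_MD-TVDBRT-NonZebra_Remap'
and    cs.id = tdp.checkshot_id
and    tdp.name = 'MeasuredDepth_TVDBRT-NonZebra_Remap';
```

A hint keeps the index available for other queries, at the cost of having to maintain the hint in every statement that needs it.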

4.3 Modifications made


Optionally, create a composite index on CHECKSHOT_SURVEY_:

SQL> create index idx_test_cs on checkshot_survey_(borehole_id,name);

Index created.

SQL> drop index TIME_DEPTH_PAIR_FI01;

Index dropped.

It is recommended to gather statistics for all the base tables involved. These statistics
are used by the query optimizer to choose the best execution plan for each SQL
statement.

SQL> EXEC dbms_stats.gather_table_stats('PROJECT_RAVVA', 'CHECKSHOT_SURVEY_', cascade=>TRUE);

PL/SQL procedure successfully completed.

SQL> EXEC dbms_stats.gather_table_stats('PROJECT_RAVVA', 'TIME_DEPTH_PAIR_', cascade=>TRUE);

PL/SQL procedure successfully completed.

SQL> EXEC dbms_stats.gather_table_stats('PROJECT_RAVVA', 'BOREHOLE_', cascade=>TRUE);

PL/SQL procedure successfully completed.

4.4 Execution plan after modifications


SQL> select count(*) from VW_VSP_NONZEBRA_REMAP;

COUNT(*)
----------
31326

Elapsed: 00:00:04.51

• The execution plan shows that the query optimizer now chooses a full table
scan on TIME_DEPTH_PAIR_ rather than an INDEX RANGE SCAN.
• The number of rows returned from the driving table (TIME_DEPTH_PAIR_) is
around 310.

Result: The time taken for the select query has been reduced significantly (from
approximately 1 hour to under 5 seconds).

5 Best practices
• Create indexes on columns used by queries that frequently retrieve less than 15%
of a table's rows, or on columns used in joins, to improve join performance.
• Columns recommended for indexing are those whose values are unique or have
few duplicates and span a wide range of values (suitable for a regular index), or
those with a small range of values (suitable for bitmap indexes).
• The database can use indexes more effectively when statistics have been
gathered for the tables involved in the queries.
• Drop indexes that are no longer required, for example when an index does not
speed up queries or the queries in the application do not use it.
• Gather table statistics when a new table is created or an existing table is altered.
• Gather table statistics when the amount of data in the table changes substantially.
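
Before dropping an index, one way to check whether queries use it is index usage monitoring. The sketch below applies it to TIME_DEPTH_PAIR_FI02 purely as an example; it assumes you are connected as the owner of the index, since V$OBJECT_USAGE in this release lists only the current user's indexes.

```sql
-- Start recording whether the optimizer uses the index.
ALTER INDEX time_depth_pair_fi02 MONITORING USAGE;

-- ... let the normal workload run for a representative period ...

-- USED = 'YES' means the index was used at least once since monitoring began.
SELECT index_name, table_name, monitoring, used
FROM   v$object_usage
WHERE  index_name = 'TIME_DEPTH_PAIR_FI02';

-- Stop monitoring once enough workload has been observed.
ALTER INDEX time_depth_pair_fi02 NOMONITORING USAGE;
```

An index that stays unused over a full business cycle is a reasonable candidate for removal, subject to checking infrequent jobs such as month-end reports.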
