Buffer Pool
Updating Statistics
Simple negligence is a common cause of performance problems in many DB2 subsystems. DB2 needs to have accurate database object statistics in order to create efficient strategies for data
access. Indeed, at IDUG in Prague last year an IBM speaker said,
"Of customer problems with bad access paths, around half of
them were because of bad or missing statistics." That is quite a
telling indication of the reason many performance problems occur.
Put quite simply, it is our fault.
To avoid such problems make sure you run a RUNSTATS
utility on a regular basis. As the volume and nature of data in your
databases change, DB2 must be made aware of the changes or
performance will suffer. The RUNSTATS utility collects statistical
information for DB2 tables, tablespaces, partitions, indexes and
columns. When it populates this information into the DB2 Catalog the statistics are available for subsequent use by the DB2 optimizer in formulating efficient access paths. Of course, the statistics
in these tables also can be used by DBAs to help determine when
to reorganize. Without up-to-date statistics, both DB2 and the
DBAs managing DB2 are at a disadvantage.
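As a rough illustration of why stale statistics matter, the sketch below uses SQLite (through Python's standard sqlite3 module) as a stand-in for DB2. That substitution is an assumption made only so the example is self-contained. SQLite's ANALYZE command plays a role loosely analogous to RUNSTATS, and its sqlite_stat1 table is loosely analogous to the statistics held in the DB2 Catalog. The table, index, and function names are hypothetical.

```python
import sqlite3


def add_orders(conn, row_count):
    """Insert row_count hypothetical order rows spread over 50 customers."""
    conn.executemany(
        "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
        [(i % 50, i) for i in range(row_count)],
    )
    conn.commit()


def build_orders_db(row_count):
    """Create an in-memory ORDERS table (hypothetical) with one index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER)"
    )
    conn.execute("CREATE INDEX ix_orders_customer ON orders (customer_id)")
    add_orders(conn, row_count)
    return conn


def refresh_statistics(conn):
    """SQLite stand-in for RUNSTATS: gather statistics for the optimizer."""
    conn.execute("ANALYZE")
    conn.commit()


def recorded_row_count(conn):
    """Row count the optimizer currently believes the index holds.

    In sqlite_stat1, the first number in the 'stat' column is the number
    of rows in the index as of the last ANALYZE.
    """
    (stat,) = conn.execute(
        "SELECT stat FROM sqlite_stat1 WHERE idx = 'ix_orders_customer'"
    ).fetchone()
    return int(stat.split()[0])


if __name__ == "__main__":
    conn = build_orders_db(1000)
    refresh_statistics(conn)
    print("after first refresh:", recorded_row_count(conn))
    add_orders(conn, 4000)  # the data changes...
    print("before second refresh:", recorded_row_count(conn))  # ...but the statistics do not
    refresh_statistics(conn)
    print("after second refresh:", recorded_row_count(conn))
```

The point of the sketch is the middle step: after the data grows, the recorded statistics still describe the old table until they are gathered again.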
So just how frequently should RUNSTATS be run? Of
course, the answer is "it depends." But here is what it depends
upon: how frequently the data changes, how large the object is,
and the data access activity. The cost of RUNSTATS usually is
negligible for small- to medium-size table spaces. Of course, RUNSTATS will take longer to execute for larger objects, so plan wisely before executing RUNSTATS for very large tablespaces and
indexes. You cannot avoid running RUNSTATS for larger objects
because the statistics are perhaps even more important the larger
the object becomes. If the data in a large object changes slowly you
can probably run RUNSTATS once, and then delay running it
again for a long time (until the data changes significantly). But for
very volatile data, be sure to execute the RUNSTATS utility at
least monthly. You should consider sampling to reduce the runtime duration for larger objects.
You can run RUNSTATS with SHRLEVEL CHANGE to
accumulate statistics without limiting concurrent activity to an
object. Of course, the statistics will be more accurate if SHRLEVEL REFERENCE is used, but in today's 24/7 world such a luxury is not usually possible. At any rate, it is wise to run RUNSTATS
during periods of low activity to reduce the impact of concurrent access on both the applications and the statistics-gathering
process.
After running RUNSTATS the newly updated statistics are
available for use. Of course, if you do not REBIND your applications your access paths for static SQL will not change. Dynamic
SQL can take advantage of the new statistics immediately. And
you can examine the new statistics to determine whether your
objects need to be reorganized. This is the next simple thing you
need to keep under control.
Coding Appropriately
Another pervasive problem permeating DB2 systems is the
"flat file" development mentality. By this I mean a
programmer trying to access DB2 data the same way that he would
access data from a flat file. DB2 is relational in nature and, as
such, operates on data a set at a time, instead of the record-at-a-time
processing used against flat files. In order to do justice to DB2, you
need to change the way you think about accessing data.
To accomplish this, all users of DB2 need at least an overview
education in relational database theory and a moderate to extensive amount of training in SQL. Without such a commitment
your programmers are sure to develop ugly and inefficient database access code, and who can blame them? Programmers are
used to working with files, so they are just doing what comes naturally to them.
SQL is designed so that programmers specify what data is
needed but they cannot specify how to retrieve it. SQL is coded
without embedded data-navigational instructions. The DBMS
analyzes SQL and formulates data-navigational instructions
behind the scenes. This is foreign to the programmer who has
never accessed data using SQL.
Every SQL manipulation statement operates on a table and
results in another table. All operations native to SQL, therefore,
are performed at a set level. One retrieval statement can return
multiple rows; one modification statement can modify multiple
rows. This feature of relational databases is called relational
closure.
When accessing data, a programmer needs to think about
what the end result should be and then code everything possible
into the SQL. This means using the native features of SQL (joins,
subselects, functions, and so on) instead of coding procedural
COBOL or Java and processing tables like files.
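The contrast can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses Python's standard sqlite3 module with an in-memory SQLite database rather than DB2, and the CUSTOMER and ORDERS tables, their columns, and the function names are all hypothetical. The first function mimics the flat-file habit of reading one "file" and probing the other record by record; the second pushes the join and the aggregation into a single SQL statement.

```python
import sqlite3


def build_sales_db():
    """Create two small hypothetical tables in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer (
            customer_id INTEGER PRIMARY KEY,
            region      TEXT NOT NULL
        );
        CREATE TABLE orders (
            order_id    INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL,
            amount      INTEGER NOT NULL
        );
        INSERT INTO customer VALUES (1, 'EAST'), (2, 'WEST'), (3, 'EAST');
        INSERT INTO orders VALUES (10, 1, 100), (11, 1, 50), (12, 2, 75), (13, 3, 25);
        """
    )
    return conn


def totals_record_at_a_time(conn):
    """Flat-file style: one query per customer, totals kept in the program."""
    totals = {}
    customers = conn.execute("SELECT customer_id, region FROM customer").fetchall()
    for customer_id, region in customers:
        order_rows = conn.execute(
            "SELECT amount FROM orders WHERE customer_id = ?", (customer_id,)
        )
        for (amount,) in order_rows:
            totals[region] = totals.get(region, 0) + amount
    return totals


def totals_set_at_a_time(conn):
    """Relational style: one statement; the DBMS does the join and the sum."""
    rows = conn.execute(
        """
        SELECT c.region, SUM(o.amount)
        FROM customer c
        JOIN orders o ON o.customer_id = c.customer_id
        GROUP BY c.region
        """
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    conn = build_sales_db()
    print(totals_record_at_a_time(conn))
    print(totals_set_at_a_time(conn))
```

Both functions return the same totals, but the first issues one query per customer and does the work in application code, while the second states the desired result once and leaves the navigation to the DBMS.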
Educating programmers how to use SQL properly is probably
the single most important thing you can do to optimize performance of your DB2 applications.
your tables. Building indexes to support predicates of the most frequently executed queries and most important queries should be
your first indexing step after building the unique indexes required
to support primary keys and unique constraints.
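To see what "an index that supports a predicate" looks like in practice, here is a minimal sketch. It assumes SQLite (via Python's standard sqlite3 module) as a self-contained stand-in for DB2 and uses SQLite's EXPLAIN QUERY PLAN, which is only loosely comparable to DB2's EXPLAIN. The ORDERS table, the index name, the query, and the function names are hypothetical.

```python
import sqlite3

# Hypothetical "frequently executed query" with an equality predicate.
FREQUENT_QUERY = "SELECT order_id, amount FROM orders WHERE customer_id = ?"


def build_indexing_db():
    """Create a hypothetical ORDERS table with no index on customer_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER)"
    )
    return conn


def create_predicate_index(conn):
    """Add an index on the column used in the query's WHERE clause."""
    conn.execute("CREATE INDEX ix_orders_customer ON orders (customer_id)")


def plan_for(conn, sql, params=()):
    """Return the descriptive text of each step in SQLite's query plan.

    The fourth column of EXPLAIN QUERY PLAN output holds the description.
    """
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]


if __name__ == "__main__":
    conn = build_indexing_db()
    print("without index:", plan_for(conn, FREQUENT_QUERY, (7,)))
    create_predicate_index(conn)
    print("with index:   ", plan_for(conn, FREQUENT_QUERY, (7,)))
```

Before the index exists the plan describes a scan of the whole table; afterward it names ix_orders_customer. The same before-and-after comparison is a reasonable habit whenever an index is added or dropped.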
Of course, this requires knowledge of how your tables will be
accessed. And when you are first creating tables, you will not have
any SQL. Sometimes you may have vague pseudo-code descriptions of potential queries, but you won't have an accurate picture
of access. Therefore, indexing has to be an incremental task, performed on an ongoing basis as code is written against your databases.
As you continually monitor and build new indexes, be sure to
review the old ones that were created. Remember, although indexes can improve SELECT access, they will degrade the performance
of INSERTs and DELETEs (as well as any UPDATEs of indexed
columns). So, be sure to drop those unused indexes.
You should also take care to learn your applications and the
type of indexes that can best serve them. Of course, this coverage
of indexing techniques has been necessarily high-level. Skillful,
performance-oriented indexing is a skill that can take time to master. So, maybe I was a bit rash in calling it a simple thing.
Summary
Be aware that this article has offered a simplified list of things
to focus on, but that is the point. Yes, DB2 management is a complex, arduous task. But by paying attention to the basics and making sure you do not take shortcuts around necessary processes, the
complex things can be addressed more easily, because you can be
sure that the simple things have been handled appropriately.
Yes, I know, there always seems to be time to do it over, but
never time to do it right. By paying attention to the details and
making sure that the small things don't become big problems, the
time needed to do it over should diminish. And then you'll be
able to spend more time on the big issues when they inevitably
arise.
About the Author
Craig S. Mullins is president and principal consultant with
Mullins Consulting, Inc. Craig is an IBM Gold Consultant, the
author of two books, DB2 Developer's Guide and Database
Administration: Practices and Procedures, and can be reached via
his Web site at www.CraigSMullins.com.
www.idug.org