
DB2 9.7 presents you with important new features for diagnosing and solving your SQL problems.

Through a series of practical examples, this session will introduce you to the latest DB2 capabilities available for SQL diagnostics and show you how you can leverage them to solve query problems in your own environment. You will learn how to use the new time spent capabilities to identify that you have an SQL problem on your system, and how to find your most expensive statements using the new package cache table functions. You will also learn how to use the new runtime explain capabilities to perform an explain using the actual compiled sections in the package cache or catalogs. Finally, you will be introduced to the new section actuals feature in DB2 9.7 FP1 that allows you to examine the actual plan cardinalities for your statement executions.

Learn how to leverage the latest DB2 performance monitoring features and time spent metrics to identify that you have an SQL problem
Learn how to pinpoint your most expensive statements using the package cache table functions
Learn about the new runtime explain capabilities and how you can perform an explain from the actual compiled section in the catalogs or the package cache to understand the query plan.
Learn how to leverage the new section actuals explain capabilities to examine the actual cardinalities from the execution of your problem queries
Take away practical examples you can try out in your own environment.

In 9.7 we introduced a new monitoring infrastructure that both reduces the overhead of data collection and provides interfaces with significantly less overhead and impact than the SNAP_ class of table functions (which are effectively fenced wrappers over top of snapshot API calls). The key points to take away on the new infrastructure are that:
Data is pushed up to accumulation levels incrementally (every 10 seconds) during execution rather than collected and maintained at the individual thread / agent level
Monitor queries need only access the metrics directly at the accumulation level rather than performing extensive drilldown across all agents and applications belonging to the object we are reporting on (as is required by snapshot requests)
Unlike snapshot, the table functions allow filtering of data at source, so for focused monitoring on DPF or PureScale we only need to flow back the requested data to the coordinator (the snapshot UDFs required all data to be sent back to the coordinator, where filtering was then performed)

These three perspectives allow you to answer questions along the lines of:
(a) What work is being done by applications or groups of applications, and what is their impact
(b) What activity is occurring on individual data objects like tables, bufferpools, tablespaces
(c) What is the impact of the execution of individual SQL statements, or tracked items like Load

A request comes into an agent and is processed. The agent gathers the metrics as it is running and at regular intervals propagates them upwards to the different aggregation points in the system hierarchy.

We report data at several different aggregation levels. We allow drilldown across connections and the current unit of work. We also now allow monitoring based on WLM objects like workloads and service classes. This gives you a lot of power in terms of monitoring, since you can create user-defined groups of applications (workloads) and monitor the impact they are having on the group of resources they are assigned (service classes).
The XML interfaces are intended to provide an additional flexible format for the monitoring data, allowing us to easily store the whole set of metrics as a group, and also to perform useful transformations on the data, such as translating to relational format using XMLTABLE for analysis, formatting for display in a browser, or use in one or more of the row-based metrics formatters we will show later on.
The XML interfaces also provide extra granularity in terms of the time spent in DB2 9.7.
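As an illustration of the XMLTABLE transformation mentioned above, here is a hedged sketch that pulls two time metrics out of the DETAILS XML returned by MON_GET_SERVICE_SUBCLASS_DETAILS. The element names and namespace follow the documented monitor XML schema, but the exact paths are assumptions and should be verified against your DB2 level.

```sql
-- Sketch: convert two time metrics from the monitor XML into relational
-- columns. The XPath steps (db2_service_subclass/system_metrics/...) are
-- assumptions based on the documented monitor schema.
SELECT T.TOTAL_RQST_TIME, T.TOTAL_WAIT_TIME
FROM TABLE(MON_GET_SERVICE_SUBCLASS_DETAILS(NULL, NULL, -2)) AS D,
     XMLTABLE(XMLNAMESPACES(DEFAULT 'http://www.ibm.com/xmlns/prod/db2/mon'),
              '$doc/db2_service_subclass/system_metrics'
              PASSING D.DETAILS AS "doc"
              COLUMNS TOTAL_RQST_TIME BIGINT PATH 'total_rqst_time',
                      TOTAL_WAIT_TIME BIGINT PATH 'total_wait_time') AS T;
```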

A request comes into an agent and is processed. As the request is processing, the agent increments the appropriate metrics at the different aggregation points in the object hierarchy. Above we show the core hierarchy of data objects involved in query processing. Other data object interfaces report data that is collected and aggregated independently.

The data object interfaces provide aggregate data for various data objects of interest in the system. Index metrics are available for the first time via an SQL interface as of DB2 9.7.
The core set of data objects (table, index, bufferpool, tablespace, container) collect metrics through the common hierarchy shown on the previous slide. The other objects shown in the list above are relatively independent and therefore each have data collected independently of the others.

10

A request comes into an agent and is processed. If the request is related to an activity, then the agent gathers the metrics from the start of activity execution and at regular intervals aggregates them in the activity control block. When the activity completes, those activity execution metrics are propagated to the package cache and aggregated under the specific cached section that was executed (static and dynamic SQL).

11

MON_GET_PKG_CACHE_STMT reports metrics on a per section basis for all SQL in the package cache. Unlike the existing dynamic SQL snapshot, this includes both dynamic as well as static entries. The package cache metrics for a given statement are updated when the statement completes execution.
MON_GET_ACTIVITY_DETAILS is provided to allow the user to obtain metrics for an SQL statement that is currently executing (useful for long running queries in problem determination scenarios).
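The basic usage of MON_GET_PKG_CACHE_STMT can be sketched as follows (all NULL arguments mean no filtering; -2 means all members), listing cached statements by their accumulated execution time:

```sql
-- List dynamic and static statements in the package cache with their
-- aggregate time-spent metrics.
SELECT SECTION_TYPE,
       NUM_EXECUTIONS,
       STMT_EXEC_TIME,
       TOTAL_CPU_TIME,
       SUBSTR(STMT_TEXT, 1, 60) AS STMT_TEXT
FROM TABLE(MON_GET_PKG_CACHE_STMT(NULL, NULL, NULL, -2)) AS T
ORDER BY STMT_EXEC_TIME DESC;
```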

12

13

The time spent metrics represent the sum of the time spent by agent threads processing requests on behalf of client applications. We say they provide a relative breakdown of time spent because the times do not correspond directly to wall clock time (i.e. 10 threads will accumulate 10 seconds worth of time in DB2 for every 1 second of clock time), so the most meaningful way to analyze these is to look at the amount of time spent in a given area relative to the total time spent by all agent threads.
In terms of understanding the basic time metrics, you should imagine any given interval of time to be broken into two pieces: the time spent performing processing, and the time spent waiting for some event to occur (the completion of an I/O request, or the obtaining of a lock, for example).
Each time metric (with a few exceptions) also has a corresponding count, which indicates the number of times the wait or component metric has occurred. This allows the user to perform calculations such as the average time spent per wait, and can tell you whether you are dealing with elephants vs. mosquitoes (i.e. a few large occurrences vs. many small occurrences).
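The time/count pairing makes the elephants-vs.-mosquitoes calculation straightforward. A sketch, assuming the metrics have already been formatted into rows (MY_WAIT_METRICS is a hypothetical view over the output of one of the row-based formatters shown later):

```sql
-- Average time per occurrence distinguishes a few long waits from many
-- short ones. METRIC_NAME / TOTAL_TIME_VALUE / COUNT are the columns
-- produced by the row-based time formatters.
SELECT METRIC_NAME,
       TOTAL_TIME_VALUE,
       COUNT,
       CASE WHEN COUNT > 0 THEN TOTAL_TIME_VALUE / COUNT
            ELSE 0 END AS AVG_TIME_PER_OCCURRENCE
FROM MY_WAIT_METRICS   -- hypothetical view over formatter output
ORDER BY AVG_TIME_PER_OCCURRENCE DESC;
```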

14

This is a representative picture of how a breakdown of wait + component times can be visualized. The times shown here are a subset of the full set of times available. In the above example you will see our total time involves both waits on particular areas (I/Os, lock waits) as well as processing in different areas (query compilation, section processing, commit / rollback, and some unknown time).

15

Here we show the basic hierarchy of times included in the activity dimension. This is a slightly different hierarchy than we would see when examining the times from the system perspective (there are two general hierarchies: one for the system perspective, and one for the activity perspective).
This session will focus on the times within the activity perspective, as these provide a method to look at where the time is spent during actual query execution, excluding one-time setup and other system infrastructure costs such as commit and rollback processing (whereas the system perspective presents the total hierarchy of time spent in the server). The basic activity hierarchy consists of an overall statement execution time, which is then subdivided into a set of waits and processing times incurred by threads performing work for the statement.
One important thing to note is that a given statement or compiled query plan may involve the invocation of an arbitrary number of nested query plans in the case of SQL routines and procedures. In the hierarchy, the stmt_exec_time represents the entire amount of time spent executing the parent statement, including any time spent in nested statements. The breakdown of waits and processing times, however, pertains only to the specific statement being executed and not the nested statements it invokes, as the package cache aggregates metrics on a per section basis (i.e. per executable query plan), and each nested query has a separate executable plan / section. To examine the times within nested statements you need to look at the metrics for those individual statements separately in MON_GET_PKG_CACHE_STMT.

16

The row based formatting functions provide an easy-to-navigate generic format for examining and ranking time spent metrics. These formatters accept as input the XML metrics data returned from any of the monitoring interfaces that produce XML (including event monitors). Three of the formatters produce different views of the time metrics specifically, whereas the fourth is a generic formatter for all the metrics encapsulated in the XML document. The basic usage of these functions can best be illustrated through several examples.
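As a minimal sketch of the pattern, the query below feeds the DETAILS XML from the default user service subclass into the generic time formatter; the function and column names follow the documented 9.7 interfaces:

```sql
-- Break out the full time-spent hierarchy for the default user service
-- subclass using the generic row-based time formatter.
SELECT T.METRIC_NAME,
       T.PARENT_METRIC_NAME,
       T.TOTAL_TIME_VALUE,
       T.COUNT
FROM TABLE(MON_GET_SERVICE_SUBCLASS_DETAILS(
             'SYSDEFAULTUSERCLASS', 'SYSDEFAULTSUBCLASS', -2)) AS D,
     TABLE(MON_FORMAT_XML_TIMES_BY_ROW(D.DETAILS)) AS T
ORDER BY T.TOTAL_TIME_VALUE DESC;
```

The same DETAILS column can instead be passed to MON_FORMAT_XML_WAIT_TIMES_BY_ROW, MON_FORMAT_XML_COMPONENT_TIMES_BY_ROW, or the generic MON_FORMAT_XML_METRICS_BY_ROW.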

17

The example above obtains the full set of wait and processing times for a specific statement. The query itself obtains a listing of all static statements in the package cache, including fetching the related XML details. We then format the XML using the row based formatter, and qualify the statement we are looking for in the WHERE clause.
A few important points to note:
The row based format is generic, so that metrics can easily be sorted and identified without the application programmer needing specific knowledge of which metrics are of interest; this also means no extra code is needed when new metrics are added in the future, as the function schema does not change.
Metrics include both the time spent as well as the number of occurrences of a time interval, allowing users to determine the relative impact of the individual intervals as well as the overall impact on the system
The relationship of metrics to each other in the hierarchy is shown via the parent_metric_name field, allowing the user to understand and query relationships without prior knowledge of the hierarchy. This also allows programmatic discovery of the hierarchy in applications if desired.
The row based formatters in general will work on the XML output produced by any of the new 9.7 monitoring interfaces or event monitors.

18

This example shows a similar query but this time we are only querying the wait time metrics / wait time hierarchy.

19

This example shows us obtaining the times for each individual component area for our statement of interest. This provides a different perspective on where time is spent in the statement, as we are showing the total times broken down by component, and the processing time spent in each individual component (as opposed to the individual processing + wait times). This view can be useful to isolate which components are of interest irrespective of whether the time spent in those components is processing vs. wait time. Both the processing + wait times can be identified within the individual component, but we do not capture the individual waits within a specific component due to memory / performance overhead. Also note that the output above shows two hierarchies: both the relationship of total times to each other, as well as the relationship of processing times to each other (which is a subset of the waits + processing times hierarchy).
A few additional points that apply to all these examples:
In this example we are filtering by comparing against the statement text. Starting in 9.7 we have further introduced an executable_id identifier that allows you to uniquely identify a section in the package cache. If we had obtained this identifier previously, we could have passed it directly as input to MON_GET_PKG_CACHE_STMT_DETAILS in order to filter out the statement we wanted at source (note that in this example this was done in order to retrieve the actual statement text for each statement in the cache).
Note here that we used MON_GET_PKG_CACHE_STMT in order to retrieve the statement text for each statement in columnar form. This data is already contained in the details field from MON_GET_PKG_CACHE_STMT_DETAILS, so we could alternately have chosen to extract it using the XMLTABLE function.

20

21

This section might alternately have been titled "looking for system level performance problems before proceeding with query tuning".

22

This first example shows us two things: how much of the time on the system is being spent during executions of recognized activities (these would be SQL statements, and load utilities in DB2 9.7 and 9.8), and how much of that time is spent performing actual query plan execution vs. just statement processing (total_section_time vs. total_activity_time).
If our system is running well we would expect to see a large percentage of our time spent here, indicating we do not have any unexpected bottlenecks outside of query execution. If however we see significant time being spent outside of query execution, it may be an indication that further investigation and tuning needs to be performed at the system level before delving deeper into a query tuning exercise.
In our example above we see that the activity time represents about half of the total request time on the server, indicating that we are spending a reasonably large portion of our time outside of activity execution. This tells us that we might want to perform a bit more investigation at the system level before we start chasing individual rogue queries. Activity total time is a superset of section time, and here we also see that all the activity time is being spent performing query plan execution. This indicates that our query performance is purely a factor of the speed at which the query plan executes. If the activity time had been significantly larger than the section time, it would generally be an indication that we are spending a portion of our time in non-SQL routines when executing statements, or that we are running Loads on the system. A quick check of the total_load_time would allow us to confirm that the majority of our time is being spent in statement execution.
For this example we used an aggregate of metrics at the connection level to look at the division of time for work done by any connections currently on the system, but its
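A connection-level aggregation like the one described can be sketched as below; the columns are documented on MON_GET_CONNECTION in 9.7, and comparing the three sums gives the activity-vs.-request and section-vs.-activity ratios discussed above:

```sql
-- Compare where request time goes across current connections: total
-- request time vs. activity (statement) time vs. section (plan
-- execution) time.
SELECT SUM(TOTAL_RQST_TIME)    AS TOTAL_RQST_TIME,
       SUM(TOTAL_ACT_TIME)     AS TOTAL_ACT_TIME,
       SUM(TOTAL_SECTION_TIME) AS TOTAL_SECTION_TIME
FROM TABLE(MON_GET_CONNECTION(NULL, -2)) AS T;
```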

23

You'll immediately notice that this example is fairly busy, but at a high level we are doing only a few simple things:
First we have created some simple views to help us obtain a delta for our wait time metrics over a 60 second period. This consists of creating a view that obtains the aggregate metrics in row based form for the database by summing all the service subclasses, creating a temporary table where we can cache our initial sample, and creating a third view that will compute a delta for our metrics from the baseline we've stored.
Note that this example shows one key benefit of the row based time metrics format, which is that it allows deltas to be computed generically without knowledge of the specific metrics returned.
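The baseline/delta pattern can be sketched roughly as follows. DB_METRICS is a hypothetical view summing the row-based formatter output across all service subclasses; the exact DDL in the presentation differs, but the shape of the delta computation is the point:

```sql
-- Cache one sample of the row-based metrics, then compute per-metric
-- deltas after an interval. Because the format is generic (metric name +
-- value), no knowledge of the individual metrics is needed.
DECLARE GLOBAL TEMPORARY TABLE BASELINE AS
  (SELECT METRIC_NAME, TOTAL_TIME_VALUE FROM DB_METRICS)
  DEFINITION ONLY ON COMMIT PRESERVE ROWS;

INSERT INTO SESSION.BASELINE
  SELECT METRIC_NAME, TOTAL_TIME_VALUE FROM DB_METRICS;

-- ... wait for the sampling interval, then:
SELECT C.METRIC_NAME,
       C.TOTAL_TIME_VALUE - B.TOTAL_TIME_VALUE AS DELTA_TIME
FROM DB_METRICS C
JOIN SESSION.BASELINE B ON B.METRIC_NAME = C.METRIC_NAME
ORDER BY DELTA_TIME DESC;
```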

24

To obtain our sample we then simply cache a baseline of our database metrics, wait for the sampling period, and then select the delta values from our delta view (in this case we select the top 5 times only in order to fit onto a single slide). Looking at the results, what we see is that we have some significant wait times to investigate.
The first wait we need to address is client idle wait time. This time represents both the think time on the client side between requests, and the time spent processing results sent from the server to the client. In this case we were running an OLTP workload, and the data the queries were processing was comparable to the results returned to the client side, so as long as our applications are not experiencing response time problems, we should be able to safely ignore this time as normal for this case.
Note also that although we didn't cover the system time spent hierarchy in this session, client_idle_wait_time is not included as a component of total_wait_time since it is not incurred during request processing; you can see from the output that it is considered part of neither total_rqst_time nor total_wait_time for this reason.
Moving on, we can see that our request time appears to be basically 50% wait time and 50% section processing, with all the wait time being log disk wait time, indicating it is most likely time spent performing commit related processing. In this case though we are running a heavy OLTP workload with frequent commits and not very powerful disks, so once again for our environment this level of waits is expected.
Having ruled out these two waits as being indicative of a problem, the majority of our time spent is left in section / query processing. This gives us further confidence that there are no general system-wide issues affecting our query performance and that our investigation should now center on what the queries themselves are doing.
As a counter-example, if we had determined instead that most of our time was going towards query compilation, it would have been a strong indicator that the issue was not related to query execution itself, and that would have led our investigation towards package cache tuning, and/or excessive usage of literals in the SQL from certain applications / enablement of the statement concentrator, rather than looking at the query plans.

25

26

27

The next several examples show a few basic Top N type queries for statements in the package cache using time spent metrics and the MON_GET_PKG_CACHE_STMT table function. It's important to note that this function will show data for both static and dynamic statements unless specific filtering is performed, and therefore offers some significant advantages over the older dynamic SQL snapshot (it is also significantly more lightweight).
The first example finds the 5 statements with the most request time in the server; these represent the statements that have spent the longest amount of time on the server in comparison to other statements. Note that the way we have constructed the example, this may show statements with long response times, or statements executed many, many times. Refinements can be made using the num_executions count to separate these cases.
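A top-5 query of this kind can be sketched as:

```sql
-- Top 5 statements by total time spent in the server. NUM_EXECUTIONS is
-- included to separate long-running statements from frequently run ones.
SELECT STMT_EXEC_TIME,
       NUM_EXECUTIONS,
       SUBSTR(STMT_TEXT, 1, 60) AS STMT_TEXT
FROM TABLE(MON_GET_PKG_CACHE_STMT(NULL, NULL, NULL, -2)) AS T
ORDER BY STMT_EXEC_TIME DESC
FETCH FIRST 5 ROWS ONLY;
```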

28

The second example here ranks statements in terms of CPU time consumption. This shows the most CPU intensive statements on the database and can be very useful to help determine who the biggest contributors to CPU bottlenecks are.
The third example ranks statements by the amount of time they have spent performing I/O operations (including both bufferpool and direct read / read lob activity). This will show the statements that are driving the most I/O activity in the server.
Similar queries can be constructed based on any of the available metrics, but the above represent some of the more common ones you might be interested in.
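The CPU and I/O rankings follow the same pattern; the set of time columns summed for "I/O time" below is one reasonable choice, not the only one:

```sql
-- Rank statements by CPU consumption.
SELECT TOTAL_CPU_TIME, NUM_EXECUTIONS,
       SUBSTR(STMT_TEXT, 1, 60) AS STMT_TEXT
FROM TABLE(MON_GET_PKG_CACHE_STMT(NULL, NULL, NULL, -2)) AS T
ORDER BY TOTAL_CPU_TIME DESC
FETCH FIRST 5 ROWS ONLY;

-- Rank statements by time spent performing I/O (bufferpool plus direct).
SELECT (POOL_READ_TIME + POOL_WRITE_TIME +
        DIRECT_READ_TIME + DIRECT_WRITE_TIME) AS TOTAL_IO_TIME,
       SUBSTR(STMT_TEXT, 1, 60) AS STMT_TEXT
FROM TABLE(MON_GET_PKG_CACHE_STMT(NULL, NULL, NULL, -2)) AS T
ORDER BY TOTAL_IO_TIME DESC
FETCH FIRST 5 ROWS ONLY;
```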

29

This example shows another useful metric we can compute with time spent: the relative velocity. Here we are looking at the relative velocity at the statement level. The relative velocity shows the degree of progress of a statement's execution relative to the progress it would make if it were not spending any time blocking / waiting on subsystems. In the example above we have selected the 5 statements with the worst relative velocities. You can see that only 1-2% of the request time on the server for these statements is actually spent making progress! Also keep in mind that our computation is based on the total_act_time, which includes only the time spent actually executing the statement (and not other costs such as compilation / optimization). So this is specifically reflecting the cost of subsystem waits within the statement execution (such as I/O operations, lock waits, etc).
A logical next step, if we wanted to drill down further to understand the execution profile of the statements, would be to turn around and rank the individual wait and component times for each statement of interest. We will look at this in the next section.
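A relative-velocity ranking along these lines can be sketched as (processing time divided by total activity time):

```sql
-- Relative velocity: percentage of activity time actually spent
-- processing rather than waiting. Low values indicate statements that
-- spend most of their execution blocked on subsystems.
SELECT CASE WHEN TOTAL_ACT_TIME > 0
            THEN ((TOTAL_ACT_TIME - TOTAL_ACT_WAIT_TIME) * 100)
                 / TOTAL_ACT_TIME
            ELSE 100 END AS RELATIVE_VELOCITY,
       SUBSTR(STMT_TEXT, 1, 60) AS STMT_TEXT
FROM TABLE(MON_GET_PKG_CACHE_STMT(NULL, NULL, NULL, -2)) AS T
ORDER BY RELATIVE_VELOCITY ASC
FETCH FIRST 5 ROWS ONLY;
```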

30

Another common query we can perform using MON_GET_PKG_CACHE_STMT is to look at the ratio of rows read vs. rows returned for a given statement. This is an excellent indication of index effectiveness in our queries. A significantly high ratio of rows read for every row returned tells us that we are doing extensive table scanning, and that we should probably investigate the tables underlying the query for possible missing indexes (depending on how frequently the query is executed). Also, by keeping track of such data for our workloads over time we can detect problematic changes in existing queries (such as a dropped index, for example, suddenly leading to a degradation in the ratio).
In the example above we have selected the top 10 SQL statements in the package cache with the worst ratio of rows read to rows returned. The values shown indicate significant inefficiency in two of our queries that would warrant further investigation.
In the first one, though, we appear to be doing a count(*), which would account for the table scan. In the second one, however, we are selecting the minimum value for a particular column; this could be sped up significantly by an index, so it would be worth investigating whether this is a query that is executed on a regular basis, which might justify such a change (or whether there was already supposed to be an index on this column).
We may not always be able to get to the root cause of the problem in such an easy manner, but this type of query can be very effective as a first line of investigation.
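A rows-read-per-row-returned ranking of this kind can be sketched as:

```sql
-- Statements with the worst rows-read to rows-returned ratio; a high
-- value suggests table scans and possible missing indexes.
SELECT ROWS_READ,
       ROWS_RETURNED,
       CASE WHEN ROWS_RETURNED > 0
            THEN ROWS_READ / ROWS_RETURNED
            ELSE ROWS_READ END AS READ_PER_RETURNED,
       SUBSTR(STMT_TEXT, 1, 60) AS STMT_TEXT
FROM TABLE(MON_GET_PKG_CACHE_STMT(NULL, NULL, NULL, -2)) AS T
ORDER BY READ_PER_RETURNED DESC
FETCH FIRST 10 ROWS ONLY;
```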

31

In our previous examples using MON_GET_PKG_CACHE_STMT we could have trivially extracted the executable ID column in addition to the statement text, which gives us an efficient way to identify or reference that particular statement in future queries (or for joins).

32

33

34

Here we query the breakdown of wait and processing times for a statement of interest identified by one of our previous examples to see where we are spending our time. Passing the executable ID for the statement as input to the package cache table function causes us to filter only the statement of interest at source (particularly useful on DPF or PureScale configurations), and is a very efficient way to retrieve these metrics.
This statement is really two queries combined into one: we find the statement with the most time in the server and then pass its executable ID directly as input into the MON_GET_PKG_CACHE_STMT_DETAILS function to obtain the metrics for that statement.
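The combined query can be sketched roughly as below; on some fix packs you may need to retrieve the executable ID in a separate first step rather than inside a common table expression:

```sql
-- Drill into the wait/processing time breakdown for the single most
-- expensive statement, filtering at source by its executable ID.
WITH TOP_STMT (EXECUTABLE_ID) AS
  (SELECT EXECUTABLE_ID
   FROM TABLE(MON_GET_PKG_CACHE_STMT(NULL, NULL, NULL, -2)) AS S
   ORDER BY STMT_EXEC_TIME DESC
   FETCH FIRST 1 ROW ONLY)
SELECT T.METRIC_NAME, T.TOTAL_TIME_VALUE, T.COUNT
FROM TOP_STMT,
     TABLE(MON_GET_PKG_CACHE_STMT_DETAILS(NULL, TOP_STMT.EXECUTABLE_ID,
                                          NULL, -2)) AS D,
     TABLE(MON_FORMAT_XML_TIMES_BY_ROW(D.DETAILS)) AS T
ORDER BY T.TOTAL_TIME_VALUE DESC;
```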

35

From the actual output we can see that our statement spends a significant time in non-section processing, and specifically in routine user code processing. This is an indication that the bulk of the statement's costs are coming from code the statement is executing outside of DB2 (generally user-defined stored procedures or UDFs). Depending on whether the cost of routines is expected to be this high for the statement, this may indicate that an investigation is needed into the implementation of the routines themselves. We can also see from the example here that the statement has only one wait time, spending a modest percentage of its overall execution time blocking on I/Os. This means the query has been able to fulfill its processing without any blocking on lock waits or on other subsystems.
An additional note on routine times worth mentioning is that, for performance reasons, TOTAL_ROUTINE_USER_CODE_PROC_TIME is not collected for NO SQL routines by default. If investigation into routines is performed, it is therefore recommended that the mon_act_metrics database configuration parameter (or the collect activity metrics clause for the associated workload object) be set to EXTENDED.

36

A surface discussion is provided here on the characteristics of wait times and processing times in the activity / statement dimension to give further background on time spent. Due to time limitations we will not go into detail on the individual metrics, however a table of the metrics in the activity hierarchy is included in the presentation for further reference.

37

The table above gives a brief cheatsheet on the major wait times added in 9.7 related to query execution and their interpretation. Most represent subsystem waits where tuning actions may be taken. Others help complete the picture of where server threads spend their time waiting.

38

39

40

The table above gives a brief cheatsheet on the major component areas in 9.7 related to query execution and their interpretation. The components represent general areas in the query execution pipeline where we are spending our time. Spending significant time in unexpected component areas may indicate unexpected / inefficient application behavior on the system. By looking at the processing vs. total time spent in components one can also differentiate whether the time is actually spent processing in that component, or whether the time is spent waiting on some subsystem while in that component area. For the latter, one would typically want to switch to an examination of the specific wait times in different dimensions to see which subsystems might be a bottleneck and offer opportunities for tuning.

41

A few other key metrics bear mention in the discussion of time spent that are general summary metrics not found in the summary of metrics appendix of this presentation. The first two represent summary times within the time spent hierarchy, while the third is an important general metric.
STMT_EXEC_TIME represents the total time spent by all foreground / agent threads in the engine working on behalf of the statement in question, and therefore represents the net effort expended by DB2 towards executing this statement. Note that while the section execution metrics are for individual sections only (and therefore don't include the cost of nested SQL statements), the STMT_EXEC_TIME specifically *does* include the time spent executing nested SQL in order to provide an indication of the aggregate cost of the statement.
TOTAL_ACT_WAIT_TIME represents the total time threads spent blocking on subsystems within the section, and gives an indication of how efficiently the section processing is proceeding. Computing (TOTAL_ACT_TIME - TOTAL_ACT_WAIT_TIME) will give you the total amount of time threads were eligible to process or consume CPU during this section execution. This is a proxy for the CPU time consumed by agents on the server, but is not exactly the same thing, because it measures the time spent both when threads are consuming CPU and when they are queued to run on the CPU.
If the actual CPU consumption is desired, we provide the TOTAL_CPU_TIME metric, which indicates the actual CPU consumed by agent threads when performing statement processing. Note that obtaining CPU time from the system is performance intensive, so we provide CPU times only in aggregate rather than per component.

42

43

The key aspect of runtime explain is that it allows an explain to be performed against the actual executable byte-code or section the compiler / optimizer produces when building a query plan. This guarantees the explain you are seeing matches the actual plan used to execute your statement and also opens up some interesting capabilities in terms of integrating runtime data with explains, which we will discuss in a few slides.

44

The runtime explain capabilities can be accessed via one of four stored procedures which allow you to explain section data from multiple sources and store the results in the standard explain tables.
EXPLAIN_FROM_CATALOG allows you to perform an explain on a section in a previously compiled static package stored in the database catalog tables
EXPLAIN_FROM_SECTION allows you to perform an explain on a section found in the current package cache, or from the historical package cache table data collected by a package cache event monitor (when detailed data is collected)
EXPLAIN_FROM_ACTIVITY allows you to perform an explain on a section captured along with a specific execution of an SQL statement from table data collected using an activity event monitor
EXPLAIN_FROM_DATA provides an alternate formatting mechanism that takes a generic binary section object from an unspecified source and formats it. The binary section data can be obtained, for example, from the catalogs, the package cache or activity event monitor tables. With this function, though, you have the flexibility to extract and store this data elsewhere for future use without needing to keep the original event monitors around.
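A call to EXPLAIN_FROM_SECTION for a section in the in-memory package cache can be sketched as below; :exec_id stands for an executable ID previously retrieved from MON_GET_PKG_CACHE_STMT:

```sql
-- Explain the compiled section directly from the package cache.
-- 'M' = in-memory package cache; the five trailing parameter markers are
-- output parameters identifying the rows written to the explain tables.
CALL EXPLAIN_FROM_SECTION(:exec_id, 'M', NULL, 0, NULL,
                          ?, ?, ?, ?, ?);
```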

45

The section actuals or runtime statistics capability is one of the key benefits we can add by leveraging runtime explain. That is, we can integrate the query plan information provided by explain with actual statistics collected at the operator level.
In DB2 9.7 FP1 we specically introduced the ability to capture cardinalities for individual operators within the plan for comparison with estimates.
Future versions will be able to leverage this infrastructure to provide additional detailed metrics within the plan.

46

47

In order to capture and examine section actuals for statements we first need to do a bit of setup:
If we haven't already created explain tables on our database, we need to execute the DDL provided with DB2 to set these up. These are the output tables where the explain data produced when running one of the new explain procedures (as well as the compiler-based EXPLAIN) is stored.
We then need to create an activity event monitor to capture our statements and their associated section data; in this case we assume the user already has a specific tablespace set aside to collect this data. A separate tablespace is advisable in cases where this event monitor is used heavily, as by capturing significant information at the statement level it may generate large volumes of data. We have also made this a MANUALSTART event monitor so that we can enable it for periods of SQL tracing without the DBMS implicitly enabling it on us if the database is shut down.
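The setup steps above can be sketched as follows; MYTBSPACE is a placeholder tablespace name, and the EXPLAIN.DDL script shipped under sqllib/misc can be used instead of the stored procedure call:

```sql
-- Create the explain tables (one documented option).
CALL SYSPROC.SYSINSTALLOBJECTS('EXPLAIN', 'C',
                               CAST(NULL AS VARCHAR(128)),
                               CAST(NULL AS VARCHAR(128)));

-- Create a manual-start activity event monitor writing to tables in a
-- dedicated tablespace.
CREATE EVENT MONITOR ACTEVMON FOR ACTIVITIES
  WRITE TO TABLE MANUALSTART;
```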

48

We now need to actually capture the execution of our SQL statement(s) of interest using the activity event monitor. In our case we will use the activity event monitor to capture what is effectively an SQL trace of statements run over a brief period of time. We'll show two approaches you can use to accomplish this. The first, shown here, involves a case where we want to run a specific piece of SQL and capture the actuals for it. In order to accomplish this there are several steps to follow:
We first enable the event monitor; this will cause it to start capturing activity data that is collected on the system
Second, we use WLM_SET_CONN_ENV to update the attributes for our connection; we are specifying three things here:
(a) collectactdata indicates that each activity that executes should collect both details and section data to the event monitor (section data being the key element we'll need to obtain the actuals)
(b) We have specified that collection will occur on all partitions. In our case we are running on a single node instance so this parameter is superfluous, but if we were running on DPF we would want to ensure the actuals are collected at each partition, since the data is collected on a per partition basis
(c) Finally, we enable the collection of section actuals so that we actually capture the cardinality values in memory during query execution at runtime
Once these steps are done, any statements executed on this connection will be captured using the activity event monitor, so we now run the statement we are interested in capturing. When we have run our statement of interest, we call WLM_SET_CONN_ENV once again to disable collection of activity data, and of the section actuals, on this connection. If we do not intend to use the event monitor further we would also shut it down.
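The per-connection steps can be sketched as below, following the documented 9.7 FP1 usage of WLM_SET_CONN_ENV (a NULL application handle means the current connection):

```sql
-- Start the activity event monitor created earlier.
SET EVENT MONITOR ACTEVMON STATE 1;

-- Enable detailed activity capture plus section actuals on this
-- connection.
CALL WLM_SET_CONN_ENV(NULL,
  '<collectactdata>WITH DETAILS, SECTION</collectactdata>
   <collectactpartition>ALL</collectactpartition>
   <collectsectionactuals>BASE</collectsectionactuals>');

-- ... run the statement(s) of interest ...

-- Disable collection again when done.
CALL WLM_SET_CONN_ENV(NULL,
  '<collectactdata>NONE</collectactdata>
   <collectsectionactuals>NONE</collectsectionactuals>');
```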

49

The second approach we can use involves the case where you want to capture data for statements of interest that others are running on the database system, as opposed to statements you run directly on your own connection. This approach would be appropriate if you were troubleshooting a live system, or if you were testing a workload where it would be difficult to enable collection on an individual connection basis. The steps we'd take for this approach are the following:
As before we must rst enable the event monitor so that it will capture activity data
The next step is to enable the collection of section actuals on the system. In this case we assume we do not know which connections will be running the statements of interest so we enable this for the entire database.
The final step here is to enable collection of the appropriate activity data events; once again we enable collection of data with details and with the section. Here we have enabled this collection for the entire SYSDEFAULTUSERCLASS for simplicity (which encompasses the entire user workload unless customized WLM service classes are configured). In practice it may be more desirable to enable the collection for a custom defined workload or service class, as this allows the volume of data captured to be better managed.
Once these steps are done any statements executed in the default service class will be captured using the activity event monitor. If we are manually running the test we would run the workload now, otherwise we must wait for the event of interest to occur.
Once the statement(s) of interest have completed we alter the SYSDEFAULTSUBCLASS to stop collecting activity data. If we do not intend to use the event monitor or section actuals further, we would also shut down the event monitor, and turn off the collection of actuals.
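A rough sketch of this system-wide approach, again using the hypothetical event monitor name ACTEVMON:

```sql
-- 1. Enable the event monitor.
SET EVENT MONITOR ACTEVMON STATE 1;

-- 2. Enable section actuals collection database-wide
--    (SECTION_ACTUALS is a database configuration parameter
--    introduced in DB2 9.7 FP1).
UPDATE DATABASE CONFIGURATION USING SECTION_ACTUALS BASE;

-- 3. Collect activity data with details and section for the
--    default user service subclass.
ALTER SERVICE CLASS SYSDEFAULTSUBCLASS
  UNDER SYSDEFAULTUSERCLASS
  COLLECT ACTIVITY DATA ON ALL DATABASE PARTITIONS
  WITH DETAILS AND SECTION;

-- ... run the workload, or wait for the event of interest ...

-- 4. Turn collection off again once done.
ALTER SERVICE CLASS SYSDEFAULTSUBCLASS
  UNDER SYSDEFAULTUSERCLASS
  COLLECT ACTIVITY DATA NONE;
SET EVENT MONITOR ACTEVMON STATE 0;
UPDATE DATABASE CONFIGURATION USING SECTION_ACTUALS NONE;
```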

50

Having captured the statements of interest, we now need to identify the captured activity entries and their IDs in order to perform an explain and examine the section actuals. Depending on the scenario in question there are multiple ways we can locate the activity of interest. We will show two methods here.
In our example above we select the unique set of activity identifiers (the application id, uow id, and activity id) for all those entries in the activity tables that match the statement text for our statement of interest. In this particular case there is only one entry for the statement we captured, so we can note the identifiers for subsequent use in generating the explain.
A variation on this approach would be to set a specific client info field, such as the application name or accounting string, when executing the query, and use the presence of those tags to find the query of interest.
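A sketch of such a lookup follows. The target table name ACTIVITYSTMT_ACTEVMON and the statement text are hypothetical; substitute your own event monitor's target table and the text of the statement you captured.

```sql
-- Find the identifiers of the captured activity by statement text.
SELECT APPL_ID, UOW_ID, ACTIVITY_ID
  FROM ACTIVITYSTMT_ACTEVMON
 WHERE STMT_TEXT LIKE 'SELECT COUNT(*) FROM T1%';
```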

51

An alternate way to approach our statement problem investigation is to capture a trace of the SQL running on the system over a period of time using the activity event monitor, and then pinpoint the statements of interest using the associated captured metrics. Note that the activity event monitor captures all the time spent and related metrics we have previously been examining and stores them in XML form, so they can easily be accessed using the previously demonstrated row based formatters.
In our example above we again select the unique set of activity identifiers (the application id, uow id, and activity id), but this time we are using the generic row based metrics formatter on the captured metrics XML document in order to obtain the TOTAL_CPU_TIME metric for each statement captured. We then rank by the CPU time and obtain the top 5 most expensive statements captured in terms of CPU consumed.
Note that we also display the statement text as well as the executable id for illustrative purposes. Yet another way to locate our activity of interest would be to find our statement of interest using MON_GET_PKG_CACHE_STMT and, assuming we had captured activity history data over an appropriate period, use the executable id from our statement of interest to locate the captured activity entry (and many further variations exist).
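A sketch of this kind of ranking query follows, assuming the hypothetical activity table name ACTIVITY_ACTEVMON; the MON_FORMAT_XML_METRICS_BY_ROW formatter unpacks the captured metrics XML document into name/value rows.

```sql
-- Rank captured activities by TOTAL_CPU_TIME extracted from the
-- metrics XML stored by the activity event monitor.
SELECT A.APPL_ID, A.UOW_ID, A.ACTIVITY_ID,
       M.VALUE AS TOTAL_CPU_TIME
  FROM ACTIVITY_ACTEVMON AS A,
       TABLE(MON_FORMAT_XML_METRICS_BY_ROW(A.DETAILS_XML)) AS M
 WHERE M.METRIC_NAME = 'TOTAL_CPU_TIME'
 ORDER BY M.VALUE DESC
 FETCH FIRST 5 ROWS ONLY;
```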

52

Now that we've captured and identified the activity of interest we can proceed to format it to obtain the explain output with section actuals.
We have already created the explain tables, so we can proceed to call the EXPLAIN_FROM_ACTIVITY procedure to generate the explain data. We pass in the activity identifiers we collected in the previous step as well as the name of the activity event monitor. The procedure will look up the event monitor tables and search for the relevant activity and section data. If any of the required data cannot be found, an error will be returned. Assuming we performed all the previous steps properly, the procedure will indicate that the formatting succeeded.
The final step in the process is to use the db2exfmt tool to format the generated data in the explain tables into a human readable text report. We show here the default options required to format the most recent explain captured on our database (here we are just using the SAMPLE database).
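A sketch of the procedure call and formatting step follows. The application id value is a placeholder; use the APPL_ID, UOW_ID, and ACTIVITY_ID noted in the previous step, along with your event monitor's name.

```sql
-- The five output parameters return the keys of the generated
-- explain instance (requester, time, source name/schema/version).
CALL EXPLAIN_FROM_ACTIVITY(
  '*LOCAL.db2inst1.101010101010',  -- application id (placeholder)
  1,                               -- uow id
  1,                               -- activity id
  'ACTEVMON',                      -- activity event monitor name
  NULL,                            -- explain schema (default)
  ?, ?, ?, ?, ?);

-- Then format the most recent explain from the command line:
--   db2exfmt -d sample -1 -o explain.out
```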

53

There are several parts to the formatted explain which go beyond the scope of this presentation. For our purposes here we will simply focus on the plan graph itself. If you are familiar with the explain facility you will see that this appears to be a standard explain graph with one specific exception: in this case, since this was an explain from section on an activity event monitor and we had captured the section actuals, we actually see two numbers above each operator.
The top number is the cardinality estimate, but there is an additional number underneath it (highlighted in blue here) which is the actual cardinality from our query execution.
Looking at our specific example you can see that the actuals were quite far off from the estimates; this is in no small part due to the fact that we created fresh tables for the example and inserted data without performing any runstats.
In a production system, however, discrepancies between actuals and estimates can provide you with important clues on inefficiencies affecting your query performance. Often these will point to stale table statistics, or inaccuracies that can be rectified by altering the way table statistics are produced using runstats. Of course, as with the SQL compiler explain variant, the more general plan data can also be used to identify tuning opportunities such as missing indexes, or inefficient sections of the plan caused by certain query characteristics (many of which can be addressed by simply modifying the way the query is written). All said, the topic of query tuning based on explain data is easily large enough to warrant its own presentation (or two), so I will not even attempt to do it justice in a single slide.
Hopefully what we have accomplished though is to show you how this data can be captured and examined in order to provide additional ammunition when searching for query problems using explain.

54

55

In closing I hope this presentation has given you some ideas on how to leverage some of the new DB2 diagnostics for analyzing SQL performance to solve problems in your system. As always there is no hard and fast rule when approaching performance problems, but the methods outlined in this presentation provide approaches that will yield useful query performance information in a wide variety of contexts.

56

57

58

59

David Kalmuk is a technical lead and architect for the DB2 for LUW product, specializing in the process model, workload management, and monitoring areas. David has played a lead role in the design and architecture of numerous technologies in DB2 over the years including the new Performance Monitoring features in DB2 9.7 and PureScale, the threaded engine, DB2's cross node messaging architecture for partitioned environments, and the connection concentrator. Most recently he has been leading the development of new Workload Management features and technologies. David has been a member of the DB2 team since 2000.

60
