Вы находитесь на странице: 1из 7

Instance efficiency The instance efficiency shows information for many of the common hit ratios.

DBAs should compare them historically. Two approaches exist ratio tuning (proactive) and waits tuning (reactive, immediate). Look for these things: A Buffer NoWait % of less than 99%. This is the ratio of hit request for a specific buffer, where the buffer was immediately available in memory. If the ratio is low, the hot blocks are being contended for what should be found in Buffer Wait section. A Buffer Hit % of less than 95% - this is the ratio of hits on a request for a specific buffer and the buffer was in memory, instead of needing to do a physical I/O. Although originally one of the few methods of measuring efficiency it is still an excellent method for showing how often you would need to do a physical I/O, which merits further investigation as to the cause. Unfortunately, if you have unselective indexes that are frequently accessed, it will drive your hit ratio higher which could be a misleading indication of good performance. When you effectively tune your SQL and have effective indexes on your system, the issue is encountered as frequently and the hit ratio is a better performance indicator. A Library Hit of less than 95%. A lower library cache hit ratio indicates that the SQL is being pushed out of the shared pool too early (shared pool too small). A lower ratio could indicate that bind variables are not being used or some other issue is causing SQL not to be reused (in which case a smaller shared pool may only be a band aid that will potentially fix a library latch problem that may result). You must first fix the problem (use bind variables or CURSOR_SHARING) and then appropriately size the shared pool. An In-memory sort % of less than 95% in OLTP in an OLTP system you really do not want to do disk sorts. Setting the PGA_AGGREGATE_TARGET (or SORT_AREA_SIZE) eliminates this problem. A soft Parse % of less than 95% - as covered in the Load Profile section (last section), a soft parse ratio that is below around 80% indicates that SQL is not being reused and needs to be investigated. A Latch Hit % of less than 99% is usually a big problem. Finding the specific latch leads you to solving the issue.

The hit ratios are a great proactive barometer. If an index was dropped on a frequently accessed column, the buffer hit ratio could drop dramatically. If an index was added to a table, it could cause the buffer hit ratio to soar if it causes a table join to occur in the wrong order, causing massive buffers to be read. A library hit ratio that rises or falls greatly from one day to another will give you indications of changing SQL patterns. Latch hit ratios changes can reveal contention issues that need to be investigated more. Top Wait events This section is the most revealing if you need to quickly eliminate a bottleneck. This section shows top five wait events, the fuill list of wait events and the background wait events. If TIMED_STATISTICS is TRUE, then the events are ordered in time waited; if FALSE then the events are ordered by number of waits. Here is an example of problems on a high volume system: Events Db file sequential read Latch free LOG FILE SYNC Log file parallel write Waits 18,9777,104 4,016,773 1,057,224 1,054,006 Wait Time (cs) 22,379,571 2,598,496 733,490 503,695 Wait Time % 82.29 9.55 9.55 1.85

Db file parallel write

1,221,755

404,230

1.49

In this listing, we see a large number of waits related to a single block (db file sequential reads) and also waits for latches (latch free). We also see some pretty high waits for some of the writing to both data files and redo log files, as well as other issues with log file contention. To solve these problems, we must further investigate the granular reports in other sections of statspack. The following are most common problems: DB file scattered read this generally indicates waits related to full tablescans. As full tablescans are pulled into memory, they are scattered throughout the buffer cache because it is unlikely they will fall into contiguous buffers. The large numbers usually mean that indexes are suppressed or missing. This could also be preferred because it may be more efficient to do a full table scan instead of an index scan. Check to see full table scans are necessary when you see these waits. Try to cache small tables to avoid reading them over and over again. DB file sequential read this indicates a single block read (an index read, for example). A large number can indicate poor joining orders of tables or unselective indexing. This value will be normally large for high transaction system. You should correlate this wait with other known issues within STATSPACK, such as inefficient SQL. Check to ensure index scans are necessary and check join orders for multiple joins. The DB_CAHE_SIZE is a determining factor in how often these waits show up. Hash-area joins causing problems should show up in the PGA memory, but they are memory hogs that could cause high wait numbers for sequential reads or they can also show up as direct path read/write waits. Buffer busy wait this is a wait for a buffer that is being used in an unsharable way or is being read into the buffer cache. Buffer busy waits should not be greater than 1%. Check the buffer wait statistics section or V$WAITSTAT to find out if the wait is on segment header. If so, increase the freelists and/or groups or increase the PCTUSED to PCTFREE gap. (if you are using ASSM, Oracle odes this for you). If the wait is on an undo header, you can address this by adding rollback segments. If it is on an undo block, you need to decrease data density of the table driving this consistent read or increase the DB_CACHE_SIZE.If the wait is on a data block, you can move data to another block to avoid this hot block, increase the freelists on the table or use Locally Managed Tablespaces LMTs. If it is on an index block rebuild the index, partition it or use a reverse key index. To prevent buffer busy waits related to data blocks, you can also use a smaller block size fewer records fall onto this block making it less hot. When a DML occurs, ORACLE writes information into the block, including all users who are interested in the state of the block (Interested Transaction List ITL). To decrease waits on this area, you can increase INTRANS, which will create the space in the block to allow multiple ITL sorts. You can also increase the PCTFREE on the table where the block exists. Latch free latches are low level queuing mechanisms (they are actually referred to as mutual exclusion mechanisms) used to protect shared memory structures in the SGA. Latches are like locks on memory that are quickly obtained and released. Latches are used to prevent concurrent access to a shared memory structure. If the latch is not available, a latch miss is recorded. Most latch problems are related to the failure to use bind variables (library cache latch), redo generation issues (redo allocation latch), buffer cache contention (buffer cache Iru chain), and hot blocks in the buffer cache (cache buffers chain). There are also cache latches related to bugs; check the Metalink for bug reports. Enqueue an enqueue is a lock that protects a shared resource. Locks protect shared resources such as data in a record. It includes a queuing mechanism which is FIFO (first in, first out).Note that Oracles latching mechanism is not FIFO. Enqueue waits usually point to ST enqueue, HW enqueue, and the TX4 enqueue. The ST enqueue is used for space management and allocation for dictionary managed tablespaces. Use LMTs or try

to manage tablespaces. HW enqueues are used for are used for a high watermark of a segment; manually allocating the extents can circumvent this wait. TX4 are the most common enqueue waits. The TX4 enqueue waits are usually the results of one of 3 issues: a) duplicates in a unique index (commit or rollback to free the enqueue); b) the second is multiple updates to the same bitmap index fragment; c) the most common when several users update the same block. If there are no free ITL blocks, a block level lock would occur. You can easily avoid this scenario by increasing INTRANS and/or MAXTRANS to allow multiple slots and or increasing PCTFREE on the table. Finally, there is a way to get TM locks, which are table locks. If you have foreign keys, be sure to index them to avoid this general locking issue. You can also use a smaller block size so that there are fewer rows in the block; thus greater data concurrency is allowed. Logfile Switch all commit requests are waiting for logfile switch (archiving needed) or logfile switch (checkpoint incomplete). Ensure that archiving disk is not full or slow. DBWR may be too slow due to I/O. You may need to more or larger redo logs, and you may potentially need to add DBWR if it is the problem. Log buffer space the waits occur because you are writing the log buffer faster than LGWR can write it to the redo log files, or because log switches are too slow. To address this problem, increase the size of the redo log files, increase the size of the redo log buffer, or get faster disks to write to. You may even consider using solid-state disks for their high speed. Log file sync when a user commits or rolls back data the sessions redo is flushed to the redo logs from the redo log buffer by LGWR. The users process must wait for this flush to successfully complete. To reduce Log File Sync waits, try to commit more records. Try to commit a batch of 50 instead of one at a time if possible. Put redo logs on faster disks or alternate redo logs on different physical disks to reduce the archiving effect on LGWR. DO not use RAID% because it is very slow for applications that write a lot; potentially consider file system direct I/O or raw devices which are very fast at writing information. Idle events these are also several wait events that are on bottom of the report and might include SQL*Net message to/from client.

STATSPACK Wait Problem Resolutions Wait Problem Sequential read Scattered read Free buffer Buffer busy segment header Buffer busy data blocks Buffer busy undo header Buffer busy undo block Latch free Enqueue ST Enqueue HW Enqueue TX Enqueue - TM Log buffer space Potential Fix Indicates many index reads. Tune the code, especially joins Indicate many full tablescans, tune the code, cache small tables Increase the DB_CACHE_SIZE; shorten the checkpoint; tune the code Add freelists or freelist groups Separate hot data, use reverse key indexes, smaller blocks, or increase initrans or maxtrans Add rollback segments or areas Commit more; larger rollback segments Investigate the details Use LMT or preallocate large extents Preallocate extents above HW Increase initrans and/or maxtrans on the table or index Index foreign keys, check application locking of tables Increase the Log Buffer; faster disks for the redo logs

Log file switch Write complete waits Idle event Log file sync

Archive destination slow or full; add more or larger redo logs Add database writers, checkpoint more often; buffer cache too small Ignore it Commit more records at a time, faster redo log disks, raw devices

Top SQL Statements The most resource-intensive SQL statements in the database are listed next, in descending order of buffer gets. Evaluate the explain plan for the query to find out why it performs so many disk reads. If the disk reads are not high, but the buffer gets are high and the executions are low, then the query is either using a bad index or performing a join in the wrong order (buffer gets 166,000,000, executions 6,500, gets per exec 25,000, % total 2%, hash value 1577172347). After listing the physical commands with the most cumulative buffer gets, a second listing of the command is provided, this time ordered by the greatest number of physical reads. A third listing of commands orders the executions and the fourth orders by parse calls. TIP. Tuning the top 25 buffer get and top 25 physical get queries has yielded system performance gains from 5% to 5,000%. The SQL section of the STATSPACK report tells you which queries to potentially tune first. The top 10 percent of your statements should not be more than 10% of your buffer gets or disk reads. Instance Activity Statistics The V$SYSSTAT statistics are useful for identifying performance issues not shown in the prior sections. Things to look for: Compare the number of sorts performed on disk to the number performed in memory. Increase the PGA_AGGREGATE_TARGET (or sort are size) to reduce disk sorts. If physical reads are high, you are performing full table scans and you need to add indexes. A large number of consistent gets signals potentially overindexed or nonselective index use. If dirty buffers inspected is too high (over 5%) relative to free buffers requested, the DB_CACHE_SIZE may be too small or you may not be checkpointing often enough. If leaf node splits are high, consider rebuilding indexes that have grown and fragmented. Example of sort and table scan statistics Statistic Sorts (disk) Sorts (rows) Table scans (long tables) Table scans (short table) Total 89 7,659 0 10 0.3 26.1 0.0 0.0 Per Second Per Trans 44.5 3,829.5 0.0 5.0

Well tuned memory-sorting area Statistic Sorts (disk) Sorts (memory) Sorts (rows) Total 2 1,333.888 43,545,4455 Per Second 0.3 34.1 422.7 0 1.2 13.6 Per Trans

TIP. If many sorts are being performed to disk (greater than 1-5% of all rows being sorted), you may need to adjust sort area size or PGA_AGGREGATE_TARGET. Consider key areas: -Consistent gets the number of blocks accessed in the buffer cache for queries without the SELECT FOR UPDATE clause. The valid value for this statistic plus the value of the db block gets statistic constitute what referred as logical read. -Db block gets the number of blocks in the buffer cache that were accessed for DML. -Physical reads The number of data blocks that were read from disk to satisfy SELECT FOR UPDATE and DML. By adding the consistent gets and db block gets, you get the number of Logical Reads. Data cache hit ratio. Hit ratio=(Logical Reads Physical Reads) / Logical Reads TIP. The buffer cache hit ratio should be above 95%. If it is less, increase DB_CACHE_SIZE, provided physical memory is available. -Dirty buffers inspected the number of modified data buffers that were aged out of LRU list. A value here indicates that DBWR is not keeping up. Add more DBWR. TIP. The number of dirty buffers inspected is greater than 0 increase database writers. -Enqueue timeouts the number of times an enqueue (lock) was requested and the specific one was not available. If this statistic is above zero, investigate the locking issues. -Free buffers inspected the buffers that were skipped because they were dirty, pinned or busy. If you subtract those values (dirty buffers inspected and buffer is pinned count) from this statistic, it will leave the buffers that could not be reused due to latch contention. A large number could be a good indicator that the buffer cache is too small. -Parse count the number of times a SQL statement was parsed (total count) -Recursive calls the number of recursive calls to the database. This type of call occurs for a few reasons misses in the dictionary cache, dynamic storage extentions, and when PL/SQL statements are executed. Generally, if the number of recursive calls is four per process, you should check dictionary cache hit ratio and see if there are tables or indexes with a large number of extents. Unless there is a significant number of PL/SQL the recursive calls should not exceed 10%. - Redo size the size of redo information that was written to the redo log files. This information can be used to to help to resize redo logs. - Sorts (disk) the number of sorts that were unable to be performed in memory and therefore required the creation of temp segment in the temporary tablespace. It should not be over 5%. Increase the PGA_AGGREGATE_TARGET (or sort are size) to reduce disk sorts. - Sorts (memory) the number of sorts that were performed in memory. - Sorts (rows) the total number of rows that were sorted. TIP. The sort (disk) statistic divided by the sorts (memory) should not be above 1-5%. If it is, you should increase the PGA_AGGREGATE_TARGET (or SORT_AREA_SIZE) parameter in the initialization file (given that physical memory is available). Remember that memory allocated for SORT_AREA_SIZE is a per-user value, and PGA_AGGREGATE_TARGET is across all sessions. - Table fetch by rowid the number of rows that were accessed by using a ROWID> This ROWID either came from an index or a WHERE ROWID=statement. A high number usually indicates a well tuned application as far as the data goes. - Table fetch continued row the number of rows that were fetched, that were chained, or migrated. TIP. I f chained rows are indicated in the previous parameter, the problem needs to be fixed ASAP. Chained rows can cause severe degradation to performance if a large number of rows are chained. -Table scans (long tables) a table that is larger than _SMALL_TABLE_THRESHOLD (hidden) with no CACHE clause. The default value of small table threshold is 2% of the buffer cache, or 20 blocks in 9i. Full table scans on short tables are preferred. It is very dangerous to adjust it.

-Table scans (short tables) a table that is shorter than _SMALL_TABLE_THRESHOLD with no CACHE. Tablespace and File I/O Statistics If the I/O is not properly distributed among your files, you may encounter performance bottlenecks during pick activity. As a result, you do not want more than 100 I/Os per second per 10,000RPM disk(even with RAID array). The INIT.ORA parameter that can improve the read time is DB_FILE_MULTIBLOCK_READ_COUNT (number of blocks that can be read during one I/O when a full table scan is performed). Unfortunately, the optimizer might do more full table scans as a result of setting DB_FILE_MULTIBLOCK_READ_COUNT (you do not want this behavior), so you may also need to set the OPTIMIZER_INDEX_COST_ADJ to a number, such as ten, to eliminate this problem and drive the use of indexes. Term Tablespace Reads Av Blks/Rd Writes Definition The number of the tablespace The number of physical reads that were performed on the database to retrieve data The number of blocks per that were read from thedatafile to satisfy an average read The number of writes to the datafile

If one datafile gets most of reads balance data by creating datafile on separate disks or striping data across multiple disks. Also, stay away from RAID4. Additional Memory Statistics Following the I/O stats, the report lists the buffer cache statistics, instance recovery statistics (the number of redo blocks), and the PGA memory statistics. Rollback/UNDO Statistics First, it lists the activity in the rollback segments (writes, shrinks, extents) and the waits encountered. Make sure that number of bytes written per rollback segments is equal (should be gathered for a long time). Rollback segment wraps do not cause significant performance problems, but extends indicate that rollback segment was not big enough. If rollback segment waits exceed 5%, you may need to add more rollback segments. Term RBS No Shrinks Wraps Definition Rollback segment number The number of times the rollback segment eliminated one or more extents because it was large than optimal size The number of times rollback segment wrapped from one extent to another

Following the rollback segment sections, the report lists the undo segment statistics for systemmanager undo. If you use automatic undo management mode, then there will be ten rollback segments and one system rollback segment (segment #0). If you use automatic undo, ten rollback segments will be created in addition to the single system segment. Latch Statistics

Latches are low level queue mechanisms (mutual exclusion mechanisms) used to protect shared memory structures in the SGA. Latches are like locks memory that are very quickly obtained and released. Latches are used to prevent concurrent access to a shared memory structure. If the latch is unavailable, then a latch free miss is recorded. Most of latches are

Вам также может понравиться