Today’s Agenda
• The Honda Civic and the Nissan GT-R
• Metrics, Trace, Mitigations
• A New Secret Weapon
• Resources and Q&A
From http://flickr.com/photos/stevekeys/2755142278/
But The Faster You Want To Go
Windows “Check Engine” Light
Two Approaches to Detection
• Exceptions Monitoring: Check Engine
• Proactive Monitoring: Detailed Gauges
Where Do We Start?
Hardware
Windows
SQL Server
Tables, Indexes
Query
Capture Metrics With Perfmon
• Performance Monitor, aka Perfmon
• Ships with all Windows versions
• Polls any server from your desktop
• Pulls performance metrics
• Writes them to a file
• Requires some OS permissions
• Does not include alerts or analytics
Memory Counters
• Memory – Available MBytes
• Paging File - % Usage
• SQLServer:Buffer Manager –
– Buffer cache hit ratio
– Page life expectancy
• SQLServer:Memory Manager – Memory Grants Pending
Storage Metrics: Physical Disk
• % Disk Time
• Avg. Disk Queue Length
• Avg. Disk sec/Read
• Avg. Disk sec/Write
• Disk Reads/sec
• Disk Writes/sec
CPU Metrics
• Processor - % Processor Time
• System – Processor Queue Length
• SQLServer:General Statistics – User
Connections (not CPU, just “other”)
The Raw Output: CSV Files
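The CSV that Perfmon writes can be summarized with a few lines of script before it goes anywhere near a spreadsheet. The sketch below assumes the usual PDH-CSV layout (timestamp in the first column, one counter path per remaining column, a blank cell where no sample was taken). The server name SQL01 and the sample values are made up for illustration.

```python
import csv
import io
from statistics import mean

# Hypothetical sample in the PDH-CSV layout: column 1 is the timestamp,
# every other column is one counter path. A blank cell (" ") means the
# counter returned no sample for that interval.
SAMPLE = r'''"(PDH-CSV 4.0) (Eastern Standard Time)(300)","\\SQL01\Memory\Available MBytes","\\SQL01\System\Processor Queue Length"
"03/01/2009 10:00:00.000","512","0"
"03/01/2009 10:00:15.000","498","2"
"03/01/2009 10:00:30.000"," ","4"
'''


def summarize(csv_text):
    """Return {counter_path: (average, maximum)} for each counter column."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    columns = {name: [] for name in header[1:]}
    for row in reader:
        for name, cell in zip(header[1:], row[1:]):
            cell = cell.strip()
            if cell:  # skip intervals with no sample
                columns[name].append(float(cell))
    return {name: (mean(vals), max(vals))
            for name, vals in columns.items() if vals}


summary = summarize(SAMPLE)
for name, (avg, peak) in summary.items():
    print(f"{name}: avg={avg:.1f} max={peak:.1f}")
```

Averages hide spikes, so the maximum is reported next to each average.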
That’s a Lot of Zeroes!
What To Look For, In Order
• System – Processor Queue Length
• Memory – Available MBytes
• Lock pages in memory!
Got Everything on One Drive?
• Narrow it down with the DMV sys.dm_io_virtual_file_stats
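One way to read the DMV's output, sketched under assumptions: the column names below (num_of_reads, io_stall_read_ms, num_of_writes, io_stall_write_ms) are the DMV's own, but the rows are invented sample values, and fetching them from a live server is left out. The counters are cumulative, so the result is an average over the whole period they cover, not a current reading.

```python
# Hypothetical rows shaped like the output of sys.dm_io_virtual_file_stats.
rows = [
    {"database_id": 5, "file_id": 1, "num_of_reads": 200000,
     "io_stall_read_ms": 9000000, "num_of_writes": 50000,
     "io_stall_write_ms": 250000},
    {"database_id": 5, "file_id": 2, "num_of_reads": 1000,
     "io_stall_read_ms": 2000, "num_of_writes": 400000,
     "io_stall_write_ms": 1200000},
    {"database_id": 2, "file_id": 1, "num_of_reads": 0,
     "io_stall_read_ms": 0, "num_of_writes": 10000,
     "io_stall_write_ms": 80000},
]


def rank_files_by_stall(stats):
    """Compute average stall per I/O for each file, worst total stall first."""
    ranked = []
    for r in stats:
        reads, writes = r["num_of_reads"], r["num_of_writes"]
        ranked.append({
            "database_id": r["database_id"],
            "file_id": r["file_id"],
            # Guard against files that have not been read or written yet.
            "avg_read_ms": r["io_stall_read_ms"] / reads if reads else 0.0,
            "avg_write_ms": r["io_stall_write_ms"] / writes if writes else 0.0,
            "total_stall_ms": r["io_stall_read_ms"] + r["io_stall_write_ms"],
        })
    ranked.sort(key=lambda f: f["total_stall_ms"], reverse=True)
    return ranked


ranked = rank_files_by_stall(rows)
for f in ranked:
    print(f"db {f['database_id']} file {f['file_id']}: "
          f"read {f['avg_read_ms']:.1f} ms, write {f['avg_write_ms']:.1f} ms")
```

With everything on one drive, the file at the top of this list is the first candidate to move.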
Table Analysis Tools For The Cloud
Capture Queries with a Trace
Columns to Capture
Profiler’s Results: A Trace Table
Casting and Grouping
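The slide's query is not reproduced here, so the following is only a rough sketch of the idea as this handout reads it: cut each statement's text down to a fixed-length prefix so that similar statements group together, then total the cost columns per group. TextData, Duration, Reads and CPU are Profiler column names. The sample rows and the 30-character prefix are assumptions.

```python
from collections import defaultdict

# Hypothetical rows from a trace table.
trace_rows = [
    {"TextData": "SELECT * FROM dbo.Orders WHERE CustomerID = 42",
     "Duration": 1200, "Reads": 5000, "CPU": 300},
    {"TextData": "SELECT * FROM dbo.Orders WHERE CustomerID = 97",
     "Duration": 1500, "Reads": 7000, "CPU": 350},
    {"TextData": "UPDATE dbo.Stock SET Qty = Qty - 1 WHERE ItemID = 7",
     "Duration": 40, "Reads": 12, "CPU": 5},
]


def group_by_prefix(rows, prefix_len=30):
    """Total the cost columns per statement prefix, heaviest readers first."""
    totals = defaultdict(
        lambda: {"executions": 0, "Duration": 0, "Reads": 0, "CPU": 0})
    for row in rows:
        key = row["TextData"][:prefix_len]  # stands in for casting to a short string
        group = totals[key]
        group["executions"] += 1
        for col in ("Duration", "Reads", "CPU"):
            group[col] += row[col]
    return sorted(totals.items(), key=lambda kv: kv[1]["Reads"], reverse=True)


grouped = group_by_prefix(trace_rows)
for prefix, t in grouped:
    print(f"{prefix!r}: {t['executions']} runs, {t['Reads']} reads")
```

A short prefix merges statements that differ only in their parameter values. One that is too short will also merge unrelated queries, so the length is worth tuning.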
Another Way: Perf Dashboard
Sample Problem #1
• Metrics tell us:
– Very high disk queue
lengths on data drive
• Trace tells us:
– Report queries doing
table scans w/o indexes
– Many scheduled reports
run simultaneously
Sample Problem #2
• Metrics tell us:
– Page file drive queue lengths
average >20
– Page file use averages >1%
– Available memory averages <200 MB
– Buffer cache hit ratio and
page life expectancy are high
• Trace tells us:
– No unusual queries
Memory Configuration
Server: 4 GB RAM
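The warning signs in this sample problem can be written down as a simple check against averaged Perfmon values. This is a sketch only: the thresholds are the ones quoted in this example, not universal rules, and the dictionary keys are hypothetical names rather than Perfmon counter paths.

```python
def paging_warning_signs(avg):
    """Return the warning signs from this sample problem found in the averages."""
    signs = []
    if avg["page_file_drive_queue_length"] > 20:
        signs.append("page file drive queue length averages above 20")
    if avg["paging_file_pct_usage"] > 1:
        signs.append("page file usage averages above 1%")
    if avg["available_mbytes"] < 200:
        signs.append("available memory averages below 200 MB")
    return signs


# Hypothetical averages for a server under memory pressure and a healthy one.
observed = {"page_file_drive_queue_length": 24.0,
            "paging_file_pct_usage": 3.5,
            "available_mbytes": 150.0}
healthy = {"page_file_drive_queue_length": 0.5,
           "paging_file_pct_usage": 0.2,
           "available_mbytes": 1800.0}

signs = paging_warning_signs(observed)
healthy_signs = paging_warning_signs(healthy)
for s in signs:
    print("WARNING:", s)
```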
Ways We Can Mitigate It
• Add memory and enable AWE/PAE
• Add memory and upgrade to 64-bit
• Reduce SQL’s min/max memory sizes
• Move the app to its own server
Sample Problem #3
• Metrics look OK, but every 15 minutes:
– Long drive queues on the
log file drive
– Page life expectancy
drops near zero
– Network traffic jumps
• Trace tells us:
– Transaction log backups
are running
Ways We Can Mitigate It
• Stop doing log backups
• Put the databases in the simple recovery model
• Add drives to the transaction log array
• Throttle the transaction log backups
Sample Problem #4
• Metrics tell us:
– CPU average is high
– Disk, memory look OK
• Trace tells us:
– Queries are using
cursors
– Operating on
individual records, not
sets
How We Can Mitigate It
• Change cursor to set-based query
• Buy really fast processors
• Spend a lot on licensing
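The first option can be illustrated with a small contrast between row-by-row and set-based work. This sketch uses Python's built-in sqlite3 as a stand-in for SQL Server, and a Python loop as a stand-in for a T-SQL cursor. The order_lines table and its columns are hypothetical.

```python
import sqlite3


def make_db(rows=1000):
    """Build an in-memory table with one row per order line."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE order_lines (id INTEGER PRIMARY KEY, "
                 "qty INTEGER, unit_price INTEGER, line_total INTEGER)")
    conn.executemany(
        "INSERT INTO order_lines (id, qty, unit_price) VALUES (?, ?, ?)",
        [(i, i % 7 + 1, 100 + i) for i in range(1, rows + 1)])
    return conn


def row_by_row(conn):
    """Cursor style: fetch every row, then issue one UPDATE per row."""
    statements = 0
    lines = conn.execute(
        "SELECT id, qty, unit_price FROM order_lines").fetchall()
    for line_id, qty, unit_price in lines:
        conn.execute("UPDATE order_lines SET line_total = ? WHERE id = ?",
                     (qty * unit_price, line_id))
        statements += 1
    return statements


def set_based(conn):
    """Set style: one UPDATE that lets the engine process every row."""
    conn.execute("UPDATE order_lines SET line_total = qty * unit_price")
    return 1


QUERY = "SELECT id, line_total FROM order_lines ORDER BY id"
cursor_db, set_db = make_db(), make_db()
cursor_statements = row_by_row(cursor_db)
set_statements = set_based(set_db)
same_result = (cursor_db.execute(QUERY).fetchall()
               == set_db.execute(QUERY).fetchall())
print(cursor_statements, "statements vs", set_statements)
```

Both versions leave the table in the same state. The difference is how many statements the engine has to execute to get there.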
Wrapping Things Up
• Double-check the event log first
• Don’t get overwhelmed: focus with the Metric – Trace – Mitigation process
• Show a clear cause and effect
• Use cloud-based BI to get an edge
Resources On The Web
• My posts about Perfmon and analytics:
www.BrentOzar.com/perfmon
www.BrentOzar.com/perfmoncloud