MySQL Performance Optimization
and Troubleshooting with PMM
Peter Zaitsev, CEO, Percona
Percona Technical Webinars
9 May 2018

A Few Words about Percona Monitoring and Management (PMM)

100% free, open source database troubleshooting and performance optimization platform for MySQL and MongoDB

Based on industry-leading technology

Roll your own, in and out of the cloud

Exploring Percona Monitoring and Management

You should be able to install PMM in 15 minutes or less

Would you like to follow along in the demo?

• http://bit.ly/InstallPMM
• https://pmmdemo.percona.com

In the Presentation

A practical approach to dealing with some of the common MySQL issues

PMM is not just for MySQL

Supports MongoDB as well

Other databases can be added via External Exporters

This presentation is MySQL-focused

Assumptions

You’re looking to have your MySQL queries run faster

You want to troubleshoot a sudden MySQL performance problem

You want to find a way to run more efficiently (use fewer resources)

How to Look at MySQL Performance

Query-Based Approach

• All the users (developers) care about is how quickly their queries perform

Resource-Based Approach

• Queries use resources. Slow performance is often caused by resource constraints

Primary Resources

CPU

Disk IO

Memory

Network

Low Resource Usage + Poor Performance

Contention

• Table Locks/Row Level Locks
• Locking/Latching in MySQL and Kernel

Mixed Resource Usage

• Single worker spending 33% on CPU
• 33% Waiting on Disk
• 33% on Network
• Will not be seen as directly constrained by any resource

Load Average

• What can you tell me about server load?

Problems with Load Average

Mixes CPU and IO resource usage (on Linux)

Is not normalized for the number of CPU cores available

Does not take into account the queue depth needed for optimal storage performance
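The missing normalization can be approximated by dividing the load average by the core count. This is a minimal sketch, assuming Python on a Unix-like host; `normalized_load` is a hypothetical helper name, and on Linux the result still mixes CPU and IO waits.

```python
import os


def normalized_load(load_average: float, cpu_cores: int) -> float:
    """Return the load average divided by the number of CPU cores."""
    if cpu_cores <= 0:
        raise ValueError("cpu_cores must be positive")
    return load_average / cpu_cores


if __name__ == "__main__":
    one_minute, _, _ = os.getloadavg()  # available on Unix-like systems only
    cores = os.cpu_count() or 1         # fall back to 1 if the count is unknown
    print(f"1-minute load per core: {normalized_load(one_minute, cores):.2f}")
```

A load average of 8 on a 4-core host gives 2.0 per core, while the same figure on a 32-core host gives 0.25, which is why the raw number is hard to compare across servers.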

CPU Usage

• Can observe overall or per core
• Matching Load Average in the previous screen

Saturation Metrics

• Good to understand where waits are happening
• IO Load is not normalized

Looking at CPU Saturation Separately

• Can normalize CPU Saturation based on number of threads

Row Locks – Logical Contention

• Row Locks are often declared by transaction semantics
• But more transactions underway also mean more locks

Zooming in on Row Lock Wait Load

• How many MySQL connections are blocked because of row-level lock waits

“Load at MySQL Side”

• “threads_running” - MySQL is busy handling a query
• CPU? Disk? Row-level locks? Need to dig deeper

MySQL Questions – Inflow of Queries

• Are we serving more queries or fewer queries?
• Any spikes or dips?

InnoDB Rows – Actual Work Being Done

• A better number for thinking about system capacity
• Not all rows are created equal, but they are more equal than queries

Commands – What Kind of Operations

• Note that if prepared statements are used, MySQL is “double counting”

MySQL “Handlers”: Low-Level Row Access

• Works for all storage engines
• Gives more details on access type
• Mixes temporary tables and non-temporary tables together

Memory Usage by MySQL

Leave some memory available for OS cache and other needs

InnoDB in Depth

InnoDB Checkpointing

• The log file size is good enough, as uncheckpointed bytes are a fraction of the log file size

InnoDB Checkpointing

• Uncheckpointed bytes are very close to the limit: the InnoDB log file size is too small for optimal performance
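The comparison made on the two checkpointing slides can be sketched as the ratio of uncheckpointed bytes to total redo log capacity. This is an illustration only: the function names and the 0.75 warning threshold are assumptions, not PMM or InnoDB constants, and the default of two log files is taken from the MySQL 5.7 default for innodb_log_files_in_group.

```python
def checkpoint_age_ratio(uncheckpointed_bytes: int,
                         log_file_size: int,
                         log_files_in_group: int = 2) -> float:
    """Fraction of total redo log capacity held by uncheckpointed bytes."""
    capacity = log_file_size * log_files_in_group
    if capacity <= 0:
        raise ValueError("redo log capacity must be positive")
    return uncheckpointed_bytes / capacity


def log_looks_too_small(ratio: float, threshold: float = 0.75) -> bool:
    """Flag a nearly full log; the threshold is an assumed rule of thumb."""
    return ratio >= threshold


if __name__ == "__main__":
    # Hypothetical numbers: two 1 GiB log files, 512 MiB not yet checkpointed.
    ratio = checkpoint_age_ratio(512 * 1024**2, 1024**3)
    print(f"ratio={ratio:.2f} too_small={log_looks_too_small(ratio)}")
```

A low ratio corresponds to the first slide (log size is good enough); a ratio approaching 1 corresponds to the second.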

InnoDB Transaction History - Not Yet Purged Transactions

• Short-term spikes are normal if some longer transactions are run on the system

InnoDB Transaction History

• Growth over a long period of time without long queries in the processlist
• Often identifies orphaned transactions (left open)

Transaction History Recovery

• If the backlog is resolved quickly, that is great
• If not, you may be close to the limit of the purge subsystem

Is Your InnoDB Log Buffer Large Enough?

• You will be surprised to see how little log buffer space InnoDB needs

Another Way to Look at Logging Performance

InnoDB IO

• Will often roughly match disk IO
• Lets you see writes vs. fsyncs

Hot Tables

• It is often helpful to know which tables are getting the most reads
• And writes

Hot Tables through Performance Schema

• Even more details are available in Performance Schema
• Load is a better measure of actual cost than number of events

Most Active Indexes

• See through which index queries access tables

What about Queries Causing the Most Load?

• Can be examined through the Query Analytics application

Latency Details Explored

• It is not enough to look at average latency
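A small worked example shows why the average alone can mislead. This is a sketch with made-up latencies; `percentile` is a hypothetical nearest-rank helper, not a PMM function.

```python
def percentile(values, percent):
    """Nearest-rank percentile for an integer percent between 1 and 100."""
    if not values:
        raise ValueError("values must not be empty")
    if not 1 <= percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    ordered = sorted(values)
    rank = (percent * len(ordered) + 99) // 100  # integer ceil(percent * n / 100)
    return ordered[rank - 1]


# Hypothetical sample: 95 fast queries (10 ms) and 5 slow ones (1000 ms).
latencies_ms = [10] * 95 + [1000] * 5
average_ms = sum(latencies_ms) / len(latencies_ms)

print(average_ms)                    # 59.5
print(percentile(latencies_ms, 50))  # 10
print(percentile(latencies_ms, 99))  # 1000
```

The average of 59.5 ms describes neither the typical query (10 ms) nor the slow tail (1000 ms), which is why the latency distribution matters.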

What Are the Top Queries?

Queries are sorted by their “Load”

A query run 10 times per second, each time taking 0.2 sec, will have a load of 2

No distinction is made between queries “causing” the load and those just impacted by it
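The load figure from the slide can be reproduced with simple arithmetic: executions multiplied by average latency, divided by the observation interval. This is a minimal sketch; `query_load` is a hypothetical name and not how Query Analytics is necessarily implemented.

```python
def query_load(executions: int, avg_latency_sec: float, interval_sec: float) -> float:
    """Average number of concurrently running instances of a query."""
    if interval_sec <= 0:
        raise ValueError("interval_sec must be positive")
    return executions * avg_latency_sec / interval_sec


if __name__ == "__main__":
    # The example from the slide: 10 executions in 1 second, 0.2 sec each.
    print(f"{query_load(10, 0.2, 1.0):.1f}")
```

A query that is merely waiting (for example on row locks) accumulates load in the same way, which is why the ranking cannot tell cause from victim.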

Whole Server Summary #1

• Server Summary gives a good idea of what is going on query-wise

Whole Server Summary #2

Specific Query – Update Query

• A significant part of response time comes from row-level lock waits

Expensive SELECT Query

• Examining lots of rows for each row sent
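The "rows examined per row sent" signal is a plain ratio. A minimal sketch, assuming the two counters come from something like the slow query log's Rows_examined and Rows_sent fields; the helper name is hypothetical.

```python
def rows_examined_per_row_sent(rows_examined: int, rows_sent: int) -> float:
    """Ratio of rows read to rows returned; large values hint at wasted work."""
    if rows_sent == 0:
        # Nothing returned: the ratio is unbounded if any rows were read.
        return float("inf") if rows_examined > 0 else 0.0
    return rows_examined / rows_sent


if __name__ == "__main__":
    # Hypothetical query: reads 100,000 rows to return 10.
    print(rows_examined_per_row_sent(100_000, 10))
```

A high ratio is only a hint: aggregate queries legitimately read many rows to return a few.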

Check Query Example

• An expensive query, not a poorly optimized one

Explain and JSON Explain

Explore Any Captured Metrics

• Standard dashboards are only the tip of the iceberg
• You can also use Prometheus directly
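Querying Prometheus directly goes through its HTTP API, where an instant query is a GET on `/api/v1/query` with a `query` parameter. The sketch below only builds the URL. The base URL is a hypothetical placeholder for wherever your PMM server exposes Prometheus, and the metric name is an assumption about what the MySQL exporter publishes.

```python
from urllib.parse import urlencode


def instant_query_url(prometheus_base_url: str, promql: str) -> str:
    """Build a Prometheus instant-query URL for the given PromQL expression."""
    base = prometheus_base_url.rstrip("/")
    return f"{base}/api/v1/query?{urlencode({'query': promql})}"


if __name__ == "__main__":
    # Hypothetical host and metric name, used for illustration only.
    print(instant_query_url("http://pmm-server.example/prometheus",
                            "mysql_global_status_threads_running"))
```

The resulting URL can be fetched with any HTTP client; the response is JSON.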

Let’s Look at a Couple of Case Studies

Impact of Durability?

Running sysbench with rate=1000 to inject 1000 transactions every second

System can handle the workload with both settings

System previously running with sync_binlog=0 and innodb_flush_log_at_trx_commit=0

Set them to sync_binlog=1 and innodb_flush_log_at_trx_commit=1

IO Bandwidth

• IO Bandwidth is not significantly impacted

IO Saturation Jumps a Lot

Read and Write Latencies Are Impacted

• This SSD (Samsung 960 Pro) does not like fsync() calls

More Disk IO Operations

• Frequent fsync() causes more writes of smaller size to storage

Increase in Disk IO Load

• IO Avg Latency Increase + More IOPS = Load Increase
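The relationship on this slide follows Little's Law: the average number of IO requests in flight equals the request rate multiplied by the average latency. This is a sketch with hypothetical before/after numbers, not values read from the graphs.

```python
def io_load(iops: float, avg_latency_sec: float) -> float:
    """Average number of IO requests in flight (Little's Law: L = rate * time)."""
    if iops < 0 or avg_latency_sec < 0:
        raise ValueError("iops and latency must be non-negative")
    return iops * avg_latency_sec


if __name__ == "__main__":
    # Hypothetical values: both rate and latency go up after enabling durability.
    before = io_load(iops=1000, avg_latency_sec=0.0005)
    after = io_load(iops=4000, avg_latency_sec=0.001)
    print(f"before={before:.2f} after={after:.2f}")
```

Because the two factors multiply, a 4x increase in IOPS combined with a 2x increase in latency gives an 8x increase in load.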

Disk IO Utilization Jumps to 100%

• There is at least one disk IO operation in flight all the time

Average IO Size Is Down

• Large block writes to the binlog and InnoDB transaction logs do not happen any more

Number of Running Threads Impacted

• Need higher concurrency to be able to drive the same number of queries/sec

MySQL Questions

• Why does it increase with the same inflow of transactions?

Because of Deadlocks

• Some transactions have to be retried due to deadlocks
• Your well-designed system should behave the same
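Retrying on deadlock is what makes the Questions counter rise while the transaction inflow stays flat: each retry re-sends the same statements. The pattern can be sketched as below. `DeadlockError` is a hypothetical stand-in for whatever exception your MySQL driver raises for error 1213, and `run_with_retry` is an illustrative name.

```python
class DeadlockError(Exception):
    """Stand-in for a driver-specific deadlock exception (MySQL error 1213)."""


def run_with_retry(transaction, max_attempts=3):
    """Call transaction(); on DeadlockError retry up to max_attempts times.

    Returns a (result, attempts_used) tuple, or re-raises the last
    DeadlockError when every attempt fails.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction(), attempt
        except DeadlockError:
            if attempt == max_attempts:
                raise  # give up and surface the error to the caller
```

A transaction that deadlocks twice and then commits returns on the third attempt, having issued its statements three times.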

Higher Row Lock Time

• Row locks can only be released after a successful transaction commit
• Which now takes longer due to the number of fsync() calls

And Load Caused by Row Locks

Log Buffer Used Even Less with Durability On

Is Group Commit Working?

• Do we rely on Group Commit for our workload?

Top Queries Impacted

• Commit is now the highest load contributor

Changing Buffer Pool Size

MySQL 5.7 Allows Changing the Buffer Pool Size Online

• Changing the buffer pool from 48GB to 4GB online

mysql> SET GLOBAL innodb_buffer_pool_size=4096*1024*1024;
Query OK, 0 rows affected (0.00 sec)
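MySQL 5.7 is documented to adjust a requested buffer pool size to a multiple of innodb_buffer_pool_chunk_size times innodb_buffer_pool_instances. The sketch below estimates the value the server would end up with. The helper name is hypothetical, and the defaults (128MB chunks, 8 instances) are assumptions that should be checked against your server's actual settings.

```python
def aligned_buffer_pool_size(requested_bytes: int,
                             chunk_size_bytes: int = 128 * 1024 * 1024,
                             instances: int = 8) -> int:
    """Round the requested size up to a multiple of chunk size * instances."""
    if requested_bytes <= 0 or chunk_size_bytes <= 0 or instances <= 0:
        raise ValueError("all arguments must be positive")
    unit = chunk_size_bytes * instances
    units_needed = -(-requested_bytes // unit)  # ceiling division
    return units_needed * unit


if __name__ == "__main__":
    # The value from the slide: 4096*1024*1024 bytes (4GB).
    print(aligned_buffer_pool_size(4096 * 1024 * 1024))
```

With these assumed defaults the unit is 1GB, so the 4GB request from the slide is already aligned and would be used unchanged.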

QPS Impact

• While resizing is ongoing, capacity is limited and queueing happens
• After the resize completes, the backlog has to be worked off, resulting in a higher number of queries

Saturation Spikes and Then Stabilizes at a Higher Level

• Guess why the spike comes with a lower QPS level?

Two IO Spikes

• First to flush dirty pages
• Second to work off the higher query rate

What about Disk IO Latency?

• A higher number of IOPS does not always mean much higher latency

Longer Transactions = More Deadlocks

More IO Load, Less Contention?

• Unsure why this is the case
• Note that not ALL contention is shown in those graphs

Now We See the Query Is 80% IO Bound

Summary

You can get a lot of insights into MySQL performance with PMM

Great tool to have when you’re challenged to troubleshoot MySQL

A lot of insights during benchmarking and evaluation

Percona to Support PostgreSQL

Thank You!