MySQL on VMware vSAN 6.7 All-Flash
First Published: 05-10-2018
Last Updated: 05-17-2018
Table of Contents
1. Executive Summary
1.1 Business Case
1.2 Solution Overview
1.3 Key Results
2. Introduction
2.1 Purpose
2.2 Scope
2.3 Audience
3. Technology Overview
3.1 VMware vSphere 6.7
3.2 VMware vSAN 6.7
3.3 MySQL 5.7.21
3.4 Quanta QuantaPlex T41S-2U
4. Solution Configuration
4.1 Solution Architecture
4.2 Hardware Resources
4.3 Software Resources
4.4 Network Configuration
4.5 MySQL Virtual Machine Configuration
4.6 MySQL Single Instance Deployment
4.7 Group Replication Deployment
5. Solution Validation
5.1 Test Overview
5.2 Testing Tools
5.3 MySQL on vSAN Scalability Test
5.4 vSAN SPBM Based Test
5.5 Group Replication Test
5.6 vSAN Resiliency Test
6. Best Practices
6.1 vSAN All-Flash Best Practices
6.2 MySQL Parameters Best Practices
7. Conclusion
7.1 Conclusion
8. Reference
8.1 Reference
9. About the Author and Contributors
9.1 About the Author and Contributors
1. Executive Summary
This section covers the business case, solution overview, solution benefits, and
key results of the MySQL on VMware vSAN 6.7 solution.
1.1 Business Case
The term “LAMP”, originally popularized from the phrase “Linux, Apache,
MySQL and PHP”, refers to a generic software stack model that has
dramatically accelerated the wide adoption of open source technologies.
MySQL is the world’s most popular open source database management system
and a key part of the LAMP stack. MySQL offers deployment flexibility, on-
demand scalability, high performance, and cost-efficient open source
database solutions in a customized LAMP stack or a Turnkey Linux
environment.
1.2 Solution Overview
1.3 Key Results
2. Introduction
This section provides the purpose, scope, and intended audience of this
document.
2.1 Purpose
2.2 Scope
2.3 Audience
The SME administrator most often encounters MySQL in the form of turn-key
LAMP stacks, which are a convenient way for administrators in a small-to-
medium enterprise environment to deploy an application stack. Because of how
these LAMP stacks are deployed, such environments often pair one MySQL
instance with one application.
The large enterprise administrator likely has a larger pool of resources to
accommodate a farm of MySQL servers. These environments have a more
traditional one-to-many arrangement.
3. Technology Overview
This section provides an overview of the technologies used in this reference
architecture:
• VMware vSphere 6.7
• VMware vSAN 6.7
• MySQL 5.7.21
• Quanta QuantaPlex T41S-2U
3.1 VMware vSphere 6.7
With vSphere 6.7, customers can run, manage, connect, and secure their
applications in a common operating environment, across clouds and devices.
3.2 VMware vSAN 6.7
VMware vSAN, the market leader in HCI, enables low-cost, high-performance,
next-generation HCI solutions. vSAN converges traditional IT infrastructure
silos onto industry-standard servers and virtualizes physical infrastructure
to help customers easily evolve their infrastructure without risk, improve
TCO over traditional resource silos, and scale to tomorrow with support for
new hardware, applications, and cloud strategies. The natively integrated
VMware infrastructure combines radically simple VMware vSAN storage, the
market-leading VMware vSphere hypervisor, and the VMware vCenter Server
unified management solution on the broadest and deepest set of HCI
deployment options.
3.3 MySQL 5.7.21
MySQL is the most popular open source database system, enabling the
cost-effective delivery of reliable, high-performance, and scalable web-based
and embedded database applications. It is an integrated, transaction-safe,
ACID-compliant database with full commit, rollback, crash recovery, and row-
level locking capabilities. MySQL delivers ease of use, scalability, and high
performance, as well as a full suite of database drivers and visual tools to help
developers and DBAs build and manage their business-critical MySQL
applications.
InnoDB Engine
InnoDB is the default storage engine in the MySQL 5.7 release, offering a
balance between high reliability and high performance. The InnoDB engine has
the following advantages:
Group Replication
3.4 Quanta QuantaPlex T41S-2U
With shared infrastructure such as cooling and power supplies, and a dense
design that reduces space consumption, the QuantaPlex T41S-2U delivers
space and energy efficiency, making it an ideal choice for building
solution-based appliances at low cost.
Powered by the Intel Xeon E5-2600 v3/v4 product family, with up to 18-core
CPUs and up to 1,024 GB of DDR4 memory, the QuantaPlex T41S-2U improves
productivity by offering superior system performance. In addition, each node
supports up to six 2.5-inch hot-swap drives, with HDDs up to 2 TB and SSDs up
to 3.84 TB; four nodes can hold up to 76.8 TB of storage capacity using six
drives per node. The QuantaPlex T41S-2U is therefore an optimal option for
balanced compute and storage demands.
For more details about Quanta Cloud Technology and the Quanta hardware
platform, refer to the Quanta Cloud Technology website.
4. Solution Configuration
This section introduces the resources and configurations for the solution
including an architecture diagram, hardware and software resources and other
relevant configurations.
4.1 Solution Architecture
In this solution, we created a 4-node vSphere and vSAN cluster for
single-instance MySQL server deployment. We also configured group replication
across the MySQL instances and set the running mode to single-primary to
protect against MySQL software failure.
For the single instance deployment, we deployed one virtual machine per ESXi
host and validated a MySQL virtual machine count of up to four. Users who need
different MySQL backends for separate business purposes can scale the MySQL
workloads on demand and use vSAN to offer fault tolerance and guarantee
business continuity. In addition, users can use vSAN SPBM for space
efficiency, workload isolation, or IOPS limitation on a per-virtual-disk
basis.
For the group replication deployment, we enabled the group replication plugin
on each MySQL instance virtual machine and scaled up to five instances. This
allows users to protect critical MySQL applications from application-level
failures. We evenly distributed the virtual machines participating in group
replication to fully leverage the cluster resources.
Figure 1. Solution Architecture
4.2 Hardware Resources
Each ESXi server in the vSAN cluster has the following configuration:
PROPERTY    SPECIFICATION
RAM         256 GB
Network     1 x Quanta ON 82599ES dual-port 10GbE mezzanine adapter card
4.3 Software Resources
4.4 Network Configuration
The vSphere Distributed Switch used two 10GbE adapters for the teaming and
failover. A port group defines properties regarding security, traffic shaping,
and NIC teaming. To isolate vSAN, VM (node) and vMotion traffic, we used the
default port group settings except for the uplink failover order. We assigned
one dedicated NIC as the active link and assigned another NIC as the standby
link. For vSAN and vMotion, the uplink order is reversed. See Table 3 for
network configuration.
DISTRIBUTED PORT GROUP    ACTIVE UPLINK    STANDBY UPLINK
VMware vSAN               Uplink2          Uplink1
VM and vSphere vMotion    Uplink1          Uplink2
4.5 MySQL Virtual Machine Configuration

MYSQL VM ROLE             vCPU   MEMORY (GB)   VM COUNT   VIRTUAL DISKS        SCSI TYPE            SCSI ID (CONTROLLER, LUN)
MySQL VM for SysBench     16     64            4          OS disk: 40 GB       LSI Logic            SCSI (0, 0)
complex OLTP benchmark                                    MySQL disk: 800 GB   VMware Paravirtual   SCSI (1, 0)
MySQL VM for group        16     64            Up to 5    OS disk: 40 GB       LSI Logic            SCSI (0, 0)
replication test                                          MySQL disk: 800 GB   VMware Paravirtual   SCSI (1, 0)
4.6 MySQL Single Instance Deployment
The test database was populated by SysBench with 250 million records, about
540 GB in total. Table 5 shows the MySQL single instance deployment profile
used in the solution validation section that follows.
Table 5. MySQL Single Instance Deployment Profile

ITEM                  TEST CONFIGURATION
MySQL DB engine       InnoDB
MySQL database size   540 GB, 250 million records
MySQL parameters      Buffer pool and log settings:
                      innodb_buffer_pool_size = 50G
                      innodb_buffer_pool_instances = 8
                      innodb_log_file_size = 512M
                      innodb_log_files_in_group = 2
                      innodb_log_buffer_size = 8M
                      innodb_flush_log_at_trx_commit = 2
                      Concurrency settings:
                      innodb_read_io_threads = 16
                      innodb_write_io_threads = 16
                      innodb_thread_concurrency = 0
                      innodb_flush_neighbors = 0
                      innodb_flush_method = O_DIRECT
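Collected into one configuration file, the parameters above correspond to a my.cnf fragment along the following lines. This is a sketch assembled from the table; the section grouping is standard MySQL convention, and the values should be adjusted to the memory and disk sizing of the target virtual machine:

```ini
[mysqld]
# Buffer pool and log settings used in this solution
innodb_buffer_pool_size        = 50G
innodb_buffer_pool_instances   = 8
innodb_log_file_size           = 512M
innodb_log_files_in_group      = 2
innodb_log_buffer_size         = 8M
innodb_flush_log_at_trx_commit = 2

# Concurrency settings
innodb_read_io_threads         = 16
innodb_write_io_threads        = 16
innodb_thread_concurrency      = 0
innodb_flush_neighbors         = 0
innodb_flush_method            = O_DIRECT
```

The MySQL service must be restarted for these settings to take effect, as described in the test procedures below.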
4.7 Group Replication Deployment
The following my.cnf settings enable group replication on each MySQL
instance:
transaction_write_set_extraction=XXHASH64
# group replication group name should be unique
loose-group_replication_group_name="1bb1b861-f776-11e6-be42-782bcb377188"
loose-group_replication_start_on_boot=off
loose-group_replication_local_address= "IP address1:port1"
loose-group_replication_group_seeds= "IP address1:port1,IP address2:port2, … ,IP addressN:portN"
loose-group_replication_bootstrap_group= off
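With these settings in place, the group is brought up from the MySQL client. The sketch below shows the standard MySQL 5.7 group replication statements; it assumes the recovery channel credentials have already been configured on each instance, which is a prerequisite not shown here:

```sql
-- Install the plugin once per instance (if it was not loaded via my.cnf)
INSTALL PLUGIN group_replication SONAME 'group_replication.so';

-- On the first (bootstrap) member only:
SET GLOBAL group_replication_bootstrap_group = ON;
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group = OFF;

-- On every additional member:
START GROUP_REPLICATION;

-- Verify that all members report ONLINE:
SELECT * FROM performance_schema.replication_group_members;
```

Bootstrapping must be done on exactly one member; the remaining instances join using the seed addresses configured above.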
5. Solution Validation
In this section, we present the test methodologies and results used to validate
this solution.
5.1 Test Overview
The solution validates both the performance and the protection of running
MySQL on vSAN, and it includes the following test scenarios:
This test validates the following SPBM features included in the vSAN storage
policy:
This test validates the following failure scenarios and how vSAN helps
guarantee MySQL workload continuity:
5.2 Testing Tools
We used the following monitoring tools and benchmark tools in this solution
testing:
Monitoring tools
vRealize Operations
vSphere and vSAN 6.7 include vRealize Operations within vCenter. This new
feature allows vSphere customers to see a subset of the intelligence offered
by vRealize Operations through a single vCenter user interface. Lightweight,
purpose-built dashboards are included for both vSphere and vSAN. It is easy
to deploy, provides multi-cluster visibility, and does not require any
additional licensing.
esxtop
esxtop is a command-line tool that collects data and provides real-time
information about resource usage in a vSphere environment, such as CPU, disk,
memory, and network usage. We measured ESXi server performance with this
tool.
Linux iostat
The Linux iostat utility reports CPU utilization and device I/O statistics. We
used it to monitor disk throughput and latency inside the MySQL virtual
machines.
SysBench Benchmark
5.3 MySQL on vSAN Scalability Test
Test Objective
Test Scenario
• Baseline test: scale with different SysBench test thread counts within one
MySQL virtual machine
• Scale-out test: scale with different MySQL virtual machine counts within
one vSAN cluster
Test Procedures
1. Update the MySQL “my.cnf” configuration file and restart the MySQL
service. Start the SysBench workload and run the warm-up SysBench test
for one hour.
2. Set the thread count to 1; run the SysBench OLTP complex read/write test
for 30 minutes; collect and measure the performance result including:
Transaction per second (TPS)
Transaction response time: the latency monitored from the application
level
vSAN backend IOPS from vSAN Performance Services
vSAN backend response time from vSAN Performance Services
3. Set the thread count to 2, 4, 8, 16, 32 respectively, and repeat the test in
step 2.
4. Configure the second MySQL VM with the same “my.cnf” configuration
file, restart the MySQL service and run the warm-up SysBench test for one
hour.
5. Set the thread count to 1 on both MySQL instances, run SysBench OLTP
complex read/write test simultaneously for 30 minutes, collect and
measure the performance result as described in step 2.
6. Set the thread count to 2, 4, 8, 16, 32 respectively, repeat the test in step
5.
7. Configure the third and fourth MySQL VM with the same “my.cnf”
configuration file, restart the MySQL Service and run the warm-up
SysBench test for one hour.
8. Set the thread count to 1 on each MySQL instance, run the SysBench
OLTP complex read/write test simultaneously for 30 minutes, collect and
measure the performance result as described in step 2.
9. Set the thread count to 2, 4, 8, 16, 32 respectively, and repeat the test in
step 8.
Figure 2 shows the MySQL scalability performance in the baseline test using
various test threads for running the SysBench mixed OLTP workloads on vSAN.
We started at an average of 362 TPS with an average transaction response
time of 2.75 milliseconds, and throughput grew more than sixfold to 2,368 TPS
with the average transaction response time kept within 13.5 milliseconds, as
shown in Figure 3. Overall, performance scaled close to linearly from 1 to 32
threads.
Figure 4 and Figure 5 show the MySQL scalability performance in the scale-out
test with two MySQL virtual machines. The workloads were identical across the
virtual machines, which were spread over different hosts and run at the same
time. Compared to the single-VM result, the TPS achieved in this test almost
doubled at every thread count from 1 to 32, with the maximum again achieved
at 32 threads: 4,038 TPS, with the transaction response time kept at only
about 15 milliseconds.
Figure 4. SysBench Scale-out Test Result with 2VMs – Transaction per Second
Then we increased the VM count to four; the results are shown in Figure 6 and
Figure 7. Compared to the 2-VM result, the average TPS increased over 65%.
With 32 threads, TPS increased from 4,038 to 6,609, while the transaction
response time only increased to around 19 milliseconds.
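As a rough sanity check, the scaling behavior implied by the reported 32-thread numbers (2,368 TPS for one VM, 4,038 for two, and 6,609 for four) can be computed as follows. This is an illustrative calculation on the paper's reported figures, not part of the original test harness:

```python
# Aggregate TPS reported at 32 SysBench threads for 1, 2, and 4 MySQL VMs.
reported_tps = {1: 2368, 2: 4038, 4: 6609}

def scaling_efficiency(tps_by_vm_count):
    """Return {vm_count: (speedup, per-VM efficiency)} relative to the 1-VM result."""
    base = tps_by_vm_count[1]
    results = {}
    for n, tps in sorted(tps_by_vm_count.items()):
        speedup = tps / base
        results[n] = (speedup, speedup / n)
    return results

for n, (speedup, eff) in scaling_efficiency(reported_tps).items():
    print(f"{n} VM(s): speedup {speedup:.2f}x, per-VM efficiency {eff:.0%}")
```

The per-VM efficiency declines gradually as VMs are added, which is consistent with the near-linear but not perfectly linear scaling described above.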
MySQL VM count                                 4
Average TPS (maximum)                          6,609
Average transaction response time (ms)         19.37
ESXi processor time (%)                        34.05/35.23/31.95/36.29
Aggregated vSAN backend IOPS                   65,407
vSAN backend response time (ms), read/write    0.791/3.455
5.4 vSAN SPBM Based Test
Test Objective
This test is designed to showcase the vSAN SPBM offering for MySQL
performance and capacity considerations.
Test Scenario
Test Procedures
1. Create a vSAN storage policy with checksum disabled. Apply the new
storage policy to the MySQL data disk on each of the MySQL virtual
machines. Run the SysBench complex OLTP test with 32 threads and
compare the test results with the baseline to evaluate the performance
impact.
2. Create a vSAN storage policy with the stripe width set to 2, 4, 6, 8, and
10, respectively, based on the vSAN default storage policy. Apply the new
storage policy to the MySQL data disk on each of the MySQL virtual
machines. Run the SysBench complex OLTP test with 32 threads and compare
the test results with the baseline to evaluate the performance impact.
3. Enable deduplication and compression on the vSAN cluster. Run SysBench
complex OLTP test with 32 threads and compare the test result with the
baseline to evaluate the performance impact. Measure the space savings
before and after enabling deduplication and compression.
4. Based on step 3, create a vSAN storage policy with erasure coding
enabled. Apply the new storage policy to the MySQL data disk on each of
the MySQL virtual machines. Run the SysBench complex OLTP test with 32
threads and compare the test results with the baseline to evaluate the
performance impact. Measure the further space savings before and after
enabling erasure coding.
Test Results
The checksum feature is enabled by default in vSAN SPBM, which ensures data
integrity at the vSAN component and object level. In the checksum test, we
disabled the checksum in the default vSAN storage policy, ran the SysBench
complex OLTP test with 32 threads, and compared the test results.
In the stripe width test, because the system contains 20 capacity SSDs in
total, the maximum stripe width could be set up to 10 with the vSAN default
tolerance setting. Figure 10 shows the stripe width test results: the
aggregated TPS of the four MySQL instances reached 7,040 when we configured
the stripe width to 8, and the application response time reached its minimum
of 18 milliseconds. Compared to the baseline result, the performance
improvement was 6.5%.
As the stripe width increases, the MySQL virtual disk objects are more evenly
distributed across the capacity disks in a vSAN cluster. Distributing objects
across a larger number of capacity disks increases the rate at which read
requests can be executed, improving the performance of the MySQL servers.
However, it is hard to determine the stripe width's impact on write
performance, since write operations are served by the cache tier on all-flash
vSAN.
vSAN SPBM erasure coding settings can further improve MySQL database space
efficiency on vSAN. The default protection is RAID 1 mirroring, and users can
configure erasure coding in the vSAN storage policy to enable RAID 5 or RAID
6, which require at least 4 and 6 hosts in a cluster, respectively. In our
test environment, RAID 5 was implemented since the cluster has four hosts.
In the baseline test, we tested with four MySQL single instances. In the
SysBench 32-thread test, we achieved 6,609 TPS and kept the average
transaction response time at about 19 milliseconds. The total environment
space consumption was 6.18 TB with the default fault tolerance level.
Then we enabled erasure coding with RAID 5 on the MySQL data disks and
observed further space savings of 45%, as shown in Figure 12; the final space
consumption was 3.41 TB. Because our workload has a 75/25 read/write ratio
and RAID 5 does not help heavy write operations, we observed a higher average
transaction response time, which reached 29 milliseconds, and the aggregated
TPS was 4,382, roughly a 30% performance downgrade.
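The reported savings can be checked arithmetically. The small helper below is an illustrative calculation on the paper's reported capacity numbers, reproducing the roughly 45% figure:

```python
# Space consumption reported before and after enabling RAID 5 erasure coding.
before_tb = 6.18  # RAID 1 mirroring (FTT=1), deduplication and compression enabled
after_tb = 3.41   # RAID 5 erasure coding added

def space_savings(before, after):
    """Fraction of capacity saved by moving from `before` to `after`."""
    return 1 - after / before

print(f"Space savings: {space_savings(before_tb, after_tb):.0%}")
```

The computed value rounds to 45%, matching the savings reported in Figure 12.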
5.5 Group Replication Test
Test Objective
Test Scenario
Test Procedures
4. In the 5-member group replication cluster, run the baseline workload against
the primary instance for a certain period. Shut down the MySQL service on one
of the replication instances, collect TPS results every second and measure
the workload continuity.
5. In the 5-member group replication cluster, run the baseline workload against
the primary instance for a certain period. Shut down the MySQL service on the
primary instance and measure the time for the workload to recover on the new
primary instance.
Test Results
Figure 14 shows the group replication test result. In the baseline, we achieved
1,300 TPS and 6.14 milliseconds transaction response time on a single MySQL
instance.
Figure 15 shows the sustained TPS on the primary instance during a replication
instance failure in a 5-member group replication cluster. The failure point
occurred in the middle of the test. A slight TPS increase can be expected
after the failure, and indeed the average TPS increased from 989 to 1,058.
The reason is that the group lost one member while the original workload was
unchanged; with less replication work to sync among the group, the primary
instance could sustain a slightly higher throughput.
5.6 vSAN Resiliency Test
Test Objective
This test is designed to showcase how vSAN can help MySQL workloads tolerate
hardware failures, thus guaranteeing business continuity.
Test Scenario
• Disk failure: Evaluate how vSAN handles a disk failure to ensure the
sustainability of MySQL workloads.
• Disk group failure: Evaluate how vSAN handles a disk group failure to
ensure the sustainability of MySQL workloads.
• Host failure: Evaluate how vSAN handles a host outage to ensure the
sustainability of MySQL workloads.
Test Procedures
1. Run the SysBench complex OLTP test with the thread count set to 8
against one MySQL server instance. Monitor the workload and, once it
enters a steady state, manually inject a disk error on a capacity SSD that
does not host the MySQL test data. Collect and measure the performance
before and after the disk failure.
2. Repeat the test in step 1, instead inject a disk error on a capacity SSD that
hosts the MySQL test data. Collect and measure the performance before
and after the disk failure.
3. Repeat the test in step 1, instead inject a disk error on a cache SSD, which
will cause an entire disk group failure. Collect and measure the
performance before and after the disk group failure.
4. Repeat the test in step 1, instead force shutdown a host in the vSAN
cluster. Collect and measure the performance before and after the host
outage.
Test Results
Disk failure: Evaluate how vSAN handles a disk failure to ensure sustainability
of MySQL workloads
Figure 17 and Figure 18 show the failure test results for a capacity disk
failure, both with and without hosting the MySQL test data. In the first
situation, the sustained performance dropped about 10%, from an average of
1,443 TPS to an average of 1,303, and remained steady after the failure. In
the second situation, we observed close to zero performance impact; sustained
TPS was 1,483 and 1,456 before and after the failure. With the FTT setting at
1, vSAN by default maintains two copies of each object, so it can tolerate a
single disk failure and steadily sustain MySQL production workloads.
Disk group failure: Evaluate how vSAN handles a disk group failure to ensure
sustainability of MySQL workloads
Figure 19 shows the disk group failure situation. After we manually introduced
an error on a cache SSD, the vSAN disk group failed and stopped servicing I/O
requests at once. The partially missing vSAN objects were therefore put into
the “degraded” state, which automatically triggered an object rebuild on
other disk groups. The rebuild bandwidth may cause a potential performance
impact on the MySQL production workload.
In this case, the sustained performance started from average 1,452, and after
the failure happened, it dropped to about 1,000 TPS. Then as the rebuild
process was initiated, the TPS curve fluctuated for a short while, and finally
reached stability at an average of 1,295 TPS.
Host failure: Evaluate how vSAN handles a host outage to ensure sustainability
of MySQL workloads
Figure 20 shows a simulated host outage. Before the failure, the sustained
TPS was 1,467 on average, and when we manually shut down a host in the vSAN
cluster, there was no obvious performance drop. This is because the host
outage did not trigger a vSAN object rebuild instantly; instead, the affected
objects were marked absent while vSAN waited for the host to return.
In this test case, the performance was about the same before and after the
host outage. Note that the rebuild process will be triggered if the host
outage is not recovered within 60 minutes by default. However, if the host
comes back shortly, vSAN will still automatically trigger a rebuild if there
is data inconsistency between the objects.
6. Best Practices
This section provides the recommended best practices for this solution.
6.1 vSAN All-Flash Best Practices
Refer to Performance Best Practices for VMware vSphere 6.5 for general
guidance on vSphere configurations.
6.2 MySQL Parameters Best Practices
• Plan your MySQL data directory and InnoDB tablespace home directory with
the datadir and innodb_data_home_dir parameters, respectively.
• Set innodb_file_per_table to ON so that InnoDB stores the data and indexes
of each table in a separate file instead of in the system tablespace.
• For write-intensive MySQL workloads, set innodb_adaptive_flushing to ON to
avoid sudden dips in throughput such as sharp checkpoints.
• Increase the default InnoDB buffer pool with the innodb_buffer_pool_size
parameter for performance. The general guidance is up to 70-80% of server
memory, reserving the remaining memory for OS overhead.
• Make sure each InnoDB buffer pool instance is at least 1 GB to improve
concurrency, by setting the innodb_buffer_pool_instances parameter. If the
buffer pool is less than 1 GB, set it to 1 instead.
• Set innodb_log_files_in_group to the default and recommended value (2).
Pre-allocate the InnoDB log files by increasing the innodb_log_file_size
parameter.
• For MySQL workloads with large transactions, such as updating, inserting,
or deleting many rows, increase innodb_log_buffer_size to avoid frequent log
writes to disk.
• Carefully configure the innodb_flush_log_at_trx_commit parameter. For full
data integrity and transactional ACID requirements, keep the default value
(1); otherwise you may set it to 2 for better performance.
• For all-flash vSAN, setting innodb_flush_method to “O_DIRECT” may provide
better performance than the default value “fsync”. Set innodb_flush_neighbors
to 0, as flushing neighboring pages provides no benefit on flash storage.
• For application-level high availability, you may start a MySQL group
replication deployment with three instances and add more members as the
required protection level grows.
For more details, please refer to InnoDB Startup Options and System Variables.
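The buffer pool guidance above can be sketched as a small sizing helper. This is an illustrative calculation, not a VMware or MySQL tool; the 75% target ratio is an assumption within the 70-80% range stated above, and the default of 8 instances mirrors the configuration used in this solution:

```python
def innodb_buffer_pool_settings(server_memory_gb, target_ratio=0.75):
    """Suggest innodb_buffer_pool_size (GB) and innodb_buffer_pool_instances.

    Follows the guidance above: size the pool at ~70-80% of server memory,
    keep each instance at least 1 GB, and use 1 instance for pools under 1 GB.
    """
    pool_gb = int(server_memory_gb * target_ratio)
    if pool_gb < 1:
        return pool_gb, 1
    # Cap at 8 instances (as used in this solution) while keeping each >= 1 GB.
    instances = min(8, pool_gb)
    return pool_gb, instances

# Example: the 64 GB MySQL VMs in this solution used a 50 GB pool with 8
# instances; the helper yields a comparable sizing.
size, instances = innodb_buffer_pool_settings(64)
print(f"innodb_buffer_pool_size = {size}G, innodb_buffer_pool_instances = {instances}")
```

For the 64 GB virtual machines used in this solution, the helper suggests a 48 GB pool with 8 instances, close to the 50 GB / 8-instance configuration in Table 5.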
7. Conclusion
This section provides a summary of this reference architecture.
7.1 Conclusion
We also validated the vSAN SPBM offering with data services for MySQL
databases. With vSAN all-flash deduplication and compression and erasure
coding, users can choose to improve space efficiency with minimized
performance impact. The stripe width setting allows users to potentially
improve performance by distributing MySQL virtual disk objects more evenly.
What's more, it is recommended to keep the CPU-offloaded checksum feature
enabled to ensure MySQL data integrity at the vSAN level.
8. Reference
This section provides vSAN, MySQL, and Quanta references.
8.1 Reference
9. About the Author and Contributors
Mark Xu, Senior Solutions Engineer in the Product Enablement team of the
Storage and Availability Business Unit, wrote the original version of this
paper.
Catherine Xu, Senior Technical Writer in the Product Enablement team of the
Storage and Availability Business Unit, edited this paper to ensure that the
contents conform to the VMware writing style.