Table of Contents
Introduction
Approaches to High Availability with MySQL
Optimum Use Cases for DRBD
Introduction to MySQL on DRBD/Pacemaker/Corosync/Oracle Linux
Setting up MySQL with DRBD/Pacemaker/Corosync/Oracle Linux
    Target Configuration
    File Systems
    Pre-Requisites
    Setting up and testing your system
        Step 1. Check correct kernel is installed
        Step 2. Ensure that DRBD user-land tools are installed
        Step 3. Ensure cluster software is installed
        Step 4. Configure DRBD & create file system
        Step 5. Install & configure MySQL
        Step 6. Configure Pacemaker/Corosync resources
        Step 7. Stop service when isolated from the network
        Step 8. Ensure the correct daemons are started at system boot
        Step 9. Test the system
Support for DRBD
    Oracle Linux Premier Support
    MySQL Enterprise Edition
Conclusion
Additional Resources
Introduction
As the world's leading open source database, MySQL is deployed in many of today's most demanding web, cloud, social and mobile applications. Ensuring service continuity is a critical attribute of any system serving these applications, and it requires close consideration by developers and architects. As a result of MySQL's popularity, there are many different ways of achieving high availability for it. This guide introduces DRBD (Distributed Replicated Block Device), one of the leading solutions for MySQL HA (High Availability), offering users:

- An end-to-end, integrated stack of mature and proven open source technologies, fully supported by Oracle;
- Automatic failover and recovery for service continuity;
- Mirroring, via synchronous replication, to ensure failover between nodes without the risk of losing committed transactions;
- The ability to build HA clusters from commodity hardware, without the requirement for shared storage.

The paper provides a step-by-step guide to installing, configuring, provisioning and testing the complete MySQL and DRBD stack, including:

- MySQL Database;
- DRBD kernel module and userland utilities;
- Pacemaker and Corosync cluster messaging and management processes;
- Oracle Linux operating system.

By reading this guide, architects, developers and DBAs will be able to qualify ideal use cases for DRBD and then quickly deploy new MySQL HA solutions with DRBD.
[Figure 1 MySQL HA Solutions Covering the Spectrum of Requirements: a chart plotting operational complexity against availability levels (from roughly 35 days of downtime per year down to about 5 minutes per year), with replication at one end of the spectrum, across use cases such as SPs & line of business, web & cloud services, eCommerce, telecoms and military]

You can learn more about each of these solutions, and a best-practices methodology to guide their selection:
Copyright 2012, Oracle and/or its affiliates. All rights reserved.
| HA Technology | MySQL Replication | WSFC* | DRBD | Oracle VM Template | MySQL Cluster |
|---|---|---|---|---|---|
| Platform Support | All supported by MySQL Server** | Windows Server 2008 | Oracle Linux | Oracle Linux | All supported by MySQL Cluster**** |
| Supported Storage Engine | All (InnoDB required for auto-failover) | InnoDB | InnoDB | InnoDB | NDB (MySQL Cluster) |
| Auto IP Failover | No | Yes | Yes, with Corosync + Pacemaker | Yes | Yes |
| Auto Database Failover | Yes, with MySQL 5.6 + HA Utilities | Yes | Yes, with Corosync + Pacemaker | Yes | Yes |
| Auto Data Resynchronization | Yes, with MySQL 5.6 + HA Utilities | N/A (shared storage) | Yes | N/A (shared storage) | Yes |
| Failover Time | 5 seconds + | 5 seconds + InnoDB recovery time*** | 5 seconds + InnoDB recovery time*** | 5 seconds + InnoDB recovery time*** | 1 second or less |
| Replication Mode | Asynchronous / Semi-Synchronous | N/A (shared storage) | Synchronous | N/A (shared storage) | Synchronous |
| Shared Storage | No, distributed across nodes | Yes | No, distributed across nodes | Yes | No, distributed across nodes |
| No. of Nodes | Master & Multiple Slaves | Active/Passive Master + Multiple Slaves | Active/Passive Master + Multiple Slaves | Active/Passive Master + Multiple Slaves | 255 |
| Availability Design Level | 99.9% | 99.95% | 99.99% | 99.99% | 99.999% |

* Windows Server 2008 R2 Failover Clustering
** http://www.mysql.com/support/supportedplatforms/database.html
*** InnoDB recovery time dependent on cache and database size, database activity, etc.
**** http://www.mysql.com/support/supportedplatforms/cluster.html
Figure 2 Comparing MySQL HA Solutions

The following sections of the whitepaper focus on the installation and configuration of the DRBD stack.
http://www.mysql.com/why-mysql/white-papers/mysql_wp_ha_strategy_guide.php
A single Virtual IP (VIP) is shown in the figure (192.168.5.102), and this is the address that the application will connect to when accessing the MySQL database. Pacemaker is responsible for migrating this VIP between the two physical IP addresses.

Figure 4 Network configuration

One of the final steps in configuring Pacemaker is to add network connectivity monitoring, so that an isolated host stops its MySQL service and a split-brain scenario is avoided. This is achieved by having each host ping an external IP address (one that is not part of the cluster), in this case the network router (192.168.5.1).
File Systems
Figure 5 shows where the MySQL files will be stored. The MySQL binaries, as well as the socket (mysql.sock) and process-id (mysql.pid) files, are stored in a regular partition independently on each host (under /var/lib/mysql/). The MySQL Server configuration file (my.cnf) and the database files (data/*) are stored in a DRBD-controlled file system that at any point in time is available on only one of the two hosts; this file system is controlled by DRBD and mounted under /var/lib/mysql_drbd/.
Pre-Requisites
2 servers, each with:

- MySQL 5.5 or later
- Oracle Linux 6.2 or later
- Unpartitioned space on the local disks to create a DRBD partition
- Network connectivity (ideally redundant)
It is recommended that you do not rely on DNS to resolve host names, so for the configuration shown in Figure 4 the following host configuration files are created:

/etc/hosts (host1):
127.0.0.1      localhost localhost.localdomain
::1            localhost localhost.localdomain
192.168.5.16   host2 host2.localdomain
/etc/hosts (host2):
127.0.0.1      localhost localhost.localdomain
::1            localhost localhost.localdomain
192.168.5.19   host1 host1.localdomain
Check that the same name is configured for the Network Interface Card on each of the two servers; in this configuration the NIC is called eth0 on both hosts. If the NIC names do not match, they can be changed by editing the /etc/udev/rules.d/30-net_persistent_names.rules file and then restarting the server.
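A quick, generic way to compare the interface names (not specific to this setup) is simply to list them on each host:

```shell
# List the network interface names on this host; run on both servers and
# confirm that the names match (eth0 in the example configuration).
ls /sys/class/net
```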
SELinux can prevent the cluster stack from operating correctly, so at this point edit the /etc/selinux/config file to replace enforcing with permissive (on each host) and then restart each of the hosts.
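The edit can also be scripted. The sketch below runs sed against a local stand-in copy of the file (with assumed sample content) so it is safe to try anywhere; on a real host, point the sed command at /etc/selinux/config and then reboot:

```shell
# Create a stand-in copy of /etc/selinux/config (sample content assumed),
# then switch SELINUX from enforcing to permissive.
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > selinux-config
sed -i 's/^SELINUX=enforcing/SELINUX=permissive/' selinux-config
grep '^SELINUX=' selinux-config
# -> SELINUX=permissive
```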
You need to be running Oracle Unbreakable Enterprise Kernel 2.6.39 or later; if that's already the case then you can skip to Step 2. The instructions in this paper are based on Oracle's Unbreakable Enterprise Kernel Release 2 for Oracle Linux 6; before going any further, install the latest version on each server:
[root@host1 yum.repos.d]# wget http://public-yum.oracle.com/public-yum-ol6.repo -P /etc/yum.repos.d/
--2012-06-22 10:50:50--  http://public-yum.oracle.com/public-yum-ol6.repo
Resolving public-yum.oracle.com... 141.146.44.34
Connecting to public-yum.oracle.com|141.146.44.34|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1461 (1.4K) [text/plain]
Saving to: /etc/yum.repos.d/public-yum-ol6.repo

100%[======================================>] 1,461       --.-K/s   in 0s
Within the /etc/yum.repos.d/public-yum-ol6.repo file enable the ol6_UEK_base repository by setting enabled=1:
[ol6_UEK_base]
name=Unbreakable Enterprise Kernel for Oracle Linux $releasever ($basearch)
baseurl=http://public-yum.oracle.com/repo/OracleLinux/OL6/UEK/base/$basearch/
gpgkey=http://public-yum.oracle.com/RPM-GPG-KEY-oracle-ol6
gpgcheck=1
enabled=1
The system can then be updated (includes bringing the kernel up to UEK2 Release 2):
[root@host1]# yum update
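After the update (and a reboot into the new kernel), the running kernel version can be confirmed; a UEK2 kernel reports 2.6.39 or later:

```shell
# Print the running kernel release; on a UEK2 system this shows a
# 2.6.39-based version string.
uname -r
```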
If the DRBD packages are already there then you can jump to Step 3; otherwise you need to make sure that the system is registered with Oracle ULN (http://linux-update.oracle.com). From the desktop, select System/Administration/ULN Registration (as shown in Figure 6) and then follow the steps, or run uln_register if you don't have a desktop environment.
Within the ULN web site (http://linux-update.oracle.com), you need to subscribe to the "HA Utilities for MySQL" channel for each of the two systems (Figure 7).
Figure 7 Subscribe to "HA Utilities for MySQL" channel

At this point, yum should be used to install the package on both hosts:
[root@host1 ]# yum install drbd83-utils
Loaded plugins: refresh-packagekit, rhnplugin, security
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package drbd83-utils.x86_64 0:8.3.11-1.el6 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package        Arch      Version        Repository                   Size
================================================================================
Installing:
 drbd83-utils   x86_64    8.3.11-1.el6   ol6_x86_64_mysql-ha-utils    207 k

Transaction Summary
================================================================================
Install       1 Package(s)

Total download size: 207 k
Installed size: 504 k
Is this ok [y/N]: y
Downloading Packages:
drbd83-utils-8.3.11-1.el6.x86_64.rpm                     | 207 kB     00:00
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing : drbd83-utils-8.3.11-1.el6.x86_64                           1/1
  Verifying  : drbd83-utils-8.3.11-1.el6.x86_64                           1/1
If the repository information is there but the package is not yet marked as installed, then simply run:
[root@host1]# yum install corosync
on both hosts. Pacemaker may also need to be installed; again, this can be checked using yum. If it is available in a repository then it is simple to install:
[root@host1]# yum info pacemaker
Loaded plugins: security
Repository ol6_latest is listed more than once in the configuration
Repository ol6_ga_base is listed more than once in the configuration
Available Packages
Name        : pacemaker
Arch        : x86_64
Version     : 1.1.6
Release     : 3.el6
Size        : 405 k
Repo        : ol6_latest
Summary     : Scalable High-Availability cluster resource manager
URL         : http://www.clusterlabs.org
License     : GPLv2+ and LGPLv2+
Description : Pacemaker is an advanced, scalable High-Availability cluster
            : resource manager for Linux-HA (Heartbeat) and/or Corosync.
            :
            : It supports "n-node" clusters with significant capabilities for
            : managing resources and dependencies.
            :
            : It will run scripts at initialization, when machines go up or
            : down, when related resources fail and can be configured to
            : periodically check resource health.
            :
            : Available rpmbuild rebuild options:
            :   --with(out) : heartbeat cman corosync doc publican snmp esmtp
            :   pre_release
Disk /dev/sdb: 21.5 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdb doesn't contain a valid partition table

Disk /dev/mapper/vg_host1-lv_root: 40.3 GB, 40307261440 bytes
255 heads, 63 sectors/track, 4900 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/mapper/vg_host1-lv_root doesn't contain a valid partition table

Disk /dev/mapper/vg_host1-lv_swap: 2113 MB, 2113929216 bytes
255 heads, 63 sectors/track, 257 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/mapper/vg_host1-lv_swap doesn't contain a valid partition table
In this case, disk sdb has no partitions and so we can safely create a new one:
[root@host1]# fdisk -cu /dev/sdb
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel with disk identifier 0xecef1a6a.
Changes will remain in memory only, until you decide to write them.
After that, of course, the previous content won't be recoverable.

Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): p

Disk /dev/sdb: 21.5 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders, total 41943040 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xecef1a6a

   Device Boot      Start         End      Blocks   Id  System
Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First sector (2048-41943039, default 2048):
Using default value 2048
Last sector, +sectors or +size{K,M,G} (2048-41943039, default 41943039):
Using default value 41943039

Command (m for help): p

Disk /dev/sdb: 21.5 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders, total 41943040 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xecef1a6a

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1           2048    41943039    20970496   83  Linux
Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.
This partition will be used as a resource, managed (and synchronized between hosts) by DRBD. In order for DRBD to do this, a new configuration file (in this case called clusterdb_res.res) must be created in the /etc/drbd.d/ directory; the contents should look like this:
resource clusterdb_res {
  protocol C;
  handlers {
    pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
    pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
    local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
  }
  startup {
    degr-wfc-timeout 120;    # 2 minutes.
    outdated-wfc-timeout 2;  # 2 seconds.
  }
  disk {
    on-io-error detach;
  }
  net {
    cram-hmac-alg "sha1";
    shared-secret "clusterdb";
    after-sb-0pri disconnect;
    after-sb-1pri disconnect;
    after-sb-2pri disconnect;
    rr-conflict disconnect;
  }
  syncer {
    rate 10M;
    al-extents 257;
    on-no-data-accessible io-error;
  }
  on host1.localdomain {
    device /dev/drbd0;
    disk /dev/sdb1;
    address 192.168.5.19:7788;
    flexible-meta-disk internal;
  }
  on host2.localdomain {
    device /dev/drbd0;
    disk /dev/sdb1;
    address 192.168.5.16:7788;
    meta-disk internal;
  }
}
Obviously, the IP addresses and disk locations should be specific to the hosts that the cluster will be using. In this example the device that DRBD creates will be located at /dev/drbd0; it is this device that will be swapped back and forth between the hosts by DRBD. This resource configuration file should be copied over to the same location on the second host:
[root@host1 drbd.d]# scp clusterdb_res.res host2:/etc/drbd.d/
Before starting the DRBD daemon, meta data must be created for the new resource (clusterdb_res) on each host:
[root@host1]# drbdadm create-md clusterdb_res Writing meta data... initializing activity log NOT initialized bitmap New drbd meta data block successfully created. success [root@host2]# drbdadm create-md clusterdb_res Writing meta data... initializing activity log NOT initialized bitmap New drbd meta data block successfully created. success
At this point the DRBD service is running on both hosts but neither host is the primary and so the resource (block device) cannot be accessed on either host; this can be confirmed by querying the status of the service:
[root@host1]# /etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.3.11 (api:88/proto:86-96)
srcversion: DA5A13F16DE6553FC7CE9B2
m:res            cs         ro                   ds                         p  mounted  fstype
0:clusterdb_res  Connected  Secondary/Secondary  Inconsistent/Inconsistent  C

[root@host2]# /etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.3.11 (api:88/proto:86-96)
srcversion: DA5A13F16DE6553FC7CE9B2
m:res            cs         ro                   ds                         p  mounted  fstype
0:clusterdb_res  Connected  Secondary/Secondary  Inconsistent/Inconsistent  C
In order to create the file systems (and go on to store useful data in it), one of the hosts must be made primary for the clusterdb_res resource:
[root@host1]# drbdadm -- --overwrite-data-of-peer primary all [root@host1]# /etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.3.11 (api:88/proto:86-96)
srcversion: DA5A13F16DE6553FC7CE9B2
m:res            cs          ro                 ds                     p  mounted  fstype
...              sync'ed:    0.4%               (20404/20476)M
    finish: 0:33:29 speed: 10,384 (10,384) K/sec
0:clusterdb_res  SyncSource  Primary/Secondary  UpToDate/Inconsistent  C
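On a live host, `watch cat /proc/drbd` is a convenient way to follow the initial sync, and the percentage can be pulled out of the status line with sed. The sketch below runs against a captured sample of the output above rather than a live /proc/drbd:

```shell
# Extract the sync percentage from a sample DRBD status line.
sample="[>....................] sync'ed:  0.4% (20404/20476)M"
echo "$sample" | sed -n "s/.*sync'ed:[[:space:]]*\([0-9.]*\)%.*/\1/p"
# -> 0.4
```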
Note that the status output also shows the progress of the block-level syncing of the device from the new primary (host1) to the secondary (host2). This initial sync can take some time, but it should not be necessary to wait for it to finish before working through the remainder of Steps 4 through 8. Now that the device is available on host1, it is possible to create a file system on it (note that this does not need to be repeated on the second host, as DRBD will handle the syncing of the raw disk data):
[root@host1]# mkfs -t ext4 /dev/drbd0
mke2fs 1.41.12 (17-May-2010)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
1310720 inodes, 5242455 blocks
262122 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
160 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000

Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 32 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
Note that we do not mount this new file system on either host (though in Step 5 we do mount it temporarily in order to install MySQL on it), as this is something that will be handled by the clustering software, ensuring that the replicated file system is only mounted on the active server.
HTTP request sent, awaiting response... 302 Found
Location: http://mirrors.ukfast.co.uk/sites/ftp.mysql.com/Downloads/MySQL-5.5/MySQL-5.5.21-1.el6.x86_64.tar [following]
--2012-02-27 17:58:46--  http://mirrors.ukfast.co.uk/sites/ftp.mysql.com/Downloads/MySQL-5.5/MySQL-5.5.21-1.el6.x86_64.tar
Resolving mirrors.ukfast.co.uk... 78.109.175.117
Connecting to mirrors.ukfast.co.uk|78.109.175.117|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 147312640 (140M) [application/x-tar]
Saving to: MySQL-5.5.21-1.el6.x86_64.tar
[root@host1 ~]# tar xf MySQL-5.5.21-1.el6.x86_64.tar [root@host1 ~]# rpm -ivh --force MySQL-server-5.5.21-1.el6.x86_64.rpm [root@host1 ~]# yum install MySQL-client-5.5.21-1.el6.x86_64.rpm
Loaded plugins: refresh-packagekit, security
Setting up Install Process
Examining MySQL-client-5.5.21-1.el6.x86_64.rpm: MySQL-client-5.5.21-1.el6.x86_64
Marking MySQL-client-5.5.21-1.el6.x86_64.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package MySQL-client.x86_64 0:5.5.21-1.el6 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package        Arch     Version        Repository                          Size
================================================================================
Installing:
 MySQL-client   x86_64   5.5.21-1.el6   /MySQL-client-5.5.21-1.el6.x86_64   63 M

Transaction Summary
================================================================================
Install       1 Package(s)

Total size: 63 M
Installed size: 63 M
Is this ok [y/N]: y
Downloading Packages:
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing : MySQL-client-5.5.21-1.el6.x86_64                           1/1

Installed:
  MySQL-client.x86_64 0:5.5.21-1.el6

Complete!
Repeat the above installation for the second server. In order for the DRBD file system to be mounted, the /var/lib/mysql_drbd directory should be created on both hosts:
[root@host1 ~]# mkdir /var/lib/mysql_drbd
[root@host1 ~]# chown billy /var/lib/mysql_drbd
[root@host1 ~]# chgrp billy /var/lib/mysql_drbd
[root@host1 ~]# chown billy /var/lib/mysql
[root@host1 ~]# chgrp billy /var/lib/mysql
[root@host2 ~]# mkdir /var/lib/mysql_drbd [root@host2 ~]# chown billy /var/lib/mysql_drbd [root@host2 ~]# chgrp billy /var/lib/mysql_drbd
[root@host2 ~]# chown billy /var/lib/mysql [root@host2 ~]# chgrp billy /var/lib/mysql
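Note that these steps assume a billy OS user and group exist on both hosts (billy is simply the account name used throughout this example); if the account is missing, create it first, for instance:

```shell
# Create the example 'billy' user (and its default group) if it does not
# already exist; requires root privileges.
id billy >/dev/null 2>&1 || useradd billy
id -un billy
```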
On just the one (DRBD active) host, the DRBD file system should be temporarily mounted so that the configuration file can be created and the default data files installed:
[root@host1 ~]# mount /dev/drbd0 /var/lib/mysql_drbd [root@host1 ~]# mkdir /var/lib/mysql_drbd/data [root@host1 ~]# cp /usr/share/mysql/my-small.cnf /var/lib/mysql_drbd/my.cnf
Edit the /var/lib/mysql_drbd/my.cnf file and set datadir=/var/lib/mysql_drbd/data in the [mysqld] section. Also confirm that the socket is configured to /var/lib/mysql/mysql.sock and the pid file to /var/lib/mysql/mysql.pid. The default database files can now be populated:
[root@host1 ~]# mysql_install_db --no-defaults --datadir=/var/lib/mysql_drbd/data --user=billy
Installing MySQL system tables...
OK
Filling help tables...
OK

To start mysqld at boot time you have to copy
support-files/mysql.server to the right place for your system

PLEASE REMEMBER TO SET A PASSWORD FOR THE MySQL root USER !
To do so, start the server, then issue the following commands:

/usr/bin/mysqladmin -u root password 'new-password'
/usr/bin/mysqladmin -u root -h host1.localdomain password 'new-password'

Alternatively you can run:
/usr/bin/mysql_secure_installation

which will also give you the option of removing the test
databases and anonymous user created by default.  This is
strongly recommended for production servers.

See the manual for more instructions.

You can start the MySQL daemon with:
cd /usr ; /usr/bin/mysqld_safe &

You can test the MySQL daemon with mysql-test-run.pl
cd /usr/mysql-test ; perl mysql-test-run.pl

Please report any problems with the /usr/bin/mysqlbug script!
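The my.cnf edits described above can also be scripted. The sketch below works on a local stand-in file (with illustrative [mysqld] content) so it can be tested anywhere; on the host, run the same sed commands against /var/lib/mysql_drbd/my.cnf:

```shell
# Stand-in for the my.cnf copied from my-small.cnf (content illustrative).
cat > my.cnf <<'EOF'
[mysqld]
socket = /tmp/mysql.sock
EOF
# Point the data directory and pid file at the paths used in this setup,
# and move the socket to /var/lib/mysql/.
sed -i '/^\[mysqld\]/a datadir=/var/lib/mysql_drbd/data' my.cnf
sed -i '/^\[mysqld\]/a pid-file=/var/lib/mysql/mysql.pid' my.cnf
sed -i 's|^socket.*|socket=/var/lib/mysql/mysql.sock|' my.cnf
grep -E '^(datadir|pid-file|socket)' my.cnf
```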
Now that this has been set up, the DRBD file system should be unmounted (and primary control of the DRBD resource surrendered) and from this point onwards it will be managed by the clustering software:
[root@host1 ~]# umount /var/lib/mysql_drbd [root@host1 ~]# drbdadm secondary clusterdb_res
First, set up some network-specific parameters from the Linux command line and also in the Corosync configuration file. The multicast address2 should be unique in your network, but the port can be left at 4000. The IP address should be based on the IP addresses being used by the servers, but should take the form XX.YY.ZZ.0.
[root@host1 ~]# export ais_mcast=226.99.1.1
[root@host1 ~]# export ais_port=4000
[root@host1 ~]# export ais_addr=192.168.5.0
[root@host1 ~]# cp /etc/corosync/corosync.conf.example /etc/corosync/corosync.conf
Create: /etc/corosync/service.d/pcmk:
service {
  # Load the Pacemaker Cluster Resource Manager
  name: pacemaker
  ver:  1
}
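The copied corosync.conf still contains the example file's placeholder network values at this point; one way to substitute in the variables exported above (a sketch along the lines of the Clusterlabs "Clusters from Scratch" approach, shown here against a local stand-in file — on the host, run the sed commands against /etc/corosync/corosync.conf) is:

```shell
export ais_mcast=226.99.1.1 ais_port=4000 ais_addr=192.168.5.0
# Stand-in for the totem/interface settings in corosync.conf.example
# (placeholder values assumed).
cat > corosync.conf <<'EOF'
mcastaddr: 226.94.1.1
mcastport: 5405
bindnetaddr: 192.168.1.0
EOF
sed -i "s/.*mcastaddr:.*/mcastaddr: $ais_mcast/" corosync.conf
sed -i "s/.*mcastport:.*/mcastport: $ais_port/" corosync.conf
sed -i "s/.*bindnetaddr:.*/bindnetaddr: $ais_addr/" corosync.conf
grep -E 'mcastaddr|mcastport|bindnetaddr' corosync.conf
```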
The same environment variables should be set up on the second server, but to avoid any mismatches the configuration files can simply be copied across:
[root@host2 ~]# export ais_mcast=226.99.1.1
[root@host2 ~]# export ais_port=4000
[root@host2 ~]# export ais_addr=192.168.5.0
[root@host1 ~]# scp /etc/corosync/corosync.conf host2:/etc/corosync/corosync.conf
[root@host1 ~]# scp /etc/corosync/service.d/pcmk host2:/etc/corosync/service.d/pcmk
To confirm that there are no problems at this point, check /var/log/messages for errors before starting Pacemaker:
[root@host1 ~]# /etc/init.d/pacemaker start
Starting Pacemaker Cluster Manager:                        [  OK  ]
[root@host1 ~]# /etc/init.d/pacemaker status
pacemakerd (pid 3203) is running...
2 https://en.wikipedia.org/wiki/Multicast_address
[root@host2 ~]# /etc/init.d/pacemaker start
Starting Pacemaker Cluster Manager:                        [  OK  ]
[root@host2 ~]# /etc/init.d/pacemaker status
pacemakerd (pid 3070) is running...
Again, check /var/log/messages for errors, and also run Pacemaker's cluster resource monitoring command to view the status of the cluster. It's worth running it on both hosts to confirm that they share the same view of the world:
[root@host1 billy]# crm_mon -1
============
Last updated: Mon Feb 27 17:51:10 2012
Last change: Mon Feb 27 17:50:25 2012 via crmd on host1.localdomain
Stack: openais
Current DC: host1.localdomain - partition with quorum
Version: 1.1.6-3.el6-a02c0f19a00c1eb2527ad38f146ebc0834814558
2 Nodes configured, 2 expected votes
0 Resources configured.
============

Online: [ host1.localdomain host2.localdomain ]

[root@host2 billy]# crm_mon -1
============
Last updated: Mon Feb 27 17:51:34 2012
Last change: Mon Feb 27 17:50:25 2012 via crmd on host1.localdomain
Stack: openais
Current DC: host1.localdomain - partition with quorum
Version: 1.1.6-3.el6-a02c0f19a00c1eb2527ad38f146ebc0834814558
2 Nodes configured, 2 expected votes
0 Resources configured.
============

Online: [ host1.localdomain host2.localdomain ]
Above, crm_mon is run with the -1 option to indicate that it should report once and then return. A recommendation would be to also run it without that option (on both servers), so that you get a continually refreshed view of the state of the cluster, including any managed resources. Pacemaker's resource management tool crm can then be used to configure the cluster resources. As we are configuring a cluster made up of just 2 hosts, when one host fails (or loses contact with the other) there is no node majority (quorum) left, and so by default the surviving node (or both, if they're still running but isolated from each other) would be shut down by Pacemaker. This isn't the desired behavior as it does not offer high availability, and so that default should be overridden (we'll later add an extra behavior whereby each node will shut itself down if it cannot ping a third node that is external to the cluster, preventing a split-brain situation):
[root@host1 ~]# crm configure property no-quorum-policy=ignore
Pacemaker uses resource stickiness parameters to determine when resources should be migrated between nodes. The absolute values are not important; what matters is how they compare with the values that will subsequently be configured against specific events. Here we set the stickiness to 100:
[root@host1 ~]# crm configure rsc_defaults resource-stickiness=100
STONITH (Shoot The Other Node In The Head), otherwise known as fencing, refers to one node trying to kill another in the event that it believes the other has partially failed and should be stopped in order to avoid any risk of a split-brain scenario. We turn this off, as this solution will rely on each node shutting itself down in the event that it loses connectivity with the independent host:
[root@host1 ~]# crm configure property stonith-enabled=false
The first resource to configure is DRBD; a primitive (p_drbd_mysql) is created, but before that we stop the DRBD service:
[root@host1 ~]# /etc/init.d/drbd stop
Stopping all DRBD resources: .
[root@host2 billy]# /etc/init.d/drbd stop
Stopping all DRBD resources: .
[root@host1 ~]# crm configure
crm(live)configure# primitive p_drbd_mysql ocf:linbit:drbd params drbd_resource="clusterdb_res" op monitor interval="15s"
WARNING: p_drbd_mysql: default timeout 20s for start is smaller than the advised 240
WARNING: p_drbd_mysql: default timeout 20s for stop is smaller than the advised 100
WARNING: p_drbd_mysql: action monitor not advertised in meta-data, it may not be supported by the RA
A master-slave relationship (called ms_drbd_mysql) is then set up for the p_drbd_mysql primitive and it is configured to only allow a single master:
crm(live)configure# ms ms_drbd_mysql p_drbd_mysql meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
Next, a primitive (p_fs_mysql) is created for the file system running on the DRBD device, and this is configured to mount it on the directory (/var/lib/mysql_drbd) where the MySQL service will expect to find it:
crm(live)configure# primitive p_fs_mysql ocf:heartbeat:Filesystem params device="/dev/drbd0" directory="/var/lib/mysql_drbd" fstype="ext4"
WARNING: p_fs_mysql: default timeout 20s for start is smaller than the advised 60
WARNING: p_fs_mysql: default timeout 20s for stop is smaller than the advised 60
As shown in Figure 4, the application will connect to MySQL through the Virtual IP Address 192.168.5.102. As a prerequisite check, you should have ensured that both hosts use the same name for their NIC (in this example, eth0). Using this information, the VIP can be created:
crm(live)configure# primitive p_ip_mysql ocf:heartbeat:IPaddr2 params ip="192.168.5.102" cidr_netmask="24" nic="eth0"
Now that the file system and the VIP to be used by MySQL have been defined in the cluster, the MySQL service itself can be configured, with the primitive labeled p_mysql. Note that Pacemaker will provide command-line arguments when running the mysqld process which override options such as datadir set in the my.cnf file, and so it is important to override Pacemaker's defaults (actually the defaults set by the MySQL resource agent) by specifying the correct command-line options when defining the primitive here:
crm(live)configure# primitive p_mysql ocf:heartbeat:mysql params binary="/usr/sbin/mysqld" config="/var/lib/mysql_drbd/my.cnf" datadir="/var/lib/mysql_drbd/data" pid="/var/lib/mysql/mysql.pid" socket="/var/lib/mysql/mysql.sock" user="billy" group="billy" additional_parameters="--bind-address=192.168.5.102 --user=billy" op start timeout=120s op stop timeout=120s op monitor interval=20s timeout=30s
Rather than managing the individual resources/primitives required for the MySQL service, it makes sense for Pacemaker to manage them as a group (for example, migrating the VIP to the second host wouldn't allow applications to access the database unless the mysqld process is also started there). To that end, a group resource (g_mysql) is defined:
crm(live)configure# group g_mysql p_fs_mysql p_ip_mysql p_mysql
As the MySQL service (group) has a dependency on the host it is running on being the DRBD master, that relationship is added by defining a co-location constraint and an ordering constraint, ensuring that the MySQL group is co-located with the DRBD master and that DRBD promotion of the host to master must happen before the MySQL group can be started:
crm(live)configure# colocation c_mysql_on_drbd inf: g_mysql ms_drbd_mysql:Master
crm(live)configure# order o_drbd_before_mysql inf: ms_drbd_mysql:promote g_mysql:start
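Before committing, the pending configuration can be sanity-checked from within the same crm shell. A sketch:

```shell
# Check the pending (uncommitted) configuration for problems...
crm(live)configure# verify

# ...and review all resources and constraints as defined so far.
crm(live)configure# show
```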
At this point, all of the configuration changes are defined but have not been applied; the commit command will apply the changes and then crm_mon can be used to check that the resources have been correctly defined and are actually active:
crm(live)configure# commit
WARNING: p_drbd_mysql: default timeout 20s for start is smaller than the advised 240
WARNING: p_drbd_mysql: default timeout 20s for stop is smaller than the advised 100
WARNING: p_drbd_mysql: action monitor not advertised in meta-data, it may not be supported by the RA
WARNING: p_fs_mysql: default timeout 20s for start is smaller than the advised 60
WARNING: p_fs_mysql: default timeout 20s for stop is smaller than the advised 60
[root@host1 ~]# crm_mon -1
============
Last updated: Tue Feb 28 10:22:32 2012
Last change: Tue Feb 28 10:20:47 2012 via cibadmin on host1.localdomain
Stack: openais
Current DC: host1.localdomain - partition with quorum
Version: 1.1.6-3.el6-a02c0f19a00c1eb2527ad38f146ebc0834814558
2 Nodes configured, 2 expected votes
5 Resources configured.
============

Online: [ host1.localdomain host2.localdomain ]

 Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
     Masters: [ host1.localdomain ]
     Slaves: [ host2.localdomain ]
 Resource Group: g_mysql
     p_fs_mysql   (ocf::heartbeat:Filesystem):    Started host1.localdomain
     p_ip_mysql   (ocf::heartbeat:IPaddr2):       Started host1.localdomain
     p_mysql      (ocf::heartbeat:mysql):         Started host1.localdomain
Figure 8 illustrates the various entities that have been created in Pacemaker and the relationships between them.
Figure 8 Clustered entities

Just as with any MySQL installation, it is necessary to grant privileges to users from remote hosts so that they can access the database; on the host where the MySQL Server process is currently running, execute the following:
[billy@host1 ~]$ mysql -u root -e "GRANT ALL ON *.* to 'root'@'%'"
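Note that granting ALL to 'root'@'%' with no password opens the server to connections from any host; it is convenient for this walkthrough, but in production you would typically restrict the grant to the application subnet and require a password. A hedged sketch (the subnet matches the example network, while the password is a placeholder, not part of the original configuration):

```shell
# Restrict remote root access to the 192.168.5.0/24 subnet and set a
# password ("secret" is a placeholder -- substitute your own).
mysql -u root -e "GRANT ALL ON *.* TO 'root'@'192.168.5.%' IDENTIFIED BY 'secret'"
```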
At this point it is possible to connect to the database (using the configured VIP) and store some data that we can then check is still there after later failing over to host2:
[billy@host1 ~]$ mysql -h 192.168.5.102 -P3306 -u root
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.5.21 MySQL Community Server (GPL)

Copyright (c) 2000, 2011, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Other names may be trademarks of their respective owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> CREATE DATABASE clusterdb; USE clusterdb;
Database changed
mysql> CREATE TABLE simples (id INT NOT NULL PRIMARY KEY);
mysql> INSERT INTO simples VALUES (1),(2),(3),(4);
crm(live)configure# primitive p_ping ocf:pacemaker:ping params name="ping" multiplier="1000" host_list="192.168.5.1" op monitor interval="15s" timeout="60s" op start timeout="60s"
As both hosts in the cluster should be running ping to check their connectivity, a clone (cl_ping) is created; this simply causes the resource to be run on every host in the cluster:
crm(live)configure# clone cl_ping p_ping meta interleave="true"
Now that there is a ping resource defined for each host, Pacemaker needs to be told how to handle the results of the pings. In this example, a new location constraint (l_drbd_master_on_ping) controls the location of the DRBD master (the Master role of the ms_drbd_mysql resource) by setting the host's preference score to negative infinity (-inf) if there is no ping attribute defined on the host, or if the ping service is unable to successfully ping at least one node (a score of <= 0, or in Pacemaker syntax, number:lte 0):
crm(live)configure# location l_drbd_master_on_ping ms_drbd_mysql rule $role="Master" -inf: not_defined ping or ping number:lte 0
crm(live)configure# commit
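Once committed, the per-node ping attribute that this constraint depends on can be inspected; crm_mon can display node attributes alongside resource status. A sketch:

```shell
# -A displays node attributes; with a multiplier of 1000 and one host in
# host_list, each node with good connectivity should report ping="1000".
crm_mon -A1
```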
The cluster management tool can then be used to request that the MySQL group g_mysql (and, implicitly, any colocated resources such as the Master role of the ms_drbd_mysql resource) be migrated to the second host:
[root@host1 ~]# crm resource migrate g_mysql host2.localdomain
You should then check that Pacemaker believes that the resources have been migrated and most importantly that you can still access the database contents through the VIP:
[root@host1 ~]# crm_mon -1
============
Last updated: Thu Mar 1 10:01:59 2012
Last change: Thu Mar 1 10:01:35 2012 via crm_resource on host1.localdomain
Stack: openais
Current DC: host1.localdomain - partition with quorum
Version: 1.1.6-3.el6-a02c0f19a00c1eb2527ad38f146ebc0834814558
2 Nodes configured, 2 expected votes
7 Resources configured.
============
Online: [ host1.localdomain host2.localdomain ]

 Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
     Masters: [ host2.localdomain ]
     Slaves: [ host1.localdomain ]
 Resource Group: g_mysql
     p_fs_mysql   (ocf::heartbeat:Filesystem):    Started host2.localdomain
     p_ip_mysql   (ocf::heartbeat:IPaddr2):       Started host2.localdomain
     p_mysql      (ocf::heartbeat:mysql):         Started host2.localdomain
 Clone Set: cl_ping [p_ping]
     Started: [ host2.localdomain host1.localdomain ]

[root@host1 ~]# mysql -h 192.168.5.102 -P3306 -u root -e 'SELECT * FROM clusterdb.simples;'
+----+
| id |
+----+
|  1 |
|  2 |
|  3 |
|  4 |
+----+
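Note that crm resource migrate works by adding a temporary location constraint pinning the group to the target host; after a failover test you would typically clear that constraint so the cluster is again free to place the resources on either host. A sketch:

```shell
# Remove the location constraint created by the earlier migrate command,
# allowing Pacemaker to place g_mysql freely again.
crm resource unmigrate g_mysql
```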
databases. You get a consistent backup copy of your database to recover your data to a precise point in time. In addition, MySQL Enterprise Backup supports creating compressed backup files and performing backups of subsets of InnoDB tables. Compression typically reduces backup size by up to 90% compared with the size of the actual database files, helping to reduce storage costs. In conjunction with the MySQL binlog, users can perform point-in-time recovery.

MySQL Enterprise Monitor and Query Analyzer

The MySQL Enterprise Monitor provides at-a-glance views of the health of your MySQL databases. It continuously monitors your MySQL servers and alerts you to potential problems before they impact your system. It's like having a virtual DBA assistant at your side to recommend best practices, eliminate security vulnerabilities, improve replication, and optimize performance. As a result, DBAs and system administrators can manage more servers in less time. The Query Analyzer helps developers and DBAs improve application performance by monitoring queries and accurately pinpointing SQL code that is causing a slowdown.

MySQL Workbench

MySQL Workbench is a unified visual tool that enables developers, DBAs, and data architects to design, develop, and administer MySQL servers. MySQL Workbench provides advanced data modeling, a flexible SQL editor, and comprehensive administrative tools.

Note that these components can be downloaded from https://edelivery.oracle.com/, where they can be evaluated for 30 days.

Oracle Premier Support for MySQL

MySQL Enterprise Edition provides 24x7x365 access to Oracle's MySQL Support team, which is staffed by seasoned database experts ready to help with the most complex technical issues, with direct access to the MySQL development team. Oracle's Premier Support provides you with:
• 24x7x365 phone and online support
• Rapid diagnosis and solution to complex issues
• Unlimited incidents
• Emergency hot fix builds
• Access to Oracle's MySQL Knowledge Base
• Consultative support services
Conclusion
With synchronous replication and support for distributed storage, DRBD represents one of the most popular HA solutions for MySQL. This guide has been designed to enable you to get started today. Backed by support for the entire stack, from the operating system and DRBD to the clustering processes and MySQL itself, users can quickly deploy new services based on open source technology, with the backing of 24x7 global support from Oracle.
Additional Resources
Guide to MySQL High Availability Solutions: http://www.mysql.com/why-mysql/white-papers/mysql_wp_ha_strategy_guide.php

Oracle Linux: http://www.oracle.com/us/technologies/linux/index.html

Unbreakable Linux Network - An Overview: http://www.oracle.com/us/technologies/027615.pdf

MySQL Enterprise Edition Product Guide: http://www.mysql.com/why-mysql/white-papers/mysql_wp_enterprise_ready.php