
Introduction to MetroCluster

ASE-2 Hardware Maintenance and Troubleshooting

Objectives
At the end of this module, you will be able to:
- Describe the purpose of MetroCluster
- Explain the difference between Stretch MetroCluster and Fabric MetroCluster
- List the hardware and software components required to deploy a MetroCluster
- Describe MetroCluster failover expected behavior
- Explain how Cluster Failover on Disaster (CFOD) works

2008 NetApp. All rights reserved.

MetroCluster Overview
MetroCluster is a disaster recovery solution that supports long distances between storage system controllers:
- Can use FC switches to provide connectivity between nodes
- Uses SyncMirror to provide resiliency in storage connectivity
- Allows a disaster to be declared if one site fails

Two MetroCluster configuration types:
- Stretch MetroCluster
- Fabric MetroCluster

Stretch MetroCluster
Provides campus DR protection with a direct connection between the two nodes (non-switched MetroCluster)
Can stretch up to 500 meters with OM3 cabling
Maximum distance: 500 meters at 2 Gbps or 270 meters at 4 Gbps
[Diagram: Primary and Secondary nodes joined by the cluster interconnect; each node connects to its local FC storage (Primary FC Storage, Secondary FC Storage) and to the partner's mirror (Secondary FC Mirror, Primary FC Mirror) at the opposite site]


Fabric MetroCluster
Uses four Fibre Channel switches in a dual-fabric configuration and a separate cluster interconnect card
Deployed for distances over 500 meters; can stretch up to 100 km with DWDM switches
Maximum distance: 100 km at 2 Gbps or 55 km at 4 Gbps
[Diagram: Primary and Secondary nodes, each with dual FC-VI (CI) connections into two Brocade FC switch fabrics linked by ISLs; Primary and Secondary FC Storage and the corresponding FC Mirrors attach to the switches]


SyncMirror Pools and Plexes


SyncMirror synchronously copies data to two plexes
Each plex of a mirror uses disks from separate pools: pool0 (local) and pool1 (mirror)
Uses Snapshot copies to guarantee consistency between the plexes in case of failure:
- The unaffected plex continues to serve data
- Once the issue is fixed, the two plexes can be resynchronized

[Diagram: an aggregate mirrored across Plex0 (disks from Pool 0) and Plex1 (disks from Pool 1)]

In a MetroCluster configuration, make sure each controller's data has its mirror at the other site
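As a sketch of how a mirrored aggregate looks from the Data ONTAP 7-mode command line (the aggregate names aggr0 and aggr1 and the disk count are hypothetical; verify the exact syntax against your release):

```
FAS1> aggr create aggr1 -m 6      # -m creates both plexes; disks are
                                  # drawn evenly from pool0 and pool1
FAS1> aggr status aggr1 -v        # both plexes should show online:
                                  # plex0 (pool0) and plex1 (pool1)
FAS1> aggr mirror aggr0           # alternatively, mirror an existing
                                  # unmirrored aggregate
```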

Example of Pools and Plexes in Fabric MetroCluster

[Diagram: FAS1 and FAS2, each with FC ports A-D cabled to four Brocade switches (Switch1-Switch4). Shelf banks (Bank0, Bank1) at each site are split between Pool 0 and Pool 1 shelf ports (P0, P1). Loop A carries the FAS1 local plex0 and the FAS2 mirror plex1; Loop B carries the FAS1 mirror plex1 and the FAS2 local plex0]

MetroCluster General Requirements


Software requirements:
- Data ONTAP 6.4.1 and later
- SyncMirror_local, cluster, and cluster_remote licenses

Hardware requirements:
- A clustered pair of FAS900, FAS3000, FAS3100, or FAS6000 series appliances
- Cluster interconnect card, copper/fiber converters, and associated cables
- Mirrors should be set up between identical storage hardware
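On a 7-mode system, the licenses above are installed with the license command; a minimal sketch (the XXXXXXX codes are placeholders for real license keys):

```
FAS1> license add XXXXXXX         # syncmirror_local
FAS1> license add XXXXXXX         # cluster
FAS1> license add XXXXXXX         # cluster_remote
FAS1> license                     # list installed licenses to verify
```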


Fabric MetroCluster Requirements


FC-VI (cluster interconnect) card is dual-ported:
- One connection to each switch; any switch port can be used

Two storage FC ports; one connection to each switch

Disk and storage shelf requirements:
- Limited to 504 disk spindles
- Storage shelves are attached to the switches
- Only DS14, DS14Mk2, and DS14Mk4 storage shelves are supported
- Maximum of two shelves per loop

Storage HBAs and disk shelves must be attached according to the pool and ownership rules


Fabric MetroCluster Requirements (cont'd)

Brocade FC switches: four dedicated, certified Brocade FC switches supplied by NetApp, running supported firmware
- Firmware downloads: go to the NOW (NetApp on the Web) site > Download Software > Fibre Channel Switch > Brocade
- Supported switches: Brocade 200E, 300E, and 5000
- Identical switch models are recommended at each location

Brocade licenses:
- Extended Distance (ISL > 10 km)
- Full-Fabric
- Ports-on-Demand (for additional ports)

MetroCluster FC-VI Interconnect Card


P/N: X1926A

Uses the FC-VI (QLogic 2462) 4-Gbps cluster interconnect card
Each port is connected to a separate FC switch fabric
A good connection to the Brocade switch shows the yellow LED on (4-Gbps link speed)

[Diagram: interconnect card with Port A and Port B on a PCIe bus]


MetroCluster Failover Expected Behavior


Event: Does the event trigger a failover?
- Single, double, or triple disk failure: No
- Single HBA failure (loop A, loop B, or both): No
- Shelf module failure: No
- Disk shelf backplane failure: No
- Disk shelf single or dual power failure: No
- Controller simple reboot: No
- Controller single power failure: No
- Cluster interconnect failure (one port or both ports): No
- Ethernet interface failure: Yes, if the options are set
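The "Yes, if the options are set" answer refers to negotiated failover on network interface failure. A sketch of the relevant 7-mode settings (the interface name e0a is hypothetical):

```
FAS1> options cf.takeover.on_network_interface_failure on
FAS1> ifconfig e0a nfo            # mark e0a as a negotiated-failover
                                  # interface
FAS1> cf status                   # confirm cluster failover is enabled
```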

MetroCluster Failover Expected Behavior (cont'd)

When cluster failover (cf) is enabled, the following events will cause a failover:
- Controller dual power failure
- halt command
- Powering off a node
- Failed reboot after a panic

The next section illustrates cause-and-effect examples of failover and non-failover events


MetroCluster - Interconnect Failure


1. Interconnect failure does not trigger a failover, but mirroring is disabled
2. FAS1 and FAS2 continue to serve data
3. Resyncing happens automatically after the interconnect is reestablished
[Diagram: FAS1 at DC#1 and FAS2 at DC#2 linked by ISLs; Vol1/Plex0 and Vol2/Plex1 at DC#1, Vol1/Plex1 and Vol2/Plex0 at DC#2; the interconnect between the nodes is broken]


MetroCluster Disk Shelf Failure


1. A disk shelf connected to FAS1 has failed; data stored on plex0 is not accessible
2. FAS1 still serves client requests by accessing the same data mirrored (plex1) at the secondary data center
[Diagram: FAS1 at DC#1 and FAS2 at DC#2 linked by ISLs; the shelf holding Vol1/Plex0 at DC#1 has failed, and FAS1 serves data from Vol1/Plex1 at DC#2]


MetroCluster - Controller Failure


1. FAS1 fails; its storage is still accessible at the primary data center
2. FAS2 takes over the identity of its failed partner; FAS2 serves all client requests by accessing the data stored on disks in both data centers
[Diagram: FAS1 at DC#1 has failed; FAS2 at DC#2 has taken over its identity and accesses Vol1/Plex0 and Vol2/Plex1 at DC#1 plus Vol1/Plex1 and Vol2/Plex0 at DC#2 over the ISLs]


Cluster Failover on Disaster (CFOD)


Requires the cluster_remote license
Enables the cf forcetakeover -d command, which allows a takeover to occur without a quorum of disks available (because the partner's mailbox disks are missing):
- Discards the mailbox disks
- Splits the mirror in order to bring the failed controller's mirror online
- The File System ID (FSID) may be rewritten on the partner's volumes, depending on the option cf.takeover.change_fsid
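A sketch of invoking CFOD from the surviving node, assuming FAS1's site is down and FAS2 survives:

```
FAS2> cf forcetakeover -d         # take over without a disk quorum;
                                  # splits the partner's mirrored plexes
FAS2> options cf.takeover.change_fsid
                                  # shows whether volume FSIDs are
                                  # rewritten during the forced takeover
```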

MetroCluster - Site Disaster


1. FAS1 fails and data are inaccessible at the primary data center; automatic takeover is disabled
2. Use the cf forcetakeover -d command from FAS2 to cause the takeover; the plexes of the failed partner are split
[Diagram: DC#1 with FAS1 is down; FAS2 at DC#2 splits the mirrored plexes and serves all data from Vol1/Plex1 and Vol2/Plex0 at DC#2]


MetroCluster - Site Recovery


Once the failures are fixed, you have to reestablish the MetroCluster configuration:
1. Restrict booting of the previously failed node
2. Rejoin the mirrors that were split by the forced takeover
3. Perform a giveback: cf giveback
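The recovery steps above might look like this from the surviving node, once the failed site is repaired (the aggregate names aggr0 and aggr0(1) are hypothetical; aggrX(1) is the split copy created by the forced takeover):

```
FAS2> partner aggr status         # inspect the partner's split aggregates
FAS2> aggr mirror aggr0 -v aggr0(1)
                                  # rejoin the split plexes; aggr0(1) is
                                  # resynchronized back into aggr0
FAS2> cf giveback                 # return control to the repaired node
```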
[Diagram: FAS1 at DC#1 is repaired; the split plexes are rejoined across the ISLs and ownership of Vol1/Plex0 and Vol2/Plex1 is given back to FAS1]


Module Review
What are the two types of MetroCluster configuration?
- Stretch MetroCluster and Fabric MetroCluster

What licenses are required to deploy a MetroCluster?
- SyncMirror_local, cluster, and cluster_remote

Fabric MetroCluster can stretch up to 100 km (True or False?)
- True


Module Review (cont'd)


What hardware is added to achieve a Fabric MetroCluster?
- FC-VI cluster interconnect adapter, two pairs of certified Brocade switches, and appropriate cabling

In a Fabric MetroCluster configuration, which license would you install when using a 16-port Brocade switch?
- Brocade Ports-on-Demand license


Module Review (cont'd)


When cluster failover is enabled, which events will provoke a failover?
- Triple disk failure? No
- Disk shelf dual power failure? No
- Cluster interconnect failure (both ports)? No
- Controller dual power failure? Yes

Which command would you execute to force a site failover?
- cf forcetakeover -d
