A server cluster is a collection of servers, called nodes, that communicate with each other to make a set of
services highly available to clients. Server clusters are based on one of the two clustering technologies in
the Microsoft Windows Server 2003 operating systems. The other clustering technology is Network Load
Balancing. Server clusters are designed for applications that have long-running in-memory state or
frequently updated data. Typical uses for server clusters include file servers, print servers, database
servers, and messaging servers.
This section provides technical background information about how the components within a server cluster
work.
The most basic type of cluster is a two-node cluster with a single quorum device. For a definition of a single
quorum device, see "What Is a Server Cluster?" (http://technet.microsoft.com/en-us/library/cc785197.aspx).
The following figure illustrates the basic elements of a server cluster, including nodes, resource groups, and
the single quorum device, that is, the cluster storage.
Applications and services are configured as resources on the cluster and are grouped into resource groups.
Resources in a resource group work together and fail over together when failover is necessary. When you
configure each resource group to include not only the elements needed for the application or service but
also the associated network name and IP address, then that collection of resources runs as if it were a
separate server on the network. When a resource group is configured this way, clients can consistently get
access to the application using the same network name, regardless of which node the application is running
on.
The preceding figure showed one resource group per node. However, each node can have multiple resource
groups. Within each resource group, resources can have specific dependencies. Dependencies are
relationships between resources that indicate which resources need to come online before another resource
can come online. When dependencies are configured, the Cluster service can bring resources online or take
them offline in the correct order during failover.
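As a hedged illustration, the following C sketch shows how such a dependency might be configured programmatically through the Cluster API function AddClusterResourceDependency. The resource names "MyApp" and "Disk Q:" are hypothetical examples, and the sketch assumes it runs on a cluster node (OpenCluster(NULL) opens the local cluster).

/* Minimal sketch: establishing a resource dependency with the Cluster API.
   The resource names "MyApp" and "Disk Q:" are hypothetical examples.
   Link with clusapi.lib; compile as a Win32 console program. */
#include <windows.h>
#include <clusapi.h>
#include <stdio.h>

int main(void)
{
    HCLUSTER hCluster = OpenCluster(NULL);   /* NULL = the local cluster */
    if (hCluster == NULL) {
        fprintf(stderr, "OpenCluster failed: %lu\n", GetLastError());
        return 1;
    }

    HRESOURCE hApp  = OpenClusterResource(hCluster, L"MyApp");
    HRESOURCE hDisk = OpenClusterResource(hCluster, L"Disk Q:");
    if (hApp != NULL && hDisk != NULL) {
        /* Declare that MyApp must not come online until Disk Q: is online. */
        DWORD status = AddClusterResourceDependency(hApp, hDisk);
        if (status != ERROR_SUCCESS)
            fprintf(stderr, "AddClusterResourceDependency failed: %lu\n", status);
    }

    if (hApp)  CloseClusterResource(hApp);
    if (hDisk) CloseClusterResource(hDisk);
    CloseCluster(hCluster);
    return 0;
}

With this dependency in place, the Cluster service brings Disk Q: online before MyApp and takes MyApp offline before Disk Q:, exactly the ordering behavior described above.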
The following figure shows two nodes with several resource groups in which some typical dependencies
have been configured between resources. The figure shows that resource groups (not resources) are the
unit of failover.
The Cluster service runs on each node of a server cluster and controls all aspects of server cluster
operation. The Cluster service includes multiple software components that work together. These
components perform monitoring, maintain consistency, and smoothly transfer resources from one node to
another.
Diagrams and descriptions of the following components are grouped together because the components work
so closely together:
Database Manager
Node Manager
Failover Manager
Global Update Manager
Resource Monitors
Separate diagrams and descriptions are provided of the following components, which are used in specific
situations or for specific types of applications:
Checkpoint Manager
Log Manager (quorum logging)
Event Log Replication Manager
Backup and Restore capabilities in Failover Manager
Diagrams of Database Manager, Node Manager, Failover Manager, Global Update Manager, and
Resource Monitors
The following figure focuses on the information that is communicated between Database Manager, Node
Manager, and Failover Manager. The figure also shows Global Update Manager, which supports the other
three managers by coordinating updates on other nodes in the cluster. These four components work
together to make sure that all nodes maintain a consistent view of the cluster (with each node of the cluster
maintaining the same view of the state of the member nodes as the others) and that resource groups can
be failed over smoothly when needed.
Basic Cluster Components: Database Manager, Node Manager, and Failover Manager
The following figure shows a Resource Monitor and resource dynamic-link library (DLL) working with
Database Manager, Node Manager, and Failover Manager. Resource Monitors and resource DLLs support
applications that are cluster-aware, that is, applications designed to work in a coordinated way with cluster
components. The resource DLL for each such application is responsible for monitoring and controlling that
application. For example, the resource DLL saves and retrieves application properties in the cluster
database, brings the resource online and takes it offline, and checks the health of the resource. When
failover is necessary, the resource DLL works with a Resource Monitor and Failover Manager to ensure that
the failover happens smoothly.
Descriptions of Database Manager, Node Manager, Failover Manager, Global Update Manager,
and Resource Monitors
The following descriptions provide details about the components shown in the preceding diagrams.
Database Manager
Database Manager runs on each node and maintains a local copy of the cluster configuration database,
which contains information about all of the physical and logical items in a cluster. These items include the
cluster itself, cluster node membership, resource groups, resource types, and descriptions of specific
resources, such as disks and IP addresses. Database Manager uses the Global Update Manager to replicate
all changes to the other nodes in the cluster. In this way, consistent configuration information is maintained
across the cluster, even if conditions are changing such as if a node fails and the administrator changes the
cluster configuration before that node returns to service.
Database Manager also provides an interface through which other Cluster service components, such as
Failover Manager and Node Manager, can store changes in the cluster configuration database. The interface
for making such changes is similar to the interface for making changes to the registry through the Windows
application programming interface (API). The key difference is that changes received by Database Manager
are replicated through Global Update Manager to all nodes in the cluster.
Some Database Manager functions are exposed through the Cluster API. The primary purpose for exposing
Database Manager functions is to allow custom resource DLLs to save private properties to the cluster
database when this is useful for a particular clustered application. (A private property for a resource is a
property that applies to that resource type but not to other resource types; for example, the SubnetMask
property applies to an IP Address resource but not to other resource types.)
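The following C sketch illustrates one such exposed path: reading a resource's private properties with ClusterResourceControl and the CLUSCTL_RESOURCE_GET_PRIVATE_PROPERTIES control code. The resource name "IP Address 1" is a hypothetical example, and the sketch only reports the size of the returned property list rather than parsing the CLUSPROP value list.

/* Hedged sketch: retrieving a resource's private properties through the
   Cluster API, which surfaces Database Manager functionality.
   Link with clusapi.lib. */
#include <windows.h>
#include <clusapi.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    HCLUSTER hCluster = OpenCluster(NULL);
    HRESOURCE hRes = hCluster ? OpenClusterResource(hCluster, L"IP Address 1") : NULL;
    if (hRes == NULL) { fprintf(stderr, "open failed\n"); return 1; }

    DWORD cb = 0;
    /* First call with no buffer to learn the required size ... */
    DWORD status = ClusterResourceControl(hRes, NULL,
        CLUSCTL_RESOURCE_GET_PRIVATE_PROPERTIES, NULL, 0, NULL, 0, &cb);
    if (status == ERROR_SUCCESS || status == ERROR_MORE_DATA) {
        BYTE *buf = (BYTE *)malloc(cb);
        /* ... then fetch the property list itself (a CLUSPROP value list). */
        status = ClusterResourceControl(hRes, NULL,
            CLUSCTL_RESOURCE_GET_PRIVATE_PROPERTIES, NULL, 0, buf, cb, &cb);
        if (status == ERROR_SUCCESS)
            printf("retrieved %lu bytes of private properties\n", cb);
        free(buf);
    }

    CloseClusterResource(hRes);
    CloseCluster(hCluster);
    return 0;
}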
Node Manager
Node Manager runs on each node and maintains a local list of nodes, networks, and network interfaces in
the cluster. Through regular communication between nodes, Node Manager ensures that all nodes in the
cluster have the same list of functional nodes.
Node Manager uses the information in the cluster configuration database to determine which nodes have
been added to the cluster or evicted from the cluster. Each instance of Node Manager also monitors the
other nodes to detect node failure. It does this by exchanging messages, called heartbeats, with each node
on every available network. If one node detects a communication failure with another node, it
broadcasts a message to the entire cluster, causing all nodes that receive the message to verify their list of
functional nodes in the cluster. This is called a regroup event.
Node Manager also contributes to the process of a node joining a cluster. At that time, on the node that is
joining, Node Manager establishes authenticated communication (authenticated RPC bindings) between
itself and the Node Manager component on each of the currently active nodes.
Note
A down node is different from a node that has been evicted from the cluster. When you evict a
node from the cluster, it is removed from Node Manager’s list of potential cluster nodes. A down
node remains on the list of potential cluster nodes even while it is down; when the node and
the network it requires are functioning again, the node joins the cluster. An evicted node,
however, can become part of the cluster only after you use Cluster Administrator or Cluster.exe
to add the node back to the cluster.
Membership Manager
Membership Manager (also called the Regroup Engine) causes a regroup event whenever another node’s
heartbeat is interrupted (indicating a possible node failure). During a node failure and regroup event,
Membership Manager and Node Manager work together to ensure that all functioning nodes agree on which
nodes are functioning and which are not.
Node Manager and other components make use of the Cluster Network Driver, which supports specific types
of network communication needed in a cluster. The Cluster Network Driver runs in kernel mode and
provides support for a variety of functions, especially heartbeats and fault-tolerant communication between
nodes.
Failover Manager
Failover Manager manages resources and resource groups. For example, Failover Manager stops and starts
resources, manages resource dependencies, and initiates failover of resource groups. To perform these
actions, it receives resource and system state information from cluster components on the node and from
Resource Monitors. Resource Monitors provide the execution environment for resource DLLs and support
communication between resource DLLs and Failover Manager.
Failover Manager determines which node in the cluster should own each resource group. If it is necessary to
fail over a resource group, the instances of Failover Manager on each node in the cluster work together to
reassign ownership of the resource group.
Depending on how the resource group is configured, Failover Manager can restart a failing resource locally
or can take the failing resource offline along with its dependent resources, and then initiate failover.
Global Update Manager
Global Update Manager makes sure that when changes are copied to each of the nodes, the following takes
place:
Changes are made atomically, that is, either all healthy nodes are updated, or none are
updated.
Changes are made in the order they occurred, regardless of the origin of the change. The
process of making changes is coordinated between nodes so that even if two different changes
are made at the same time on different nodes, when the changes are replicated they are put in
a particular order and made in that order on all nodes.
Global Update Manager is used by internal cluster components, such as Failover Manager, Node Manager, or
Database Manager, to carry out the replication of changes to each node. Global updates are typically
initiated as a result of a Cluster API call. When an update is initiated by a node, another node is designated
to monitor the update and make sure that it happens on all nodes. If that node cannot make the update
locally, it notifies the node that tried to initiate the update, and changes are not made anywhere (unless the
operation is attempted again). If the node that is designated to monitor the update can make the update
locally, but then another node cannot be updated, the node that cannot be updated is removed from the list
of functional nodes, and the change is made on available nodes. If this happens, quorum logging is enabled
at the same time, which ensures that the failed node receives all necessary configuration information when
it is functioning again, even if the original set of nodes is down at that time.
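The following small C program is a conceptual simulation of these semantics, not the actual Global Update Manager protocol: a sequence number imposes a single order on updates, a designated node applies the update first (so a failure there aborts the update everywhere), and a node that later fails to apply the update is dropped from the functional list while the change proceeds on the remaining nodes. All names and structures here are invented for illustration.

/* Conceptual simulation of Global Update Manager semantics (not the actual
   GUM protocol): updates carry a global sequence number so every node applies
   them in the same order, and an update either reaches all healthy nodes or
   none. */
#include <stdio.h>
#include <stdbool.h>

#define NODES 4

static bool nodeUp[NODES]  = { true, true, true, true };
static int  nodeSeq[NODES] = { 0 };   /* last update applied per node */
static int  globalSeq      = 0;       /* cluster-wide update ordering */

/* Returns false if the node cannot apply the update. */
static bool apply_update(int node, int seq)
{
    if (!nodeUp[node]) return false;
    nodeSeq[node] = seq;              /* apply in sequence order */
    return true;
}

/* One global update: the designated "monitor" node applies first; if it
   fails, the update is made nowhere. If a later node fails, it is removed
   from the functional list and the update proceeds on the remaining nodes. */
static bool global_update(int monitor)
{
    int seq = globalSeq + 1;
    if (!apply_update(monitor, seq))
        return false;                 /* abort: no node is changed */
    for (int n = 0; n < NODES; n++) {
        if (n == monitor || !nodeUp[n]) continue;
        if (!apply_update(n, seq))
            nodeUp[n] = false;        /* drop the node that failed */
    }
    globalSeq = seq;
    return true;
}

int main(void)
{
    nodeUp[2] = false;                /* say node 2 is down */
    if (global_update(0))
        printf("update %d applied on all functional nodes\n", globalSeq);
    for (int n = 0; n < NODES; n++)
        printf("node %d: up=%d seq=%d\n", n, nodeUp[n], nodeSeq[n]);
    return 0;
}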
Some applications store configuration information locally instead of or in addition to storing information in
the cluster configuration database. Applications might store information locally in two ways. One way is to
store configuration information in the registry on the local server; another way is to use cryptographic keys
on the local server. If an application requires that locally-stored information be available on failover,
Checkpoint Manager provides support by maintaining a current copy of the local information on the quorum
resource.
Checkpoint Manager
Checkpoint Manager handles application-specific configuration data that is stored in the registry on the local
server somewhat differently from configuration data stored using cryptographic keys on the local server.
The difference is as follows:
For applications that store configuration data in the registry on the local server, Checkpoint
Manager monitors the data while the application is online. When changes occur, Checkpoint
Manager updates the quorum resource with the current configuration data.
For applications that use cryptographic keys on the local server, Checkpoint Manager copies the
cryptographic container to the quorum resource only once, when you configure the checkpoint.
If changes are made to the cryptographic container, the checkpoint must be removed and
re-associated with the resource.
Before a resource configured to use checkpointing is brought online (for example, for failover), Checkpoint
Manager brings the locally-stored application data up-to-date from the quorum resource. This helps make
sure that the Cluster service can recreate the appropriate application environment before bringing the
application online on any node.
Note
When configuring a Generic Application resource or Generic Service resource, you specify the
application-specific configuration data that Checkpoint Manager monitors and copies. When
determining which configuration information must be marked for checkpointing, focus on the
information that must be available when the application starts.
Checkpoint Manager also supports resources that have application-specific registry trees (not just individual
keys) that exist on the cluster node where the resource comes online. Checkpoint Manager watches for
changes made to these registry trees when the resource is online (not when it is offline). When the
resource is online and Checkpoint Manager detects that changes have been made, it creates a copy of the
registry tree on the owner node of the resource and then sends a message to the owner node of the
quorum resource, telling it to copy the file to the quorum resource. Checkpoint Manager performs this
function in batches so that frequent changes to registry trees do not place too heavy a load on the Cluster
service.
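As a hedged sketch, the following C fragment shows how a registry checkpoint might be added to a resource with the CLUSCTL_RESOURCE_ADD_REGISTRY_CHECKPOINT control code. The resource name and registry path are hypothetical, and the input format (a null-terminated Unicode key name relative to HKEY_LOCAL_MACHINE) is an assumption to verify against the Platform SDK documentation.

/* Hedged sketch: asking Checkpoint Manager to track a registry subtree for
   a resource. The resource name "MyApp" and the key "SOFTWARE\MyApp" are
   hypothetical. Link with clusapi.lib. */
#include <windows.h>
#include <clusapi.h>
#include <stdio.h>

int main(void)
{
    HCLUSTER hCluster = OpenCluster(NULL);
    HRESOURCE hRes = hCluster ? OpenClusterResource(hCluster, L"MyApp") : NULL;
    if (hRes == NULL) { fprintf(stderr, "open failed\n"); return 1; }

    /* Assumed: the key is named relative to HKEY_LOCAL_MACHINE. */
    const WCHAR key[] = L"SOFTWARE\\MyApp";
    DWORD status = ClusterResourceControl(hRes, NULL,
        CLUSCTL_RESOURCE_ADD_REGISTRY_CHECKPOINT,
        (LPVOID)key, sizeof(key), NULL, 0, NULL);
    printf("add checkpoint: %lu\n", status);

    CloseClusterResource(hRes);
    CloseCluster(hCluster);
    return 0;
}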
Log Manager
The following figure shows how Log Manager works with other components when quorum logging is enabled
(when a node is down).
When a node is down, quorum logging is enabled, which means Log Manager receives configuration
changes collected by other components (such as Database Manager) and logs the changes to the quorum
resource. The configuration changes logged on the quorum resource are then available if the entire cluster
goes down and must be formed again. On the first node coming online after the entire cluster goes down,
Log Manager works with Database Manager to make sure that the local copy of the configuration database
is updated with information from the quorum resource. This is also true in a cluster forming for the first
time — on the first node, Log Manager works with Database Manager to make sure that the local copy of
the configuration database is the same as the information from the quorum resource.
Event Log Replication Manager
Event Log Replication Manager, part of the Cluster service, works with the operating system’s Event Log
service to copy event log entries to all cluster nodes. These events are marked to show which node the
event occurred on.
The following figure shows how Event Log Replication Manager copies event log entries to other cluster
nodes.
How Event Log Entries Are Copied from One Node to Another
Several interfaces and protocols are used together to queue, send, and receive events at the nodes.
Events that are logged on one node are queued, consolidated, and sent through Event Log Replication
Manager, which broadcasts them to the other active nodes. If few events are logged over a period of time,
each event might be broadcast individually, but if many are logged in a short period of time, they are
batched together before broadcast. Events are labeled to show which node they occurred on. Each of the
other nodes receives the events and records them in the local log. Replication of events is not guaranteed
by Event Log Replication Manager — if a problem prevents an event from being copied, Event Log
Replication Manager does not obtain notification of the problem and does not copy the event again.
Backup and Restore Capabilities in Failover Manager
The Backup and Restore capabilities in Failover Manager coordinate with other Cluster service components
when a cluster node is backed up or restored, so that cluster configuration information from the quorum
resource, and not just information from the local node, is included in the backup. The following figure shows
how the Backup and Restore capabilities in Failover Manager work to ensure that important cluster
configuration information is captured during a backup.
Backup Request on a Node That Does Not Own the Quorum Resource
A number of DLLs that are used by core resource types are included with server clusters in
Windows Server 2003. The resource DLL defines and manages the resource. The extension DLL (where
applicable) defines the resource’s interaction with Cluster Administrator. The core resource types include
the following:
Network Name
Print Spooler
File Share
Generic Application
Generic Script
Generic Service
Local Quorum
The following table lists files that are in the cluster directory (systemroot\cluster, where systemroot is the
root directory of the server’s operating system).
File Description
Clcfgsrv.dll DLL file for Add Nodes Wizard and New Server Cluster Wizard
Clcfgsrv.inf Setup information file for Add Nodes Wizard and New Server Cluster Wizard
The following table lists files that are in systemroot\system32, systemroot\inf, or subfolders in
systemroot\system32.
File Location Description
clusocm.inf systemroot\inf Cluster INF file for the Optional Component Manager
clussprt.dll systemroot\system32 A DLL that enables the Cluster service on one node to send notice of
local cluster events to the Event Log service on other nodes
The following table lists files that have to do with the quorum resource and (for a single quorum device
cluster, the most common type of cluster) are usually in the directory q:\mscs, where q is the quorum disk
drive letter and mscs is the name of the directory.
File Description
Quolog.log The quorum log, which contains records of cluster actions that involve changes to
the cluster configuration database.
Chk*.tmp Copies of the cluster configuration database (also known as checkpoints). Only the
latest one is needed.
{GUID} Directory for each resource that requires checkpointing; the resource GUID is the
name of the directory.
With the Server Cluster application programming interface (API), developers can write applications and
resource DLLs for server clusters. The following table lists Server Cluster API subsets.
Cluster API Works directly with cluster objects and interacts with the Cluster
service.
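As a brief example of the Cluster API subset, the following C program enumerates the nodes of the local cluster with ClusterOpenEnum and ClusterEnum; it assumes it runs on a cluster node and links with clusapi.lib.

/* Example use of the Cluster API: list the nodes in the local cluster. */
#include <windows.h>
#include <clusapi.h>
#include <stdio.h>

int main(void)
{
    HCLUSTER hCluster = OpenCluster(NULL);
    if (hCluster == NULL) return 1;

    HCLUSENUM hEnum = ClusterOpenEnum(hCluster, CLUSTER_ENUM_NODE);
    if (hEnum != NULL) {
        WCHAR name[256];
        DWORD type, cch, i = 0;
        for (;;) {
            cch = 256;   /* buffer size in characters, reset each call */
            DWORD status = ClusterEnum(hEnum, i++, &type, name, &cch);
            if (status == ERROR_NO_MORE_ITEMS) break;
            if (status == ERROR_SUCCESS)
                wprintf(L"node: %s\n", name);
        }
        ClusterCloseEnum(hEnum);
    }
    CloseCluster(hCluster);
    return 0;
}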
Nodes must form, join, and leave a cluster in a coordinated way so that the following are always true:
Only one node owns the quorum resource at any given time.
All nodes maintain the same list of functioning nodes in the cluster.
All nodes can maintain consistent copies of the cluster configuration database.
Forming a Cluster
The first server that comes online in a cluster, either after installation or after the entire cluster has been
shut down for some reason, forms the cluster. To succeed at forming a cluster, a server must:
If a node attempts to form a cluster and is unable to read the quorum log, the Cluster service will not start,
because it cannot guarantee that it has the latest copy of the cluster configuration. In other words, the
quorum log ensures that when a cluster forms, it uses the same configuration it was using when it last
stopped.
7. The node updates its local copy of the cluster configuration database with any newer information
that might be stored on the quorum resource.
8. The node begins to bring resources and resource groups online.
Joining a Cluster
Leaving a Cluster
A node can leave a cluster when the node shuts down or when the Cluster service is stopped. When a node
leaves a cluster during a planned shutdown, it attempts to perform a smooth transfer of resource groups to
other nodes. The node leaving the cluster then initiates a regroup event.
Functioning nodes in a cluster can also force another node to leave the cluster if the node cannot perform
cluster operations, for example, if it fails to commit an update to the cluster configuration database.
When server clusters encounter changing circumstances and possible failures, the following processes help
the cluster to keep a consistent internal state and maintain availability of resources:
Heartbeats
Regroup events
Quorum arbitration
Heartbeats
Heartbeats are single User Datagram Protocol (UDP) packets exchanged between nodes once every 1.2
seconds to confirm that each node is still available. If a node is absent for five consecutive heartbeats, the
node that detected the absence initiates a regroup event to make sure that all nodes reach agreement on
the list of nodes that remain available.
Server cluster networks can be private (node-to-node communication only), public (client-to-node
communication), or mixed (both node-to-node and client-to-node communication). Heartbeats are
communicated across all networks; however, the monitoring of heartbeats and the way the cluster
interprets missed heartbeats depends on the type of network:
On private or mixed networks, which both carry node-to-node communication, heartbeats are
monitored to determine whether the node is functioning in the cluster.
A series of missed heartbeats can either mean that the node is offline or that all private and
mixed networks are offline; in either case, a node has lost its ability to function in the cluster.
On public networks, which carry only client-to-node communication, heartbeats are monitored
only to determine whether a node’s network adapter is functioning.
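The following C sketch is purely illustrative (it is not cluster code): it shows how a missed-heartbeat detector might apply the figures above, one heartbeat every 1.2 seconds with a limit of five misses, so a node is declared absent after roughly 6 seconds of silence. The peer names and timestamps are invented.

/* Illustrative sketch of missed-heartbeat detection: a peer that has been
   silent for more than 5 x 1.2 seconds is treated as absent. */
#include <stdio.h>
#include <stdbool.h>

#define HEARTBEAT_MS 1200
#define MISS_LIMIT   5

typedef struct {
    const char *name;
    long last_seen_ms;   /* timestamp of last heartbeat received */
} Peer;

/* Returns true if the peer should trigger a regroup event. */
static bool peer_absent(const Peer *p, long now_ms)
{
    return (now_ms - p->last_seen_ms) > (long)HEARTBEAT_MS * MISS_LIMIT;
}

int main(void)
{
    Peer peers[] = { { "node2", 0 }, { "node3", 4000 } };
    long now = 7000;     /* pretend 7 seconds have elapsed */

    for (int i = 0; i < 2; i++) {
        if (peer_absent(&peers[i], now))
            printf("%s missed %d heartbeats: initiate regroup\n",
                   peers[i].name, MISS_LIMIT);
        else
            printf("%s is alive\n", peers[i].name);
    }
    return 0;
}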
Regroup Events
If a node is absent for five consecutive heartbeats, a regroup event occurs. (Membership Manager,
described earlier, starts the regroup event.)
If an individual node remains unresponsive, the node is removed from the list of functional nodes. If the
unresponsive node was the owner of the quorum resource, the remaining nodes also begin the quorum
arbitration process.
Quorum Arbitration
Quorum arbitration is the process that occurs when the node that owned the quorum resource fails or is
unavailable, and the remaining nodes determine which node will take ownership. When a regroup event
occurs and the unresponsive node owned the quorum resource, another node is designated to initiate
quorum arbitration. A basic goal for quorum arbitration is to make sure that only one node owns the
quorum resource at any given time.
It is important that only one node owns the quorum resource because if all network communication
between two or more cluster nodes fails, it is possible for the cluster to split into two or more partial
clusters that will try to keep functioning (sometimes called a “split brain” scenario). Server clusters prevent
this by allowing only the partial cluster with a node that owns the quorum resource to continue as the
cluster. Any nodes that cannot communicate with the node that owns the quorum resource stop working as
cluster nodes.
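The following C sketch is an invented illustration of that rule: given a partition of nodes after total communication failure, only the partition that contains the quorum owner continues as the cluster.

/* Conceptual illustration: after a network partition, only the partition
   containing the quorum owner keeps running as the cluster. */
#include <stdio.h>
#include <stdbool.h>

static bool partition_survives(const int *nodes, int count, int quorumOwner)
{
    for (int i = 0; i < count; i++)
        if (nodes[i] == quorumOwner)
            return true;   /* this partition holds the quorum resource */
    return false;          /* these nodes stop working as cluster nodes */
}

int main(void)
{
    int partitionA[] = { 1, 2 };
    int partitionB[] = { 3, 4 };
    int quorumOwner  = 3;

    printf("partition A survives: %d\n", partition_survives(partitionA, 2, quorumOwner));
    printf("partition B survives: %d\n", partition_survives(partitionB, 2, quorumOwner));
    return 0;
}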
This section describes how clusters keep resource groups available by monitoring the health of resources
(polling), bringing resource groups online, and carrying out failover. Failover means transferring ownership
of the resources within a group from one node to another. This section also describes how resource groups
are taken offline as well as how they are failed back, that is, how resource groups are transferred back to a
preferred node after that node has come back online.
Transferring ownership can mean somewhat different things depending on which of the group’s resources is
being transferred. For an application or service, the application or service is stopped on one node and
started on another. For an external device, such as a Physical Disk resource, the right to access the device
is transferred. Similarly, the right to use an IP address or a network name can be transferred from one node
to another.
The administrator of the cluster initiates resource group moves, usually for maintenance or other
administrative tasks. Group moves initiated by an administrator are similar to failovers in that the Cluster
service initiates resource transitions by issuing commands to Resource Monitor through Failover Manager.
Resource Monitor conducts two kinds of polling on each resource that it monitors: Looks Alive (resource
appears to be online) and Is Alive (a more thorough check indicates the resource is online and functioning
properly).
If a Generic Application resource has a long startup time, you can lengthen the polling interval
to allow the resource to finish starting up. In other words, you might not require a custom
resource DLL to ensure that the resource is given the necessary startup time.
If you lengthen the polling intervals, you reduce the chance that polls will interfere with each
other (the chance for lock contention).
You can bypass Looks Alive polling by setting the interval to 0.
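For cluster-aware applications, these two checks map to the LooksAlive and IsAlive entry points that a resource DLL implements (declared in resapi.h). The following C sketch shows only the shape of those entry points; the health checks themselves are placeholders, since a real resource DLL would test its specific application and also implement the other entry points (Open, Online, Offline, and so on) registered through its function table.

/* Sketch of the two polling entry points a resource DLL implements.
   The checks are placeholders, not a working resource DLL. */
#include <windows.h>
#include <clusapi.h>
#include <resapi.h>

/* Quick, inexpensive check: does the resource appear to be online? */
BOOL WINAPI LooksAlive(RESID ResourceId)
{
    UNREFERENCED_PARAMETER(ResourceId);
    /* e.g., verify that the application's process is still running */
    return TRUE;   /* placeholder */
}

/* Thorough check: is the resource online and functioning properly? */
BOOL WINAPI IsAlive(RESID ResourceId)
{
    UNREFERENCED_PARAMETER(ResourceId);
    /* e.g., perform a real request against the application or service */
    return TRUE;   /* placeholder */
}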
The following sequence is used when Failover Manager and Resource Monitor bring a resource group online.
1. Failover Manager uses the dependency list (in the cluster configuration) to determine the appropriate
order in which to bring resources online.
2. Failover Manager works with Resource Monitor to begin bringing resources online. The first resource or
resources brought online are ones on which other resources depend.
3. Resource Monitor calls the Online entry point of the resource DLL and returns the result to Failover
Manager.
4. The sequence is repeated as Failover Manager brings the next resource online. Failover Manager
uses the dependency list to determine the correct order for bringing resources online.
After resources have been brought online, Failover Manager works with Resource Monitor to determine if
and when failover is necessary and to coordinate failover.
Failover Manager takes a resource group offline as part of the failover process or when an administrator
moves the group for maintenance purposes. The following sequence is used when Failover Manager takes a
resource group offline:
1. Failover Manager uses the dependency list (in the cluster configuration) to determine the appropriate
order in which to take resources offline.
2. Failover Manager works with Resource Monitor to begin taking resources offline. The first resource or
resources stopped are ones on which other resources do not depend.
3. Resource Monitor calls the Offline entry point of the resource DLL and returns the result to Failover
Manager.
4. The sequence is repeated as Failover Manager takes the next resource offline. Failover Manager
uses the dependency list to determine the correct order for taking resources offline.
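Both sequences amount to walking the resource dependency graph in topological order. The following self-contained C sketch, with an invented three-resource group, shows the bring-online direction: a resource starts only after everything it depends on is online.

/* Illustrative sketch of dependency-ordered startup: resources come online
   only after all of their dependencies are online (a topological order).
   The resources and dependencies are invented for illustration. */
#include <stdio.h>
#include <stdbool.h>

#define N 3
static const char *name[N] = { "Disk Q:", "IP Address", "MyApp" };
/* dep[a][b] == true means resource a depends on resource b. */
static bool dep[N][N] = {
    { false, false, false },   /* Disk Q: depends on nothing    */
    { false, false, false },   /* IP Address depends on nothing */
    { true,  true,  false },   /* MyApp depends on both         */
};

int main(void)
{
    bool online[N] = { false };
    int brought = 0;
    while (brought < N) {                 /* assumes no dependency cycles */
        for (int a = 0; a < N; a++) {
            if (online[a]) continue;
            bool ready = true;
            for (int b = 0; b < N; b++)
                if (dep[a][b] && !online[b]) ready = false;
            if (ready) {                  /* all dependencies are online */
                printf("bring online: %s\n", name[a]);
                online[a] = true;
                brought++;
            }
        }
    }
    return 0;
}

Taking the group offline simply reverses this order: MyApp stops first, then the resources it depended on.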
Group failover happens when the group or the node that owns the group fails. Individual resource failure
causes the group to fail if you configure the Affect the group property for the resource. The following
sequence begins when a resource fails:
1. Resource Monitor detects the failure, either through Looks Alive or Is Alive polling or through an
event signaled by the resource. Resource Monitor calls the IsAlive entry point of the resource DLL to
confirm that the resource has failed.
Failover that occurs when a node fails is different from failover that occurs when a resource fails. For the
purposes of clustering, a node is considered to have failed if it loses communication with other nodes.
As described in previous sections, if a node misses five heartbeats, this indicates that it has failed, and a
regroup event (and possibly quorum arbitration) occurs. After node failure, surviving nodes negotiate for
ownership of the various resource groups. On two-node clusters the result is obvious, but on clusters with
more than two nodes, Failover Manager on the surviving nodes determines group ownership based on the
following:
The nodes you have specified as possible owners of the affected resource groups.
The order in which you specified the nodes in the group’s Preferred Owners list.
Note
When setting up a preferred owners list for a resource group, we recommend that you list all
nodes in your server cluster and put them in priority order.
Failback is the process by which the Cluster service moves resource groups back to their preferred node
after the node has failed and come back online. You can configure both whether and when failback occurs.
By default, groups are not set to fail back.
The node to which the group will fail back initiates the failback. Failover Manager on that node contacts
Failover Manager on the node where the group is currently online and negotiates for ownership. The
instances of Failover Manager on the two nodes work together to smoothly transfer ownership of the
resource group back to the preferred node.
You can test failback configuration settings by following procedures in Help and Support Center.
Related Information
“Planning Server Deployments” in the Windows Server 2003 Deployment Kit on the Microsoft
Web site for more information about failover policies, choices for cluster storage, and ways that