Technical Overview of Clustering in Windows Server 2003

Technical Overview of Clustering in Windows Server 2003

Microsoft Corporation
Published: January 2003

This white paper summarizes the new clustering features available in Microsoft® Windows® Server 2003.

The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication.
Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.
Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.
© 2002. Microsoft Corporation. All rights reserved. Microsoft, Active Directory, Windows, and Windows NT are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.
The names of actual companies and products mentioned herein may be the trademarks of their respective owners.

Server Clusters 6
General 6
Larger Cluster Sizes 6
64-Bit Support 6
Terminal Server Application Mode 6
Majority Node Set (MNS) Clusters 7
Installation 7
Installed by Default 7
Pre-configuration Analysis 8
Default Values 8
Multi Node Addition 8
Extensible Architecture 8
Remote Administration 9
Command Line Tools 9
Simpler Uninstallation 9
Quorum Log Size 9
Local Quorum 9
Quorum Selection 10
Integration 10
Active Directory 10
Extend Cluster Shared Disk Partitions 11
Resources 11
Printer Configuration 11
MSDTC Configuration 12
Scripting 12
MSMQ Triggers 12
Network Enhancements 12
Enhanced Network Failover 12
Media Sense Detection 13
Multicast Heartbeat 13
Storage 13
Volume Mount Points 13
Client Side Caching (CSC) 14
Distributed File System 14
Encrypted File System 15
Storage Area Networks (SAN) 15
Operations 15
Backup and Restore 15
Enhanced Node Failover 15
Group Affinity Support 16
Node Eviction 16
Rolling Upgrades 16
Queued Changes 16
Disk Changes 17
Password Change 17
Resource Deletion 17
WMI Support 17
Supporting and Troubleshooting 18
Offline/Failure Reason Codes 18
Software Tracing 18
Cluster Logs 18
Event Log 19
Clusdiag 19
Chkdsk Log 19
Disk Corruption 19
Network Load Balancing 21
Network Load Balancing Manager 21
Virtual Clusters 21
Multi-NIC support 22
Bi-directional Affinity 22
Limiting switch flooding using IGMP support 23

Server Clusters

NOTE: Server clusters is a general term used to describe clusters based on the Microsoft® Cluster Service (MSCS), as opposed to clusters based on Network Load Balancing.
Larger Cluster Sizes
Microsoft Windows® Server 2003 Enterprise Edition now supports 8-node clusters (was two), and Windows Server 2003 Datacenter Edition now supports 8-node clusters (was four).
• Greater Flexibility – this provides much more flexibility in how applications can be deployed on a Server cluster. Applications that support multiple instances can run more instances across more nodes; multiple applications can be deployed on a single Server cluster with much more flexibility and control over the semantics if/when a node fails or is taken down for maintenance.
64-Bit Support
The 64-bit versions of Windows Server 2003 Enterprise Edition and Datacenter Edition support Cluster Service.
• Large Memory Needs – Microsoft SQL Server™ 2000 Enterprise Edition (64-bit) is one example of an application that can make use of the increased memory space of 64-bit Windows Server 2003 (up to 4TB – Windows 2000 Datacenter only supports up to 64GB), while at the same time taking advantage of clustering. This provides an incredibly powerful platform for the most computer intensive applications, while ensuring high availability of those applications.
NOTE: GUID Partition Table (GPT) disks, a new disk architecture in Windows Server 2003 that supports up to 18 exabyte disks, is not supported with Server clusters.
Terminal Server Application Mode
Terminal Server can run in application mode on nodes in a Server cluster. NOTE: There is no failover of Terminal Server sessions.
• High Availability – Terminal Server directory service can be made highly available through failover.
Majority Node Set (MNS) Clusters
Windows Server 2003 has an optional quorum resource that does not require a disk on a shared bus for the quorum device. This feature is designed to be built in to larger end-to-end solutions by OEMs, IHVs and other software vendors rather than be deployed by end-users specifically, although this is possible for experienced users. The scenarios targeted by this new feature include:
• Geographically dispersed clusters. This mechanism provides a single, Microsoft-supplied quorum resource that is independent of any storage solution for a geographically dispersed or multi-site cluster. NOTE: There is a separate cluster Hardware Compatibility List (HCL) for geographic clusters.
• Low-cost or appliance-like highly available solutions that have no shared disks but use other techniques such as log shipping or software disk or file system replication and mirroring to make data available on multiple nodes in the cluster.
NOTE: Windows Server 2003 provides no mechanism to mirror or replicate user data across the nodes of an MNS cluster, so while it is possible to build clusters with no shared disks at all, it is an application specific issue to make the application data highly available and redundant across machines.
• Storage Abstraction – frees up the storage subsystem to manage data replication between multiple sites in the most appropriate way, without having to worry about a shared quorum disk, and at the same time still supporting the idea of a single virtual cluster.
• No Shared Disks – there are some scenarios that require tightly consistent cluster features, yet do not require shared disks. For example, a) clusters where the application keeps data consistent between nodes (e.g. database log shipping and file replication for relatively static data), and b) clusters that host applications that have no persistent data, but need to cooperate in a tightly coupled way to provide consistent volatile state.
• Enhanced Redundancy – if the shared quorum disk is corrupted in any way, the entire cluster goes offline. With Majority Node Sets, the corruption of quorum on one node does not bring the entire cluster offline.
Installed by Default
Clustering is installed by default. You only need to configure a Cluster by launching Cluster Administrator or script the configuration with Cluster.exe. In addition, third-party quorum resources can be pre-installed and then selected during Server cluster configuration, rather than having additional resource specific procedures. All Server cluster configurations can be deployed the same way.
• Easier Administration – you no longer need to provide a media CD to install Server clusters.
• No reboot – you no longer need to reboot after you install or uninstall Cluster Service.
Pre-configuration Analysis
Analyzes and verifies hardware and software configuration and identifies potential problems. Provides a comprehensive and easy-to-read report on any potential configuration issues before the Server cluster is created.
• Compatability – Ensures that any known incompatibilities are detected prior to configuration. For example, Service for Macintosh (SFM), Network Load Balancing (NLB), dynamic disks, and DHCP issued addresses are not supported with Cluster Service.
Default Values
Creates a Server cluster that conforms to best practices using default values and heuristics. Many times for newly created Server clusters, the default values are the most appropriate configuration.
• Easier Administration – Server cluster creation asks fewer setup questions, data is collected and the code makes decisions about the configuration. The goal is to get a “default” Server cluster up and running that can then be customized using the Server cluster administration tools if required.
Multi Node Addition
Allows multiple nodes to be added to a Server cluster in a single operation.
• Easier Administration – makes it quicker and easier to create multi-node Server clusters.
Extensible Architecture
Extensible architecture allows applications and system components to take part in Server cluster configuration. For example, applications can be installed prior to a server being server clustered and the application can participate in (or even block) this node joining the Server cluster.
• Third-Party Support – allows applications to setup Server cluster resources and/or change their configuration as part of Server cluster installations rather than as a separate post-Server cluster installation task.
Remote Administration
Allows full remote creation and configuration of the Server cluster. New Server clusters can be created and nodes can be added to an existing Server cluster from a remote management station. In addition, drive letter changes and physical disk resource fail-over are updated to Terminal Server client’s sessions.
• Easier Administration – allows for better remote administration via Terminal Services.
Command Line Tools
Server cluster creation and configuration can be scripted through the cluster.exe command line tool.
• Easier Administration – much easier to automate the process of creating a cluster.
Simpler Uninstallation
Uninstalling Cluster Service from a node is now a one step process of evicting the node. Previous versions required eviction followed by uninstallation.
• Easier Administration – Uninstalling the Cluster Service is much more efficient as you only need to evict the node through Cluster Administrator or Cluster.exe and the node is unconfigured for Cluster support. There is also a new switch for Cluster.exe which will force the uninstall if there is a problem with getting into Cluster Administrator:

cluster node %NODENAME% /force
Quorum Log Size
The default size of the quorum log has been increased to 4096 KB (was 64 KB).
• Large number of shares – a quorum log of 4,096 KB allows for large numbers of file or printer shares (e.g. 200 printer shares). In previous versions, the quorum log would run out of space with this many shares, causing inconsistent failover of resources.
Local Quorum
If a node is not attached to a shared disk, it will automatically configure a “Local Quorum” resource. It is also possible to create a local quorum resource once Cluster Service is running.
• Test Cluster – This makes it very easy for users to create a test cluster on their local PC for testing out cluster applications, or for getting familiar with the Cluster Service. Users do not need special cluster hardware that has been certified on the Microsoft Cluster HCL to run a test cluster.
Note: Local quorum is only supported for one node clusters (i.e. lonewolf). In addition, the use of hardware that has not been certified on the HCL is not supported for production environments.
• Recovery – in the event you lose all of your shared disks, one option for getting a temporary cluster working (e.g. while you wait for new hardware) is to use the cluster.exe /fixquorum switch to start the cluster, then create a local quorum resource and set this as your quorum. In the case of a print cluster, you can point the spool folder to the local disk. In the case of a file share, you can point the file share resource to the local disk, where backup data has been restored. Obviously, this does not provide any failover, and would only be seen as a temporary measure.
Quorum Selection
You no longer need to select which disk is going to be used as the Quorum Resource. It is automatically configured on the smallest disk that is larger then 50 MB and formatted NTFS.
• Easier Administration – the end user no longer has to worry about which disk to use for the quorum. NOTE: The option to move the Quorum Resource to another disk is available during setup or after the Cluster has been configured.

Active Directory
Cluster Service now has much tighter integration with Active Directory™ (AD), including a “virtual” computer object, Kerberos authentication, and a default location for services to publish service control points (e.g. MSMQ).
• Virtual Server – by publishing a cluster virtual server as a computer object in the Active Directory, users can access the virtual server just like any other Windows 2000 server. In particular, it removes the need for NetBIOS to browse and administrator the cluster nodes, allowing clients to locate cluster objects via DNS, the default name resolution service for Windows Server 2003. NOTE: Although the network name Server cluster resource publishes a computer object in Active Directory, that computer object should NOT be used for administrative tasks such as applying group policy. The ONLY roles for the virtual server computer object in Windows Server 2003 are:
o To allow Kerberos authentication to services hosted in a virtual server, and
o For cluster-aware and Active Directory-aware services (such as MSMQ) to publish service provider information specific to the virtual server they are hosted in.
• Kerberos Authentication – this form of authentication allows users to be authenticated against a server without ever having to send their password. Instead, they present a ticket that grants them access to the server. This contrasts to NTLM authentication, used by Windows 2000 Cluster Service, which sends the user’s password as a hash over the network. In addition, Kerberos supports mutual authentication of client and server, and allows delegation of authentication across multiple machines. NOTE: In order to have Kerberos authentication for the virtual server in a mixed mode cluster (i.e. Windows 2000 & Windows Server 2003), you must be running Windows 2000 Advanced Server SP3 or higher. Otherwise NTLM will be used for all authentications.
• Publish Services – now that Cluster Service is Active Directory-aware, it can integrate with other services that publish information about their service in AD. For example, Microsoft Message Queuing (MSMQ) 2.0 can publish information about public queues in AD, so that users can easily find their nearest queue,. Windows Server 2003 now extends this to allow clustered public queue information to be published in AD.
NOTE: Cluster integration does not make any changes to the AD schema.
Extend Cluster Shared Disk Partitions
If the underlying storage hardware supports dynamic expansions of a disk unit, or LUN, then the disk volume can be extended online using the DISKPART.EXE utility.
Easier Administration – Existing volumes can be expanded online without taking down applications or services.
Printer Configuration
Cluster Service now provides a much simpler configuration process for setting up clustered printers.
• Easier Administration – To set up a clustered print server, you need to configure only the Spooler resource in Cluster Administrator and then connect to the virtual server to configure the ports and print queues. This is an improvement over previous versions of Cluster Service in which you had to repeat the configuration steps on each node in the cluster, including installing printer drivers.
MSDTC Configuration
The Microsoft Distributed Transaction Coordinator (MSDTC) can now be configured once, and then be replicated to all nodes.
• Easier Administration – in previous versions, the COMCLUST.EXE utility had to be run on each node in order to cluster the MSDTC. It is now possible to configure MSDTC as a resource type, assign it to a resource group, and have it automatically configured on all cluster nodes.
Existing applications can be made Server cluster-aware using scripting (VBScript and Jscript) rather than writing resource dlls in C or C++.
• Easier Development – makes it much simpler to write specific resource plug-ins for applications so they can be monitored and controlled in a Server cluster. Supports resource specific properties, allowing a resource script to store Server cluster-wide configurations that can be used and managed in the same way as any other resource.
MSMQ Triggers
Cluster Service has enhanced the MSMQ resource type to allow multiple instances on the same cluster.
• Enhanced Functionality – allows you to have multiple clustered message queues running at the same time, providing increased performance (in the case of Active/Active MSMQ clusters) and flexibility.
NOTE: You can only have one MSMQ resource per Cluster Group
Network Enhancements
Enhanced Network Failover
Cluster Service now supports enhanced logic for failover when there has been a complete loss of internal (heartbeat) communication. The network state for public communication of all nodes is now taken into account.
• Better Failover – in Windows 2000, if Node A owned the quorum disk and lost all network interfaces (i.e. public and heartbeat), it would retain control of the cluster, despite the fact that no one could communicate with it, and that another node may have had a working public interface. Windows Server 2003 cluster nodes now take the state of their public interfaces into account prior to arbitrating for control of the cluster.
Media Sense Detection
When using Cluster Service, if network connectivity is lost, the TCP/IP stack does not get unloaded by default, as it did in Windows 2000. There is no longer the need to set the DisableDHCPMediaSense registry key.
• Better Failover – in Windows 2000, if network connectivity is lost, the TCP/IP stack was unloaded, which meant that all resources that depended on IP addresses were taken offline. Also, when the networks came back online, their network role reverted to the default setting (i.e. client and private). By disabling Media Sense by default, it means the network role is preserved, as well as keeping all IP address dependant resources online.
Multicast Heartbeat
Allows multi-cast heartbeats between nodes in a Server cluster. Multi-cast heartbeat is automatically selected if the cluster is large enough and the network infrastructure can support multi-cast between the cluster nodes. Although the multi-cast parameters can be controlled manually, a typical configuration requires no administration tasks or tuning to enable this feature. If multicast communication fails for any reason, the internal communications will revert to unicast. All internal communications are signed and secure.
• Reduced Network Traffic – by using multicast, it reduces the amount of traffic in a cluster subnet, which can be particularly beneficial in clusters of more than two nodes, or geographically dispersed clusters.

Volume Mount Points
Volume mount points are now supported on shared disks (excluding the quorum), and will work properly on failover if configured correctly.
• Flexible Filesystem Namespace – volume mount points (Windows 2000 or later) are directories that point to specified disk volumes in a persistent manner (e.g. you can configure C:\Data to point to a disk volume). They bypass the need to associate each disk volume with a drive letter, thereby surpassing the 26 drive letter limitation (e.g. without volume mount points, you would have to create a G: drive to map the “Data” volume to). Now that Cluster Service supports volume mount points, you have much greater flexibility in how you map your shared disk namespace.
NOTE: The directory that hosts the volume mount point must be NTFS since the underlying mechanism uses NTFS reparse points. However the file system that is being mounted can be FAT, FAT32, NTFS, CDFS, or UDFS.
Client Side Caching (CSC)
Client Side Caching (CSC) is now supported for clustered file shares.
• Offline File Access –Client Side Caching for clustered file shares allows a client to cache data stored on a clustered share. The client works on a local copy of the data that is uploaded back to the Server cluster when the file is closed. This allows the failure of a server in the Server cluster and the subsequent failover of the file share service to be hidden from the client.
Distributed File System
Distributed File System (DFS) has had a number of improvements, including multiple stand-alone roots, independent root failover, and support for Active/Active configurations.
Distributed File System (DFS) allows multiple file shares on different machines to be aggregated into a common namespace (e.g. \\dfsroot\share1 and \\dfsroot\share2 are actually aggregated from \\server1\share1 and \\server2\share2). New clustering benefits include:
• Multiple Stand-Alone Roots – previous versions only supported one clustered stand-alone root. You can now have multiple ones, giving you much greater flexibility in planning your distributed file system namespace (e.g. multiple DFS roots on the same virtual server, or multiple DFS roots on different virtual servers).
• Independent Failover – granular failover control is available for each DFS root, allowing you to configure failover settings on an individual basis and resulting in faster failover times.
• Active/Active Configurations – you can now have multiple stand-alone roots running actively on multiple nodes.
Encrypted File System
With Windows Server 2003, the encrypting file system (EFS) is supported on clustered file shares. This allows data to be stored in encrypted format on clustered disks.
Storage Area Networks (SAN)
Clustering has been optimized for SANs, including targeted device resets and the shared storage buses.
• Targeted Bus Resets – the Server cluster software now issues a special control code when releasing disk drives during arbitration. This can be used in conjunction with HBA drivers that support the extended Windows Server 2003 feature set to selectively reset devices on the SAN rather than full bus reset. This ensures that the Server cluster has much lower impact on the SAN fabric.
• Shared Storage Bus – shared disks can be located on the same storage bus as the Boot, Pagefile and dump file disks. This allows a clustered server to have a single storage bus (or a single redundant storage bus). NOTE: This feature is disabled by default due to the configuration restrictions. This feature can/should only be enabled by OEMs and IHVs for specific and qualified solutions. This is NOT a general purpose feature exposed to end users.

Backup and Restore
You can actively restore the local cluster nodes cluster configuration or you can restore the cluster information to all nodes in the Cluster. A node restoration is also built into Automatic System Recovery (ASR).
• Backup and Restore – Backup (NTBackup.exe) in Windows Server 2003 has been enhanced to enable seamless backups and restores of the local Cluster database, and to be able to restore the configuration locally and to all nodes in a Cluster.
• Automated System Recovery – ASR can completely restore a cluster in a variety of scenarios, including: a) damaged or missing system files, b) complete OS reinstallation due to hardware failure, c) a damaged Cluster database, and d) changed disk signatures (including shared).
Enhanced Node Failover
Cluster Service now includes enhanced logic for node failover when you have a cluster with three or more nodes. This includes doing a manual “Move Group” operation in Cluster Administrator.
• Better Failover – during failover in a cluster with three or more nodes, the Cluster Service will take into account the “Preferred Owner List” for each resource, as well as the installation order for each node, in order to work out which node the group should be moved to.
Group Affinity Support
Allows an application to describe itself as an N+I application. In other words, the application is running actively on N nodes of the Server cluster and there are I “spare” nodes available if an active node fails. In the event of failure, the failover manager will try to ensure that the application is failed over to a spare node rather than a node that is currently running the application.
• Better performance – applications are failed over to spare nodes before active nodes.
Node Eviction
Evicting a node from a Server cluster no longer requires a reboot to clean up the Server cluster state. A node can be moved from one Server cluster to another without having to reboot. In the even of a catastrophic failure, the Server cluster configuration can be force cleaned regardless of the Server cluster state.
• Increased Availability – not having to reboot increases the uptime of the system.
• Disaster Recovery – in the event of a node failure, the cluster can be cleaned up easily.
Rolling Upgrades
Rolling upgrades are supported from Windows 2000 to Windows Server 2003.
• Minimum downtime – rolling upgrades allow one node in a cluster to be taken offline for upgrading, while other nodes in the cluster continue to function on an older version. NOTE: There is no support for rolling upgrades from a Microsoft Windows NT 4.0 cluster to a Windows Server 2003 cluster. An upgrade from Windows NT 4.0 is supported but the cluster will have to be taken offline during the upgrade.
Queued Changes
The cluster service will now queue up changes that need to be completed if a node is offline.
• Easier Administration – ensures that you do not need to apply a change twice if a node is offline. For example, if a node is offline and is evicted from the Cluster by a remaining node, the cluster service will be uninstalled the next time the first node attempts to join the Cluster. This also holds true for applications.
Disk Changes
The Cluster Service more efficiently adjusts to shared disk changes in regards to size changes and drive letter assignments.
• Dynamic Disk Size – If you increase the size of a shared disk, the Cluster Service will now dynamically adjust to it. This is particularly helpful for SANs, where volume sizes can change easily. It does this by working directly with Volume Mount Manager, and no longer directly uses the DISKINFO or DISK keys. NOTE: These keys are maintained for backwards compatibility with previous versions of the Cluster Service.
Password Change
Cluster Service account password changes no longer require any downtime of the cluster nodes. In addition, passwords can be reset on multiple clusters at the same time.
• Reduced Downtime – In Windows Server 2003, you can change the Cluster Service account password on the domain as well as on each local node, without having to take the cluster offline. If multiple clusters use the same Cluster service account, you can change them simultaneously. In Microsoft Windows NT 4.0 and Microsoft Windows 2000, to change the Cluster service account password, you have to stop the Cluster service on all nodes before you can make the password change.
Resource Deletion
Resources can be deleted in Cluster Administrator or with Cluster.exe without taking them offline first.
• Easier Administration – in previous versions, you first had to take a resource offline before you could delete it. Now, Cluster Service will take them offline automatically, and then delete them.
WMI Support
Server clusters provides WMI support for:
• Cluster control and management functions including starting and stopping resources, creating new resource and dependencies etc.
• Application and cluster state information. WMI can be used to query whether applications are online, whether cluster nodes are up and running as well as a host of other status information.
• Cluster state change events are propagated via WMI to allow applications to subscribe to WMI events that show when an application has failed, when an application is restarted, when a node fails etc.
• Better Management – allows Server clusters to be managed as part of an overall WMI environment.

Supporting and Troubleshooting
Offline/Failure Reason Codes
These provide additional information to the resource as to why the application was taken offline, or failed.
• Better Troubleshooting – allows the application to have different semantics if the application has failed or some dependency of the application has failed, versus the administrator specifically moved the group to another node in the Server cluster.
Software Tracing
Cluster Service now has a feature called software tracing that will produce more information to help with troubleshooting Cluster issues.
• Better Troubleshooting – this is a new method for debugging that will allow Microsoft to debug the Cluster Service without loading checked build versions of the dll’s (symbols).
Cluster Logs
A number of improvements have been made to the Cluster Service log files, including a setup log, error levels (info, warn, err), local server time entry, and GUID to resource name mapping.
• Setup Log – during configuration of Cluster Service, a separate setup log (%SystemRoot%\system32\Logfiles\Cluster\ClCfgSrv.log) is created to assist in troubleshooting.
• Error Levels – this makes it easy to be able to highlight just the entries that require action (e.g. err).
• Local Server Time Stamp – this will assist in comparing event log entries to Cluster logs.
• GUID to Resource Name Mapping – this assists in understanding the cluster log references to GUIDs. A Cluster object file (%windir%\Cluster\Cluster.obj) is automatically created and maintained that contains a mapping of GUID’s to Resource Name mappings.
Event Log
Additional events are written to the event log indicating not only error cases, but showing when resources are successfully failed over from one node to another. Benefits
• Better Monitoring – this allows event log parsing and management tools to be used to track successful failovers rather than just catastrophic failures.
A new tool called clusdiag is available in the Windows Server 2003 Resource Kit.
• Better Troubleshooting – makes reading and correlating cluster logs across multiple cluster nodes and debugging of cluster issues more straight forward.
• Validation and Testing – Clusdiag allows users to run stress tests on the server, storage and clustering infrastructure. As such, it can be used as a validation and test tool before a cluster can be put into production
Chkdsk Log
The cluster service creates a chkdsk log whenever chkdsk is run on a shared disk.
• Better Monitoring – this allows a system administrator to find out and react to any issues that were discovered during the chkdsk process.
Disk Corruption
When Disk Corruption is suspect, the Cluster Service reports the results of CHKDSK in event logs and creates a log in %sytemroot%\cluster.
• Better Troubleshooting – results are logged in the Application and Cluster.log. In addition, the Cluster.log references a log file (e.g. %windir%\CLUSTER\CHKDSK_DISK2_SIGE9443789.LOG) in which detailed CHKDSK output is recorded.
Network Load Balancing
Network Load Balancing Manager
In Windows 2000, to create an NLB Cluster, users had to separately configure each machine in the cluster. Not only was this unnecessary additional work, but it also opened up the possibility for unintended user error because identical Cluster Parameters and Port Rules had to be configured on each machine. A new utility in Windows Server 2003 called the NLB Manager helps solve some of these problems by providing single point of configuration and management of NLB clusters. Some key features of the NLB Manager:
• Creating new NLB clusters and automatically propagating Cluster Parameters and Port Rules to all hosts in the cluster and propagating Host Parameters to specific hosts in the cluster.
• Adding and removing hosts to and from NLB clusters.
• Automatically adding Cluster IP Addresses to TCP/IP.
• Managing existing clusters simply by connecting to them or by loading their host information from a file and then saving this information to a file for later use.
• Configuring NLB to load balance multiple web sites or applications on the same NLB Cluster, including adding all Cluster IP Addresses to TCP/IP and controlling traffic sent to specific applications on specific hosts in the cluster. [See the Virtual Clusters feature below]
• Diagnosing mis-configured clusters.
Virtual Clusters
In Windows 2000, users could load balance multiple web sites or applications on the same NLB Cluster simply by adding the IP Addresses corresponding to these web sites or applications to TCP/IP on each host in the cluster. This is because NLB, on each host, load balanced all IP Addresses in TCP/IP, except the Dedicated IP Address. The shortcomings of this feature in Windows 2000 were:
• Port Rules specified for the cluster were automatically applied to all web sites or applications load balanced by the cluster.
• All the hosts in the cluster had to handle traffic for all the web sites/ applications hosted on them.
• To block out traffic for a specific application on a specific host, traffic for all applications on that host had to be blocked.
A new feature in Windows Server 2003 called Virtual Clusters overcomes the above deficiencies by providing per-IP Port Rules capability. This allows the user to:
• Configure different Port Rules for different Cluster IP Addresses, where each Cluster IP Address corresponds to a web site or application being hosted on the NLB Cluster. This is in contrast to NLB in Windows 2000, where Port Rules were applicable to an entire host and not to specific IP Addresses on that host.
• Filter out traffic sent to a specific website/application on a specific host in the cluster. This allows individual applications on hosts to be taken offline for upgrades, restarts, etc. without affecting other applications being load balanced on the rest of the NLB cluster.
• Pick and choose which host in the cluster should service traffic sent to which website or application being hosted on the cluster. This way, not all hosts in the cluster need to handle traffic for all applications being hosted on that cluster.
Multi-NIC support
Windows 2000 allowed the user to bind NLB to only one network card in the system. Windows Server 2003 allows the user to bind NLB to multiple network cards, thus removing the limitation.
This now enables users to:
• Host multiple NLB clusters on the same hosts while leaving them on entirely independent networks. This can be achieved by binding NLB to different network cards in the same system.
• Use NLB for Firewall and Proxy load balancing in scenarios where load balancing is required on multiple fronts of a proxy or firewall.
Bi-directional Affinity
The addition of the Multi-NIC support feature enabled several other scenarios where there was a need for load balancing on multiple fronts of an NLB Cluster. The most common usage of this feature will be to cluster ISA servers for Proxy and Firewall load balancing. The two most common scenarios where NLB will be used together with ISA are:
• Web Publishing
• Server Publishing
In the Web Publishing scenario, the ISA cluster typically resides between the outside internet and the front-end web servers. In this scenario, the ISA servers will have NLB bound only to the external interface, therefore, there will be no need to use the Bi-directional Affinity feature.

However, in the Server Publishing scenario, the ISA cluster will reside between the Web Servers in the front, and the Published Servers in the back. Here, NLB will have to be bound to both the external interface [facing the Web servers] and the internal interface [facing the Published Servers] of each ISA server in the cluster. This increases the level of complexity because now when connections from the Web Servers are being load balanced on the external interface of the ISA Cluster and then forwarded by one of the ISA servers to a Published Server, NLB has to ensure that the response from the Published Server is always routed to the same ISA server that handled the corresponding request from the Web Server because this is the only ISA server in the cluster that has the security context for that particular session. So, NLB has to make sure that the response from Published Server does not get load balanced on the internal interface of the ISA Cluster since this interface is also clustered using NLB.
This task is accomplished by the new feature in Windows Server 2003 called Bi-directional Affinity. Bi-directional affinity makes multiple instances of NLB on the same host work in tandem to ensure that responses from Published Servers are routed through the appropriate ISA servers in the cluster.
Limiting switch flooding using IGMP support
The NLB algorithm requires every host in the NLB Cluster to see every incoming packet destined for the cluster. NLB accomplishes this by never allowing the switch to associate the cluster’s MAC address with a specific port on the switch. However, the unintended side effect of this requirement is that the switch ends up flooding all of its ports with all incoming packets meant for the NLB cluster. This can certainly be a nuisance and a waste of network resources. In order to arrest this problem, a new feature called IGMP support has been introduced in Windows Server 2003. This feature helps limit the flooding to only those ports on the switch that have NLB machines connected to them. This way, non-NLB machines do not see traffic that was intended only for the NLB Cluster, while at the same time, all of the NLB machines see traffic that was meant for the cluster, thus satisfying the requirements of the algorithm. It should, however, be noted that IGMP support can only be enabled when NLB is configured in multicast mode. Multicast mode has its own drawbacks which are discussed extensively in KB articles available on The user should be aware of these shortcomings of multicast mode before deploying IGMP support. Switch flooding can also limited when using unicast mode by creating VLANs in the switch and putting the NLB cluster on its own VLAN. Unicast mode does not have the same drawbacks as Multicast mode, and so limiting switch flooding using this approach may be preferable.


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s