UPDATE 2: VMware vSphere 7.0 now supports shared (clustered) VMDKs! A VMDK can be used as a shared disk resource for a WSFC deployed on VMs hosted on different ESXi hosts (CAB). Check this document and the guide for more details.
In VMware virtualization environments (as a rule, in clustering scenarios) it may be necessary to share the same disk (VMDK or RDM) between two or more virtual machines (VMs) on VMware ESXi. The best way is to use a VMDK physically located on shared storage or locally on the ESXi host; some specific configuration is required before the data can be accessed.
Step 6: Shut the machine down to modify the SCSI controller options. Use LSI Logic Parallel for Windows Server 2003 SP1 and SP2, and LSI Logic SAS for Windows Server 2008 SP2 and above. Change the SCSI Bus Sharing to Virtual and click OK.
Step 7: Add the same disk to the second server. Add a hard disk again, but on "Select a disk" choose the.
UPDATE 1: Starting with VMware vSAN 6.7 U3, vSAN provides native support for a clustered disk resource for WSFC! Check this article for more information on how to configure vSAN for a shared disk.
There’s a lot of conflicting material on the Internet describing how to configure a Windows Server Failover Cluster (WSFC) on the VMware vSphere platform. In this blog post you will learn about the VMware supported and recommended configuration options when implementing a WSFC (previously known as Microsoft Cluster Service, or MSCS) with disk resources shared across the nodes of a cluster. One example of an application leveraging a WSFC with clustered disks is Microsoft SQL Server configured as an Always On Failover Cluster Instance (FCI).
Note: Microsoft SQL Server Always On Availability Groups do not require clustered disks between VMs to host a database, and therefore no special disk configuration is needed on the vSphere side.
The information provided is applicable to VMware vSphere versions 6.x and 7.x in configurations where the VMs hosting the nodes of a WSFC cluster are located on different ESXi hosts, known as “Cluster-across-box (CAB)” in the official VMware documentation. CAB provides high availability (HA) from the perspective of both the in-guest operating system (OS) and the vSphere environment. We do not recommend a configuration where all VMs hosting nodes of a cluster are placed on a single ESXi host (so-called “cluster-in-a-box”, or CIB). The CIB solution should not be used for any production implementation: if the single ESXi host fails, all cluster nodes are powered off and, as a result, the application experiences downtime.
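To make the CAB/CIB distinction concrete, here is a minimal sketch that classifies a placement of cluster-node VMs across hosts. The function name, the dictionary layout, and the VM and host names are illustrative assumptions, not part of any VMware API; the "mixed" label for clusters where only some nodes share a host is also our own shorthand.

```python
def classify_placement(vm_to_host):
    """Classify a WSFC node placement as 'CAB', 'CIB', or 'mixed'.

    vm_to_host maps a VM name to the ESXi host it runs on
    (hypothetical input format, e.g. built from an inventory export).
    """
    if len(vm_to_host) < 2:
        raise ValueError("a cluster needs at least two node VMs")
    hosts = list(vm_to_host.values())
    distinct = set(hosts)
    if len(distinct) == len(hosts):
        return "CAB"    # every node VM runs on its own host
    if len(distinct) == 1:
        return "CIB"    # all node VMs run on one host
    return "mixed"      # some, but not all, nodes share a host


# Hypothetical VM and host names, for illustration only.
print(classify_placement({"sql-node1": "esxi-01", "sql-node2": "esxi-02"}))
print(classify_placement({"sql-node1": "esxi-01", "sql-node2": "esxi-01"}))
```

Anything other than "CAB" means that a single host failure takes down more than one cluster node at once.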
We will concentrate on the clustered disk option provided by a Raw Device Mapping (RDM) and will discuss configurations involving VMware Virtual Volumes (VVol) in a separate blog post.
Recommended WSFC Cluster Configurations
The recommended and supported vSphere configuration options for a CAB deployment of a WSFC are shown in the table below:
|vSphere version|Shared disk options|SCSI bus sharing|vSCSI Controller type|vMotion supported|
|---|---|---|---|---|
|vSphere 5.x|RDM physical mode|physical|LSI SAS|NO|
|vSphere 6.0 and 6.5|RDM physical mode|physical|PVSCSI|YES|
|vSphere 6.7|RDM physical mode, vVol|physical|PVSCSI|YES|
|vSphere 7.0|Clustered VMDK, RDM physical mode, vVol|physical|PVSCSI|YES|
|VMware Cloud on AWS|Clustered VMDK|physical|PVSCSI|YES|
|vSAN (vSphere 6.7 U3)|Clustered VMDK|physical|PVSCSI|YES|
NOTE: We are replacing references to “shared disk” with “clustered disk” to reduce confusion and provide more clarity.
We recommend attaching a disk to be used as a shared resource to one or more separate vSCSI controllers; consider using the PVSCSI Controller Type with vSphere 6.x and 7.0. A SCSI controller used for a clustered disk must be configured with SCSI Bus Sharing set to physical. Physical Bus Sharing is required because the Microsoft OS uses SCSI-3 Persistent Reservations to arbitrate access to a disk shared between nodes.
Block storage must be used to provision a clustered disk for WSFC. We support only Fibre Channel, iSCSI or Fibre Channel over Ethernet (FCoE) as the storage access protocols; file-based storage systems with NFS are not supported. A non-formatted LUN should be made available to the vSphere environment and then assigned as an RDM or vVol device to a VM that is a node of the WSFC. SCSI commands must be passed directly to the LUN to satisfy the requirements of SCSI-3 Persistent Reservations, which justifies the use of RDMs in physical compatibility mode. All disks should use the same SCSI IDs across all VMs hosting nodes of the WSFC. The figure below depicts the recommended VM-level configuration of a disk resource and vController (using the vSphere HTML5 Client).
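The disk and controller rules above can be expressed as a small consistency check. This is a minimal sketch for the physical-mode RDM case only; the `ClusteredDisk` record, its field names, and the sample values are hypothetical and would have to be filled from your own inventory data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusteredDisk:
    lun_id: str        # identifier of the backing LUN (hypothetical field)
    controller: int    # vSCSI controller number, e.g. 1 for SCSI(1:x)
    unit: int          # unit number on that controller
    bus_sharing: str   # SCSI Bus Sharing mode of the controller
    rdm_mode: str      # RDM compatibility mode


def check_clustered_disks(nodes):
    """Return a list of problems; nodes maps VM name -> list of ClusteredDisk."""
    problems = []
    first_seen = {}  # lun_id -> (VM name, (controller, unit)) of first sighting
    for vm, disks in nodes.items():
        for d in disks:
            if d.bus_sharing != "physical":
                problems.append(
                    f"{vm}: SCSI Bus Sharing is {d.bus_sharing!r}, expected 'physical'")
            if d.rdm_mode != "physical":
                problems.append(
                    f"{vm}: RDM mode is {d.rdm_mode!r}, expected 'physical'")
            slot = (d.controller, d.unit)
            ref_vm, ref_slot = first_seen.setdefault(d.lun_id, (vm, slot))
            if ref_slot != slot:
                problems.append(
                    f"{vm}: LUN {d.lun_id} is at SCSI({slot[0]}:{slot[1]}) "
                    f"but at SCSI({ref_slot[0]}:{ref_slot[1]}) on {ref_vm}")
    return problems


# Hypothetical two-node cluster: node2 uses the wrong sharing mode and SCSI ID.
nodes = {
    "sql-node1": [ClusteredDisk("lun-01", 1, 0, "physical", "physical")],
    "sql-node2": [ClusteredDisk("lun-01", 1, 1, "virtual", "physical")],
}
for problem in check_clustered_disks(nodes):
    print(problem)
```

An empty result means every clustered disk sits on a controller with physical bus sharing, is mapped in physical compatibility mode, and uses the same SCSI ID on every node.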
Note: vSphere 7.0 provides support for clustered VMDK. Check this document and the guide for more details.
Live vMotion (either user- or DRS-initiated) of VMs hosting nodes of WSFC is supported starting with vSphere 6.0 and, among other requirements, needs the VM Compatibility (vHardware) to be at least vSphere 6.0 (version 11). Check the information below for more details.
Note: You might see references on the Internet suggesting using the multi-writer feature in conjunction with WSFC. Please be advised that the multi-writer feature must not be used for a clustered disk resource for WSFC.
Both VMware Cloud on AWS and VMware vSAN (6.7 Update 3) support clustered VMDK(s) as a disk resource for WSFC. Support for clustered VMDKs on VMFS for on-premises versions of vSphere is introduced in vSphere 7.0. Check this document and the guide for more details.
While finalizing the configuration of your solution please ensure that the following important additional settings are implemented:
- All pRDMs used with WSFC should be configured as perennially reserved. Follow the KB below for details: ESXi/ESX hosts with visibility to RDM LUNs being used by WSFC nodes with RDMs may take a long time to start or during LUN rescan (Configuring perennially reserved flag)
- DRS anti-affinity rules and groups should be configured to separate the placement of VMs hosting members of WSFC across different ESXi hosts.
- Check the vMotion prerequisites:
  - The vMotion network must use a physical network with a transmission speed of 10GE (Ten Gigabit Ethernet) or more. vMotion over 1GE (One Gigabit Ethernet) is not supported.
  - vMotion is supported for Windows Server 2008 SP2 and later releases. It is not supported on Windows Server 2003.
  - The WSFC cluster heartbeat time-out must be modified to allow at least 10 missed heartbeats.
  - The virtual hardware version of the virtual machines hosting the nodes of WSFC must be version 11 or later.
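The checklist above can be sketched as a pair of helpers. The first builds the `esxcli` command that, to our knowledge, the perennially reserved KB uses to flag a device; verify the exact syntax against the KB for your ESXi version. The second checks the vMotion prerequisites. All function names, parameters, and the `naa.*` device IDs are placeholders chosen for illustration.

```python
def perennial_reservation_commands(device_ids):
    """Build one esxcli command per pRDM LUN (run on every host that sees the LUN)."""
    return [
        f"esxcli storage core device setconfig -d {dev} --perennially-reserved=true"
        for dev in device_ids
    ]


def check_vmotion_prereqs(network_gbps, hw_version, heartbeat_threshold, guest_os):
    """Return a list of unmet vMotion prerequisites for a WSFC node VM."""
    problems = []
    if network_gbps < 10:
        problems.append("vMotion network must be 10GE or faster")
    if hw_version < 11:
        problems.append("virtual hardware version must be 11 or later")
    if heartbeat_threshold < 10:
        problems.append("cluster must tolerate at least 10 missed heartbeats")
    if "2003" in guest_os:
        # Simplified check: only rules out Windows Server 2003 by name.
        problems.append("vMotion is not supported on Windows Server 2003")
    return problems


# Placeholder device IDs; substitute the naa.* IDs of your own RDM LUNs.
for cmd in perennial_reservation_commands(["naa.0000000000000001"]):
    print(cmd)

print(check_vmotion_prereqs(
    network_gbps=1, hw_version=11, heartbeat_threshold=5,
    guest_os="Windows Server 2016"))
```

Keeping the command generation separate from the checks makes it easy to review the generated commands before running them on a host.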
The following resources should be consulted while planning, designing and implementing the solution:
Consider the vSphere configuration options listed in this article as the official VMware recommendations to build a supported, highly available, and well-performing solution for a Microsoft Windows environment with WSFC clusters with shared disks.
Need to P2V a SQL cluster at work. Here are screenshots of what I did in a test environment to see if an idea of mine would work.
We have a SQL cluster with 2 physical nodes. The requirement was to convert this into a single virtual machine.
P2V-ing a single server is easy. Use VMware Converter. But P2V-ing a cluster like this is tricky. You could P2V each node and end up with a cluster of 2 virtual-nodes but that wasn’t what we wanted. We didn’t want to deal with RDMs and such for the cluster, so we wanted to get rid of the cluster itself. VMware can provide HA if anything happens to the single node.
My idea was to break the cluster and get one of the nodes of the cluster to assume the identity of the cluster. Have SQL running off that. Virtualize this single node. And since there’s no change as far as the outside world is concerned no one’s the wiser.
Found a blog post that pretty much does what I had in mind. Found one more which was useful but didn’t really pertain to my situation. Have a look at the latter post if your DTC is on the Quorum drive (wasn’t so in my case).
So here we go.
1) Make the node that I want to retain the active node of the cluster (so it has all the disks and databases). Then shut down SQL Server.
2) Shut down the cluster.
3) Remove the node we want to retain from the cluster.
We can’t remove/evict the node via the GUI as the cluster is offline. Nor can we remove the Failover Cluster feature from the node as it is still part of a cluster (even though the cluster is shut down). So we need to do a bit of “surgery”. :)
Open PowerShell and do the following:
Right click the desktop on any computer (or the SQL server computer itself) and create a new text file. Then rename that to blah.udl. The name doesn’t matter as long as the extension is .udl. Double click on that to get a window like this:
Now you can fill in the SQL server name and test it.
One thing to keep in mind (if you are not a SQL person – I am not). The Windows NT Integrated security is what you need to use if you want to authenticate against the server with an AD account. It is tempting to select the “Use a specific user name …” option and put in an AD username/ password there, but that won’t work. That option is for using SQL authentication.
If you want to use a different AD account you will have to launch the tool via Run as under that account.
Also, on a fresh install of SQL Server, SQL authentication is disabled by default. You can create SQL accounts but authentication will fail. To enable SQL authentication, right click on the server in SQL Server Management Studio and go to Properties, then go to Security and enable SQL authentication.
Now one can P2V this node.