Hyper-V VM with VirtualFC fails to start

Problem

This is just a quick note to remember the solution and EventIDs.

The VM fails to start, and Failover Cluster Manager reports failed or unavailable resources. Analysis of the event log reveals messages related to VirtualFC:

EventID 32110 from Hyper-V-SynthFC: ‘VMName’: NPIV virtual port operation on virtual port (WWN) failed with an error: The world wide port name already exists on the fabric. (Virtual machine ID ID)
EventID 32265 from Hyper-V-SynthFC: ‘VMName’: Virtual port (WWN) creation failed with a NPIV error(Virtual machine ID ID).
EventID 32100 from Hyper-V-VMMS: ‘VMNAME’: NPIV virtual port operation on virtual port (WWN) failed with an unknown error. (Virtual machine ID ID)
EventID 1205 from Microsoft-Windows-FailoverClustering: The Cluster service failed to bring clustered role ‘SCVMM VM Name Resources’ completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered role.
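To check quickly whether a node has logged any of these events, something like the following could work. It is a minimal sketch, not a tested tool: it assumes the event log has already been exported to CSV with `Id`, `ProviderName` and `Message` columns (the column names are an assumption about the export), and the sample rows are made up for illustration. It matches on event ID only, so check the provider column on every hit, since the same ID can be used by unrelated providers.

```python
import csv
import io

# Event IDs listed in this post.
WATCHED_IDS = {32110, 32265, 32100, 1205}


def find_virtualfc_events(csv_text):
    """Return the CSV rows whose event ID is one of the watched IDs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    hits = []
    for row in reader:
        try:
            event_id = int(row["Id"])
        except (KeyError, TypeError, ValueError):
            # Skip rows without a usable numeric Id column.
            continue
        if event_id in WATCHED_IDS:
            hits.append(row)
    return hits


# Made-up sample export, for illustration only.
SAMPLE = """Id,ProviderName,Message
32110,Hyper-V-SynthFC,NPIV virtual port operation on virtual port failed
7036,Service Control Manager,Unrelated event
1205,Microsoft-Windows-FailoverClustering,The Cluster service failed to bring clustered role online
"""

if __name__ == "__main__":
    for row in find_virtualfc_events(SAMPLE):
        print(row["Id"], row["ProviderName"], row["Message"], sep=" | ")
```

Run it against an export from each node; a node with no hits is probably not the one that tried to start the VM.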

Analysis

The events point towards a Virtual Fibre Channel or Fibre Channel issue. After a while we realised that one of the nodes in the cluster did not release the WWN when a VM migrated away from it, so the WWN was still registered on the fabric when the VM tried to start on another node. Further analysis revealed that the FC driver versions differed between the nodes.

Solution

Make sure all cluster nodes are running the exact same driver and firmware for the SAN and network adapters. This is crucial for failovers to function smoothly.
To “release” the stuck WWNs you have to reboot the offending node. To figure out which node is holding the WWN, consult the FC switch logs. Alternatively, do a rolling restart of all nodes until the VM starts working.
I have successfully worked around the problem by removing and re-adding the virtual FC adapters on the VM that is not working. I do not know why this resolved the problem.
Another workaround would be to change the WWN on the virtual FC adapters. You would of course have to make the matching change on the SAN side as well.
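The driver and firmware check is easy to get wrong by eye on a larger cluster. Here is a minimal sketch of the comparison, assuming you have already collected one version string per node by whatever means you normally use. The node names and version numbers below are hypothetical, and collecting the versions is not shown.

```python
from collections import defaultdict


def group_nodes_by_version(versions):
    """Map each distinct version string to the sorted list of nodes running it."""
    groups = defaultdict(list)
    for node, version in versions.items():
        groups[version.strip()].append(node)
    return {version: sorted(nodes) for version, nodes in groups.items()}


def is_consistent(versions):
    """True when every node reports the same version (or there are no nodes)."""
    return len(group_nodes_by_version(versions)) <= 1


if __name__ == "__main__":
    # Hypothetical inventory: node name -> FC driver version.
    fc_driver = {
        "node1": "9.1.15.1",
        "node2": "9.1.15.1",
        "node3": "9.1.17.21",
    }
    if is_consistent(fc_driver):
        print("All nodes run the same FC driver version.")
    else:
        for version, nodes in sorted(group_nodes_by_version(fc_driver).items()):
            print(version, "->", ", ".join(nodes))
```

Run the same comparison separately for the FC driver, the HBA firmware and the network adapter drivers, since any one of them can differ while the others match.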

Last edit: Sunday, February 11, 2018

Author: DizzyBadger

SQL Server DBA, Cluster expert, Principal Analyst

One thought on “Hyper-V VM with VirtualFC fails to start”

darkfader says:

2020.08.19 at 20:15:48

Hey, wanted to say thanks – in a month long poke-and-poke-again session with a maldesigned SAN, we had similar issues – I introduced them to the wonders of multipathing but we lost live migration capability.
We saw that SetA and SetB WWPNs were all over the place, only one out of 5 VMs was consistent, and that one was stuck on SetB!

I suggested to maybe sometimes install driver updates if they are available, since the FC drivers were outdated.
Plus the Hyper-V recommended patches. But I also got them to give me HW for a validation cluster, and luckily I could see the same or worse issues there. I had already upgraded to IBM/Lenovo's latest & greatest drivers, but reading your post was the hunch that made me notice I had only rebooted one node. Then I saw they both ran the same driver anyway :(
But with your post in the back of my mind I made the right choice and just grabbed the latest vendor driver from QLogic, installed, rebooted, everything works fine.

I suppose as such they now also end up with a new cluster just for terminal server VMs, which means there will be a direct benefit for the end users, which is often hard enough to realize.
It was very good that you took the time to make your writeup.
