SMBv3.1.1 disconnects and fails to reconnect on Windows 10

Be warned: This will be a long one with a lot of text and few images. I never planned on doing a write-up on this issue, so I did not take a lot of pictures.

I have been troubleshooting this issue on and off for two years, and I was on the brink of giving up several times. I pride myself on finding solutions where others find only stress and hair loss, and I do so routinely, but sadly there are still nuts I cannot crack. I believed this issue to be such a nut. But I was wrong. The solution had been staring me straight in the eyes for quite some time, but we must not get ahead of ourselves. Let us start at the beginning.

Problem

SMB sessions are invalidated in such a way that it is impossible to reconnect. This happens only on Windows 10 clients; Windows 7 (and possibly Windows 8) clients running SMBv2.* can still reconnect as normal.

User story:

  • The user opens a file explorer window and navigates to a folder on a fileserver containing documents the user wants to read and/or edit.
  • This works without issue 100% of the time as long as the client computer has a network connection to the file server.
  • After a period of inactivity the SMB session is suspended. The user does not notice this; everything still looks fine.
  • Some time later, the user will either
    • Try to save a file
    • Try to open a new file using the same File Explorer window
  • Possible outcomes
    • Everything works as expected
    • It is impossible to save the file to the server; it has to be saved locally.
    • The File Explorer window is gone. The user has to re-open the window and navigate back to the folder in question.
  • Thus, the user gets annoyed and complains about the stupid Windows 10 upgrade, which is understandable.

Relevant Event IDs: 30807 from SMBClient and 1016 from SMBServer.
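To see whether a machine has logged these events, something along the following lines may help. It is only a sketch: the provider names Microsoft-Windows-SMBClient and Microsoft-Windows-SMBServer are my assumption about where the two event IDs are written, so adjust them if your system logs them elsewhere.

```powershell
# Assumption: the events are written by these two providers.
# Run the first query on the Windows 10 client, the second on the file server.

# Client side: the 20 most recent events with ID 30807.
Get-WinEvent -FilterHashtable @{ ProviderName = 'Microsoft-Windows-SMBClient'; Id = 30807 } `
    -MaxEvents 20 -ErrorAction SilentlyContinue |
    Select-Object TimeCreated, Id, Message

# Server side: the 20 most recent events with ID 1016.
Get-WinEvent -FilterHashtable @{ ProviderName = 'Microsoft-Windows-SMBServer'; Id = 1016 } `
    -MaxEvents 20 -ErrorAction SilentlyContinue |
    Select-Object TimeCreated, Id, Message
```

Comparing the timestamps on the client and the server against the time of a user complaint is a quick way to confirm you are looking at the same problem.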


What SMB version is actually used?

To verify which SMB version is in use for a specific file share/connection, run the following PowerShell command:

Get-SmbConnection | Select-Object ShareName, Dialect

You can run this command on both the client and the server. A client/server connection will use the highest version supported by both client and server. If the client supports up to v3.02, but the server is only able to support v3.00, v3.00 will be used for the connection.

The Get-SmbConnection cmdlet exposes several other properties; pipe it to select * to list them all.

Sample output

[Screenshot: sample Get-SmbConnection output]

This is from a Win2012R2 client, connected to a share on a Win2012 cluster with multichannel support.
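If a machine has many connections open, a slightly longer variant can make the dialects easier to compare. This is a minimal sketch that uses only the default properties of Get-SmbConnection (ServerName, ShareName, Dialect, NumOpens):

```powershell
# List every active SMB connection with the server it belongs to.
Get-SmbConnection |
    Sort-Object ServerName, ShareName |
    Format-Table ServerName, ShareName, Dialect, NumOpens -AutoSize

# Count connections per negotiated dialect to spot servers stuck on an older version.
Get-SmbConnection | Group-Object Dialect | Select-Object Name, Count
```

Run it on the client to see what each server negotiated. Note that the list is empty until something has actually touched a share.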

Verify SMB3 Multichannel on your cluster

To ensure maximum throughput for file clusters and Hyper-V clusters with cluster shared volumes, ensure that SMB multichannel is working. Without it, your file transfers may be running on a single thread/CPU and be less resilient to network problems. See http://blogs.technet.com/b/josebda/archive/2012/05/13/the-basics-of-smb-multichannel-a-feature-of-windows-server-2012-and-smb-3-0.aspx for more background information. SMB multichannel requires Windows Server 2012 or newer.

SMB multichannel is on by default, but that does not necessarily translate to "works like a charm" by default. The underlying network infrastructure and network adapters have to be configured to support it. In short, you need at least one of the following:

  • Multiple NICs
  • RSS-capable NICs
  • RDMA-capable NICs
  • Network teaming

Verify NIC capability detection

Run the following PowerShell command on the client:

Get-SmbClientNetworkInterface

[Screenshot: sample Get-SmbClientNetworkInterface output]

In this sample output, we have five RSS-capable interfaces and no RDMA-capable interfaces. Check that the interfaces you are planning to use for SMB are listed. Teamed interfaces show up in this list as virtual NICs, but the physical NICs that are part of the team are hidden. This behavior is expected.

On the server, use the following PowerShell command. For Hyper-V cluster nodes with CSV, run both the server and client commands.

Get-SmbServerNetworkInterface

[Screenshot: sample Get-SmbServerNetworkInterface output]

Again, make sure the adapters and IP addresses you have dedicated to SMB traffic are shown in the list with the expected capabilities.
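On hosts with many adapters it can be quicker to list only the interfaces that lack both capabilities. The following is a rough sketch that relies on the RssCapable and RdmaCapable properties returned by the two cmdlets:

```powershell
# Client side: interfaces that SMB sees as neither RSS nor RDMA capable.
Get-SmbClientNetworkInterface |
    Where-Object { -not $_.RssCapable -and -not $_.RdmaCapable } |
    Format-Table -AutoSize

# Server side: the same check (on Hyper-V cluster nodes with CSV, run both).
Get-SmbServerNetworkInterface |
    Where-Object { -not $_.RssCapable -and -not $_.RdmaCapable } |
    Format-Table -AutoSize
```

An empty result means every listed interface reports at least one of the two capabilities. Any adapter you intended to dedicate to SMB that shows up here deserves a closer look at its driver and RSS settings.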

Verify multiple connections

The PowerShell cmdlet Get-SmbMultichannelConnection lists active SMB multichannel connections on the client. You may have to start a large file copy operation before you run the command to get any data. If you add the -IncludeNotSelected option, possible connections that are not selected for use are listed as well. In the sample below, you will see that one of the possible connections involves crossing a gateway/firewall from 10.x to 192.x, and is therefore not used.

[Screenshot: sample Get-SmbMultichannelConnection -IncludeNotSelected output]

If you are unable to get any data, run Get-SmbConnection to verify that you have active SMB connections.
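As a starting point, the check could be scripted roughly like this. It is a sketch only, and the column selection is my own choice of properties from the cmdlet's output:

```powershell
# Show every possible client/server path, selected or not, grouped per server.
Get-SmbMultichannelConnection -IncludeNotSelected |
    Sort-Object ServerName, Selected |
    Format-Table ServerName, Selected, ClientIpAddress, ServerIpAddress, ClientRSSCapable, ClientRdmaCapable -AutoSize

# Only the paths that were considered but not chosen.
Get-SmbMultichannelConnection -IncludeNotSelected |
    Where-Object { -not $_.Selected } |
    Format-Table ServerName, ClientIpAddress, ServerIpAddress -AutoSize
```

Paths that cross subnets, like the 10.x to 192.x example, should turn up in the second list.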

Enable multichannel in failover cluster manager

For SMB multichannel to be active on a clustered role, be it a scale-out file server or the old-fashioned file server role, client connections have to be enabled on all participating networks. It is best practice to disable client connections on all cluster networks that are not client-facing, but if you want to use SMB multichannel on an internal cluster network, for a Hyper-V cluster for instance, you have to enable client connections on the internal network(s). It is also good practice not to have a default gateway on cluster-internal networks, unless you are deploying a stretched cluster where the internal cluster traffic also has to cross a gateway. Thus, clients outside the internal cluster network should not be able to reach it anyway, due to routing and/or firewall restrictions. That being said, if you are deploying a cluster where clients are supposed to connect to the clustered file server, you should also create multiple networks accessible from outside the cluster. But cluster network design is a huge topic outside the scope of this post. Anyway, make sure "Allow clients to connect through this network" is enabled in Failover Cluster Manager.

[Screenshot: the "Allow clients to connect through this network" setting in Failover Cluster Manager]
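The same setting can be inspected from PowerShell through the Role property of each cluster network. This is a hedged sketch: it assumes the FailoverClusters module is available on the node, and 'Cluster Network 2' is a placeholder name, not one taken from this environment.

```powershell
# Run on a cluster node; requires the FailoverClusters module.
Import-Module FailoverClusters

# Role values: 0 = not used by the cluster, 1 = cluster traffic only,
# 3 = cluster traffic and client connections.
Get-ClusterNetwork | Format-Table Name, Role, Address, State -AutoSize

# To allow client connections on a given network, set its Role to 3.
# 'Cluster Network 2' is a hypothetical name; substitute your own and uncomment.
# (Get-ClusterNetwork -Name 'Cluster Network 2').Role = 3
```

The assignment is left commented out on purpose, since changing a network role on a production cluster should be a deliberate step.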