Configure VMQ and RSS on physical servers

Introduction

The samples below were collected from Windows Server 2016.

The primary objective is to avoid weighing down core 0 with network traffic. This is the first core on the first NUMA node, and it is responsible for a lot of kernel processing. If this core suffers from contention, a wild blue screen of death may appear. Thus, we want our network adapters to use other cores to process their traffic. We can achieve this in three ways, depending on what the adapter is used for:

  • Enable Receive Side Scaling (RSS) and configure it to use specific cores.
  • Enable Virtual Machine Queue (VMQ) and configure it to use specific cores.
  • Set the preferred NUMA node.

For physical machines

On network adapters used for generic traffic, we should enable RSS and disable VMQ. On adapters that are part of a virtual switch, we should disable RSS and enable VMQ. The preferred NUMA node should be configured for all physical adapters.

For virtual machines

If the machine has more than one CPU, enable vRSS.
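
As a minimal sketch of what enabling vRSS can look like, assuming a Windows Server 2016 (or later) Hyper-V host: the VM name 'VM01' and the guest adapter name 'Ethernet' are placeholders, not names from the sample server.

```powershell
# On the Hyper-V host: allow vRSS on the network adapters of the VM.
# 'VM01' is a hypothetical VM name.
Set-VMNetworkAdapter -VMName 'VM01' -VrssEnabled $true

# Inside the guest: enable RSS on the virtual NIC.
# 'Ethernet' is a hypothetical adapter name; list yours with Get-NetAdapter.
Enable-NetAdapterRss -Name 'Ethernet'

# Inside the guest: confirm that RSS is now enabled.
Get-NetAdapterRss -Name 'Ethernet' | Select-Object Name, Enabled
```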

Investigating the NUMA architecture

Sockets and NUMA nodes

Sysinternals Coreinfo (coreinfo -s -n) shows the relationship between logical processors, sockets and NUMA nodes. In the example below we have a server with two sockets and four NUMA nodes.

[Image: coreinfo -s -n output from the sample server]

Closest NUMA node

Each PCIe adapter is physically connected to a specific NUMA node. If possible, RSS/VMQ should be mapped to cores on the same NUMA node that the NIC is connected to. Get-NetAdapterRss will show you which NUMA node is closest for each adapter. The one in the sample is connected to (closest to) NUMA node 0, as the NUMA distance for cores in group 0 is 0. We can also see that the NUMA distance to node 1 for this particular port is lower than the distance to nodes 2 and 3. This is because nodes 0 and 1 are on the same physical CPU, whereas nodes 2 and 3 are on another physical CPU.

[Image: Get-NetAdapterRss output showing the NUMA distance for each core]
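
A short sketch of how this information could be gathered for all adapters at once. 'Public1' is one of the sample adapter names used later in this post; replace it with one of your own.

```powershell
# NUMA node each physical adapter reports for the PCIe slot it sits in.
Get-NetAdapterHardwareInfo |
    Sort-Object Name |
    Format-Table Name, NumaNode, Slot, Bus -AutoSize

# The default output lists RssProcessorArray entries as
# Group:Number/NUMA distance, where distance 0 marks the closest node.
Get-NetAdapterRss -Name 'Public1'
```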

VMQ mode and network teams

The VMQ mode depends on the teaming mode. In short, if you use Switch Independent teaming with the Hyper-V Port or Dynamic load balancing algorithm, VMQ will be in Sum-of-queues mode. All other team configurations use Min-queues mode. The mode determines how you should configure VMQ: for Sum-of-queues, the processor sets of the team members should not overlap; for Min-queues, the processor settings should be identical for all team members.

See https://lokna.no/?p=1980 for more information about network teaming.
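
The rule above can be sketched as a small query against the LBFO teams on the host. The ExpectedVmqMode column is a name made up for this example, and the sketch assumes that TeamingMode and LoadBalancingAlgorithm render as the names SwitchIndependent, HyperVPort and Dynamic, as they do in the default Get-NetLbfoTeam output.

```powershell
# List each LBFO team together with the VMQ mode implied by its configuration.
Get-NetLbfoTeam | Select-Object Name, TeamingMode, LoadBalancingAlgorithm,
    @{ Name = 'ExpectedVmqMode'; Expression = {
        # Sum-of-queues: Switch Independent teaming with Hyper-V Port or Dynamic.
        $sumOfQueues = "$($_.TeamingMode)" -eq 'SwitchIndependent' -and
                       "$($_.LoadBalancingAlgorithm)" -in 'HyperVPort', 'Dynamic'
        if ($sumOfQueues) { 'Sum-of-queues' } else { 'Min-queues' }
    } }
```

Switch Embedded Teaming (SET) teams are not listed by Get-NetLbfoTeam, so this only covers classic LBFO teams.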

NIC Preferred NUMA Node

Fill out the table below using the data gathered above. The sample server is a Hyper-V host with six physical NICs and three teams. The public team is the management interface, the internal team is used for cluster internal traffic such as live migrations, and the Hyper-V team is connected to the virtual or logical switch.

Card       NUMA node
Public1    1
Public2    1
Internal1  0
Internal2  0
HyperV1    2
HyperV2    2

Set the preferred NUMA node for each adapter by running the following commands:

Set-NetAdapterAdvancedProperty -Name "Public1" -RegistryKeyword '*NumaNodeId' -RegistryValue '1'
Set-NetAdapterAdvancedProperty -Name "Public2" -RegistryKeyword '*NumaNodeId' -RegistryValue '1'
Set-NetAdapterAdvancedProperty -Name "Internal1" -RegistryKeyword '*NumaNodeId' -RegistryValue '0'
Set-NetAdapterAdvancedProperty -Name "Internal2" -RegistryKeyword '*NumaNodeId' -RegistryValue '0'
Set-NetAdapterAdvancedProperty -Name "HyperV1" -RegistryKeyword '*NumaNodeId' -RegistryValue '2'
Set-NetAdapterAdvancedProperty -Name "HyperV2" -RegistryKeyword '*NumaNodeId' -RegistryValue '2'
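
To confirm that the values were accepted, something like the following could be used. It assumes that the NIC driver exposes the *NumaNodeId keyword; adapters whose driver lacks it will simply not appear in the output.

```powershell
# Show the preferred NUMA node currently configured on every adapter
# that exposes the *NumaNodeId advanced property.
Get-NetAdapterAdvancedProperty -RegistryKeyword '*NumaNodeId' |
    Sort-Object Name |
    Format-Table Name, DisplayName, RegistryValue -AutoSize
```

Be aware that changing an advanced property usually restarts the adapter, so expect a short interruption on each port.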

VMQ/RSS settings

You could distribute the cores in each NUMA node between the ports attached to that node, but this gets very complicated very fast. As this is already a complicated subject, we recommend skipping this part and only caring about the NUMA node and the base processor.

The same settings should be used for VMQ and RSS, but each port should only have one of the two enabled: VMQ for the Hyper-V ports and RSS for all other ports. You should always change the base processor to something other than 0. The core list is zero-based, and if hyperthreading is enabled, every second logical processor is a hyperthread on the same physical core as the one before it.
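
A rough sketch of how to check whether hyperthreading is active and which logical processor numbers then start a new physical core. It only looks at the first socket and assumes the usual enumeration where the two hyperthreads of a core are numbered consecutively; the variable names are made up for this example.

```powershell
# Win32_Processor returns one instance per socket; inspect the first one.
$cpu = Get-CimInstance -ClassName Win32_Processor | Select-Object -First 1

# More logical processors than cores means hyperthreading is enabled.
$hyperThreading = $cpu.NumberOfLogicalProcessors -gt $cpu.NumberOfCores
$step = if ($hyperThreading) { 2 } else { 1 }

# Logical processors that start a new physical core, skipping core 0.
$candidates = for ($lp = $step; $lp -lt $cpu.NumberOfLogicalProcessors; $lp += $step) { $lp }

"Hyperthreading enabled: $hyperThreading"
"Candidate processor numbers: $($candidates -join ', ')"
```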

Settings explained:

  • NumaNode: Specifies the NUMA node affinity once more, in addition to the preferred NUMA node set above. Does not affect the RssProcessorArray.
  • BaseProcessorGroup / MaxProcessorGroup: Specifies the NUMA node affinity for the RssProcessorArray. Use the same value for both. VMQ supports just one processor group and thus lacks the MaxProcessorGroup setting.
  • BaseProcessorNumber: Specifies the starting processor to be used within the processor group. This should NOT be 0. 2 is the recommended value unless you want to get really fancy.
  • MaxProcessors: Specifies the maximum number of processors used by VMQ/RSS for load balancing network traffic. MaxProcessors should be rounded up to the closest of 2, 4, 8 or 16 (the possible values). You can use this to limit the number of cores used by each adapter.
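
As an illustration of the MaxProcessors setting, the following sketch caps the sample ports at eight processors each. The adapter names are the sample names used in this post, and the value 8 is only an example.

```powershell
# Limit RSS to at most 8 processors on the Public ports.
Set-NetAdapterRss -Name 'Public1', 'Public2' -MaxProcessors 8

# Limit VMQ to at most 8 processors on the Hyper-V ports.
Set-NetAdapterVmq -Name 'HyperV1', 'HyperV2' -MaxProcessors 8

# Review the resulting base processor and processor limit per port.
Get-NetAdapterRss -Name 'Public1', 'Public2' |
    Format-Table Name, BaseProcessorNumber, MaxProcessors -AutoSize
```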

Settings table

Set the values according to which NUMA node each port/card is connected to. If you want to set MaxProcessors, choose one of the valid values according to your wishes. This sample shows overlapping VMQ values. Remember to use non-overlapping values when required by your team's queue mode. If in doubt, try overlapping values and check for the error message pictured below.

Port       Mode  NumaNode  BaseProcessorGroup  MaxProcessorGroup  BaseProcessorNumber  MaxProcessors
Public1    RSS   1         1                   1                  2                    8
Public2    RSS   1         1                   1                  2                    8
Internal1  RSS   0         0                   0                  2                    8
Internal2  RSS   0         0                   0                  2                    8
HyperV1    VMQ   2         2                   2                  2                    8
HyperV2    VMQ   2         2                   2                  2                    8

Graphical representation of the NUMA nodes

[Image: graphical representation of the NUMA nodes]

PowerShell script

Edit the script to reflect the settings above.

#Disable/Enable VMQ and RSS. Each port should have one of them disabled, and the other one enabled
#Public
Disable-NetAdapterVmq -Name 'Public1'
Disable-NetAdapterVmq -Name 'Public2'
Enable-NetAdapterRSS -Name 'Public1'
Enable-NetAdapterRSS -Name 'Public2'

#Internal
Disable-NetAdapterVmq -Name 'Internal1'
Disable-NetAdapterVmq -Name 'Internal2'
Enable-NetAdapterRSS -Name 'Internal1'
Enable-NetAdapterRSS -Name 'Internal2'

#HyperV
Disable-NetAdapterRss -Name 'HyperV1'
Disable-NetAdapterRss -Name 'HyperV2'
Enable-NetAdapterVmq -Name 'HyperV1'
Enable-NetAdapterVmq -Name 'HyperV2'
Disable-NetAdapterRss -Name 'HyperV' #Disable RSS on the HyperV team adapter to avoid error messages in the event log.

#Configure numa node affinity. We configure both VMQ and RSS to avoid error messages.
#Public
Set-NetAdapterRss -Name 'Public1' -NUMANode 1 -BaseProcessorGroup 1 -MaxProcessorGroup 1 -BaseProcessorNumber 2
Set-NetAdapterRss -Name 'Public2' -NUMANode 1 -BaseProcessorGroup 1 -MaxProcessorGroup 1 -BaseProcessorNumber 2
Set-NetAdapterVMQ -Name 'Public1' -NUMANode 1 -BaseProcessorGroup 1 -BaseProcessorNumber 2
Set-NetAdapterVMQ -Name 'Public2' -NUMANode 1 -BaseProcessorGroup 1 -BaseProcessorNumber 2


#Internal
Set-NetAdapterRss -Name 'Internal1' -NUMANode 0 -BaseProcessorGroup 0 -MaxProcessorGroup 0 -BaseProcessorNumber 2
Set-NetAdapterRss -Name 'Internal2' -NUMANode 0 -BaseProcessorGroup 0 -MaxProcessorGroup 0 -BaseProcessorNumber 2
Set-NetAdapterVMQ -Name 'Internal1' -NUMANode 0 -BaseProcessorGroup 0 -BaseProcessorNumber 2
Set-NetAdapterVMQ -Name 'Internal2' -NUMANode 0 -BaseProcessorGroup 0 -BaseProcessorNumber 2


#Hyper V
Set-NetAdapterRss -Name 'HyperV1' -NUMANode 2 -BaseProcessorGroup 2 -MaxProcessorGroup 2 -BaseProcessorNumber 2
Set-NetAdapterRss -Name 'HyperV2' -NUMANode 2 -BaseProcessorGroup 2 -MaxProcessorGroup 2 -BaseProcessorNumber 2
Set-NetAdapterVMQ -Name 'HyperV1' -NUMANode 2 -BaseProcessorGroup 2 -BaseProcessorNumber 2
Set-NetAdapterVMQ -Name 'HyperV2' -NUMANode 2 -BaseProcessorGroup 2 -BaseProcessorNumber 2
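
The script above repeats the same commands for every port. As a sketch of a more compact alternative, the settings table can be kept as data and applied in a loop. The $ports and $dryRun names are made up for this example, and the team adapter 'HyperV' still has to be handled separately as shown earlier. With $dryRun set to $true the cmdlets only report what they would do.

```powershell
# One row per physical port, mirroring the settings table in this post.
$ports = @(
    [pscustomobject]@{ Name = 'Public1';   Mode = 'RSS'; Node = 1; BaseProcessor = 2 }
    [pscustomobject]@{ Name = 'Public2';   Mode = 'RSS'; Node = 1; BaseProcessor = 2 }
    [pscustomobject]@{ Name = 'Internal1'; Mode = 'RSS'; Node = 0; BaseProcessor = 2 }
    [pscustomobject]@{ Name = 'Internal2'; Mode = 'RSS'; Node = 0; BaseProcessor = 2 }
    [pscustomobject]@{ Name = 'HyperV1';   Mode = 'VMQ'; Node = 2; BaseProcessor = 2 }
    [pscustomobject]@{ Name = 'HyperV2';   Mode = 'VMQ'; Node = 2; BaseProcessor = 2 }
)

# Set to $false to actually apply the changes.
$dryRun = $true

foreach ($p in $ports) {
    # Parameters shared by Set-NetAdapterRss and Set-NetAdapterVmq.
    $common = @{
        Name                = $p.Name
        NumaNode            = $p.Node
        BaseProcessorGroup  = $p.Node
        BaseProcessorNumber = $p.BaseProcessor
        WhatIf              = $dryRun
    }

    # Configure both features, as in the script above, to avoid error messages.
    Set-NetAdapterRss @common -MaxProcessorGroup $p.Node
    Set-NetAdapterVmq @common

    # Enable exactly one of the two features on each port.
    if ($p.Mode -eq 'RSS') {
        Disable-NetAdapterVmq -Name $p.Name -WhatIf:$dryRun
        Enable-NetAdapterRss  -Name $p.Name -WhatIf:$dryRun
    } else {
        Disable-NetAdapterRss -Name $p.Name -WhatIf:$dryRun
        Enable-NetAdapterVmq  -Name $p.Name -WhatIf:$dryRun
    }
}
```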

Check the results

Use Get-NetAdapterRss and Get-NetAdapterVmq to check that the settings have been applied.

Get-NetAdapterRss | Select-Object Name, Enabled, NumaNode, BaseProcessorGroup, MaxProcessorGroup, BaseProcessorNumber, MaxProcessorNumber, MaxProcessors | Sort-Object Name | Format-Table
Get-NetAdapterVmq | Select-Object Name, Enabled, NumaNode, BaseProcessorGroup, MaxProcessorGroup, BaseProcessorNumber, MaxProcessorNumber, MaxProcessors | Sort-Object Name | Format-Table

Before

RSS

[Image: Get-NetAdapterRss output before the change]

VMQ

[Image: Get-NetAdapterVmq output before the change]

After

RSS

[Image: Get-NetAdapterRss output after the change]

VMQ

[Image: Get-NetAdapterVmq output after the change]

VMQ Queue mode

If your VMQ settings do not match the queue mode implied by your team configuration, you will get an error message, e.g. “EventID 106 from Hyper-V VmSwitch: The processor sets overlap when LBFO is configured with sum-queue mode.”
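
One way to look for this message from PowerShell is sketched below. It assumes the event is written to the System log and matches the provider name loosely, since the exact provider string may differ between versions.

```powershell
# Search the System log for event ID 106 raised by the Hyper-V virtual switch.
# -ErrorAction SilentlyContinue suppresses the error raised when nothing matches.
Get-WinEvent -FilterHashtable @{ LogName = 'System'; Id = 106 } -ErrorAction SilentlyContinue |
    Where-Object ProviderName -like '*Hyper-V-VmSwitch*' |
    Select-Object TimeCreated, ProviderName, Message
```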

Make sure that you are using non-overlapping processor sets if you use Switch Independent teaming with the Hyper-V Port or Dynamic load balancing algorithm. The sample above shows an overlapping VMQ processor set.
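
A hedged sketch of a non-overlapping configuration for the two Hyper-V ports follows. It assumes that processor group 2 holds at least 18 logical processors and that hyperthreading is enabled, so only even-numbered logical processors are used. The ranges are illustrative and must be adapted to your hardware.

```powershell
# Assumption: group 2 has at least 18 logical processors, hyperthreading is on.
# HyperV1 uses logical processors 2 to 8, HyperV2 uses 10 to 16: no overlap.
Set-NetAdapterVmq -Name 'HyperV1' -BaseProcessorGroup 2 -BaseProcessorNumber 2  -MaxProcessorNumber 8  -MaxProcessors 4
Set-NetAdapterVmq -Name 'HyperV2' -BaseProcessorGroup 2 -BaseProcessorNumber 10 -MaxProcessorNumber 16 -MaxProcessors 4

# Review the resulting processor ranges side by side.
Get-NetAdapterVmq -Name 'HyperV1', 'HyperV2' |
    Format-Table Name, BaseProcessorGroup, BaseProcessorNumber, MaxProcessorNumber, MaxProcessors -AutoSize
```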

[Image: Event ID 106 error message in the event log]

Useful PowerShell cmdlets

Get-NetAdapterVmq |Sort-Object Name
Get-NetAdapterVmq |Select-Object Name, Enabled, MaxProcessorNumber,MaxProcessors, BaseProcessorNumber, BaseProcessorGroup, NumaNode|Sort-Object name |ft
Get-NetAdapterRss | Select-Object Name, Enabled, MaxProcessorNumber,MaxProcessors, BaseProcessorNumber, BaseProcessorGroup, NumaNode|Sort-Object name |ft

Get-NetAdapter
Get-NetAdapterVmq
Get-NetAdapterVmqQueue
Get-NetAdapterRss
Get-NetAdapterHardwareInfo
Get-NetAdapterAdvancedProperty -Name "NIC Adapter"

Author: DizzyBadger

SQL Server DBA, Cluster expert, Principal Analyst

3 thoughts on “Configure VMQ and RSS on physical servers”

  1. Good post on VMQ. I wasn’t aware of the sysinternal tools, very nice. You may want to mention the difference between ‘Sum of Queues’ and ‘Min-Queues’ mode. With 2016 the preferred NIC teaming is now SET and does not show up with Get-NetLbfoTeam; it uses the ‘Sum of Queues’ mode automatically and you do not want overlapping processors for VMQ.

    1. A bit late, but I have added info about the queue mode. Hopefully I will find time to add another sample configuration without overlapping values for VMQ :)