Hello Community!
This will be a non-Veeam post today, but I thought it would be worthwhile to share a vSphere network issue I encountered recently, and the steps I took to resolve it, in case any of you run into something similar.
PROBLEM
I had 1 ESXi Host in a Cluster on which VMs in a certain subnet had no network connectivity when connected to a segregated vSS and PG. No pings. No network connection of any kind. I use vSS and not vDS, so one might rightfully suspect a configuration issue. But I use PowerCLI to configure my Hosts, and all of them were configured using the same PCLI script, so the chances of that being the case were virtually zero. And on the other Hosts in this Cluster, VMs connected to the same subnet (vSS & PG) had connectivity just fine. The following are the troubleshooting steps I took to try and narrow down the cause:
- First, I checked whether the vmnic was set to ‘active’ in the vSS Teaming Policy. Believe it or not, I have had it happen where my vmnic wasn’t in the active state even after being configured with my script (which does set vmnics active when assigning them to vSS PGs). As you can see below, it was all good there:
- I then SSH’d to the Host to check the link state, and that was all good too; everything showed ‘up’ and ‘up’:
- Then, things started looking a little “buggy”. Data was being received, but nothing transmitted, when looking at the nic stats:
- The physical switch port is configured with a specific VLAN but is not trunked, which means no VLAN ID is required in the vSS Port Group. I went ahead and configured a VLAN ID anyway, but still no connectivity.
- I disconnected the network cable from the Host and connected it to my laptop, configured an IP on my laptop in the same subnet, and was able to connect to the subnet fine, so a cable or physical switch port problem was ruled out. The problem had to be either the NIC port itself, which I thought was very unlikely since 1. this Host is fairly new and 2. it was the 2nd port on a dual-port embedded NIC; or something going on in software (i.e. vSphere).
- I looked at one last area just to verify whether the vmnic in question was actually being seen by vSphere as “used” in the vSS Teaming Policy. As noted in my first troubleshooting bullet above, the NIC is configured for the active state, but when going to the vSS area and just “viewing the vSS settings”, the vmnic showed as ‘unused’. BINGO!
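For anyone who would rather script the two checks that gave it away (received-but-not-sent counters, and the vmnic landing in the ‘unused’ list), here is a minimal Python sketch. It assumes you have saved the text output of `esxcli network nic stats get -n <vmnic>` and `esxcli network vswitch standard policy failover get -v <vSwitch>` from the Host, and that the fields appear as 'Key: value' lines. The sample text, the numbers in it, and the name vmnic1 are made up for illustration and were not captured from my Host.

```python
def parse_esxcli_fields(text):
    """Parse 'Key: value' lines from esxcli-style output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip header lines that have no colon
            fields[key.strip()] = value.strip()
    return fields


def adapters(fields, key):
    """Return the adapter names listed under a key such as 'Unused Adapters'."""
    return [name.strip() for name in fields.get(key, "").split(",") if name.strip()]


def rx_without_tx(before, after):
    """True if received packets grew between two samples but sent packets did not."""
    rx = int(after["Packets received"]) - int(before["Packets received"])
    tx = int(after["Packets sent"]) - int(before["Packets sent"])
    return rx > 0 and tx == 0


# Illustrative samples only (hypothetical values, assumed field names).
STATS_BEFORE = "NIC statistics for vmnic1\n   Packets received: 1000\n   Packets sent: 0\n"
STATS_AFTER = "NIC statistics for vmnic1\n   Packets received: 1500\n   Packets sent: 0\n"
POLICY_SAMPLE = (
    "   Load Balancing: srcport\n"
    "   Active Adapters: \n"
    "   Standby Adapters: \n"
    "   Unused Adapters: vmnic1\n"
)

stats_suspicious = rx_without_tx(
    parse_esxcli_fields(STATS_BEFORE), parse_esxcli_fields(STATS_AFTER)
)
unused = adapters(parse_esxcli_fields(POLICY_SAMPLE), "Unused Adapters")
print("RX without TX:", stats_suspicious)
print("Unused adapters:", unused)
```

Taking two stat samples a few seconds apart matters here, since the counters are cumulative and a single reading cannot tell you whether anything is currently being sent.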
RESOLUTION
Ok, so there is a disconnect here between what the software sees and what is actually configured. I attempted to remove the vmnic from the vSS, then re-add it as active, but that did no good. It still showed as ‘unused’. What I ended up having to do to fully resolve this was remove the vmnic, then blow away (delete) the PG and vSS. I then re-created the vSS & PG, added the vmnic back, and voilà!...all good. I now have network connectivity for this subnet on this Host.
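If you want the same teardown and rebuild written down as a repeatable sequence, the sketch below generates the equivalent `esxcli network vswitch standard` commands in the order I described. This is only an illustration and not the exact method I used: the names vSwitch1, Isolated PG and vmnic1 are hypothetical placeholders, and the final `policy failover set` step is my assumption for making the uplink explicitly active. The script only prints the commands, so you can review them before running anything on a Host.

```python
import shlex


def rebuild_commands(vswitch, portgroup, uplink):
    """Return esxcli commands that remove and re-create a standard vSwitch.

    Order follows the fix above: remove the uplink, delete the port group
    and vSwitch, re-create both, then add the uplink back.
    """
    base = "esxcli network vswitch standard"
    vs = shlex.quote(vswitch)
    pg = shlex.quote(portgroup)  # port group names often contain spaces
    up = shlex.quote(uplink)
    return [
        f"{base} uplink remove --uplink-name={up} --vswitch-name={vs}",
        f"{base} portgroup remove --portgroup-name={pg} --vswitch-name={vs}",
        f"{base} remove --vswitch-name={vs}",
        f"{base} add --vswitch-name={vs}",
        f"{base} portgroup add --portgroup-name={pg} --vswitch-name={vs}",
        f"{base} uplink add --uplink-name={up} --vswitch-name={vs}",
        # Assumption: explicitly mark the uplink active after re-adding it.
        f"{base} policy failover set --active-uplinks={up} --vswitch-name={vs}",
    ]


# Hypothetical names; substitute your own vSwitch, port group and vmnic.
for command in rebuild_commands("vSwitch1", "Isolated PG", "vmnic1"):
    print(command)
```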
Hopefully this helps anyone else who may encounter this issue.
Cheers!