Solved

Dedicated Transport Network


Userlevel 3

I have a 2-node ESXi 7.0.3 host setup and am running Veeam v12 to back up 5 VMs to an NFS share (all production VMs exist on node A; the NFS share is a BSD VM on node B).  The idea is that all production VMs can currently run on a single host, and a backup copy of those can be created in a VM on host B.  Each server has two 10Gb network interfaces, and I have connected both NICs to a port group and virtual switch, along with creating a vmkernel interface on both hosts referencing the correct port group.  After reconfiguring VMware, I rescanned both ESXi hosts in Veeam and started the backup job.  I am not convinced that the process is using the new 10Gb link between the 2 hosts for the transport process.  What am I missing?
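For reference, the host-side setup described above would look something like this from the ESXi shell. This is only a sketch: the vmnic numbers, switch/port-group names, and IP addresses are assumptions, not the actual values from my hosts.

```shell
# Sketch of the direct-connect 10Gb setup on each host.
# vmnic2/vmnic3, vSwitch1, "Backup-10Gb", and the IPs are placeholders.
esxcli network vswitch standard add --vswitch-name=vSwitch1
esxcli network vswitch standard uplink add --vswitch-name=vSwitch1 --uplink-name=vmnic2
esxcli network vswitch standard uplink add --vswitch-name=vSwitch1 --uplink-name=vmnic3
esxcli network vswitch standard portgroup add --portgroup-name=Backup-10Gb --vswitch-name=vSwitch1
esxcli network ip interface add --interface-name=vmk1 --portgroup-name=Backup-10Gb
esxcli network ip interface ipv4 set --interface-name=vmk1 --ipv4=10.10.10.11 \
    --netmask=255.255.255.0 --type=static

# Verify host-to-host reachability over the new vmkernel interface
# (host B would use 10.10.10.12 in this sketch).
vmkping -I vmk1 10.10.10.12
```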


Best answer by beenydu 22 February 2024, 09:24


25 comments

Userlevel 3

Thank you coolsport00, that's correct. My desire would be to utilize these 10Gb NICs for the backup process to the backup repository instead of using the management NIC/vmkernel.  The VM storage is on local datastores on these 2 hosts, and the 10Gb NICs are not assigned to any other function.

Userlevel 7
Badge +17

By "transport process" do you mean the backup process? Veeam won't use your ESXi hosts' 10Gb NICs because those NICs are used by the hosts to connect to your storage. Veeam uses proxies to do the transporting of VM backup data. By default, this is the VBR server. With your small environment, I assume the proxy used is indeed your VBR server. You can check/verify by looking at the backup job history (double-click the backup job), then look at the stats to see what was used as your proxy.

Userlevel 3

Also, I failed to add: these 2 hosts are in a remote datacenter, and the Veeam backup server managing these backups is in another, local datacenter.  There is a Linux proxy VM on host A and on host B.

Userlevel 3

Thank you all for the prompt and courteous answers!  I will give this a try and report back!

Userlevel 3

Good evening all, quick update on the progress of this 10Gb dedicated nic project.  The following progress has been made:

-Added new virtual switch and port group for the 10Gb dedicated network

-Added a network adapter to the host A proxy, host B proxy, and host B NFS server.  All servers can ping each other on this new layer 2 10Gb network

-Updated the NFS server to allow connections on the new 10Gb network (/etc/exports)

-Added a new NFS storage repository and assigned the host A and host B proxy servers as gateway servers. At the end of this wizard, there was a prompt to import the backup history, which I accepted.  This process ran for 2 hours before I finally rebooted the server, assuming the process was hung.

-After VBR12 rebooted, the backup jobs were updated to the new storage repository and I started one of the backup jobs.  The backup process seems to be hung at this point, on the steps "Getting VM info from vSphere" and "Initializing storage".  Any idea what this process could be waiting on?  I have scoured the logs on the Linux proxy servers (/var/log/VeeamBackup) and, while I see logs being generated for the backup process, I don't see anything that would indicate a step is failing or waiting on another process to complete.
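For anyone following along, the /etc/exports change on the FreeBSD NFS VM looked roughly like this. The export path, network, and options here are placeholders, not my actual configuration.

```shell
# /etc/exports on the FreeBSD NFS server -- a sketch, not the actual file.
# Restrict the export to the non-routed 10Gb backup network; path and
# mapping options are assumptions.
/backups -alldirs -maproot=root -network 10.10.10.0 -mask 255.255.255.0
```

After editing the file, mountd needs to re-read it, e.g. `service mountd reload` on FreeBSD.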

Thank you all for following me on this journey and for everyone’s help so far!

Userlevel 7
Badge +21

Are the hosts using the 1Gb network cards?  Check the properties of vCenter or the hosts in the Veeam console and edit them to see if they complete.  Then test again. It could be that Veeam is trying to connect to the hosts or vCenter and cannot.

Userlevel 3

@Chris, the ESXi hosts are using the 1Gb network adapters for the default vmkernel management.  I will definitely check this out again.  

Userlevel 7
Badge +21

Are the proxy and other servers using the port group with the 10Gb NICs?  Depending on the transport mode, traffic will go over these NICs or use the management network.  But if you have dual 10Gb, you can put everything, including management, on them.

Userlevel 7
Badge +17

To transport backup data, Veeam will use your proxies...in this case the Linux VMs. And, since they are VMs, your jobs should be using hotadd. The only way to use those 10Gb NICs would be if your VMs are assigned to the port group those storage NICs are configured for. Generally, this isn't done, as storage is usually on a separate network from the VM network.

Userlevel 3

Thank you Chris, the proxy servers are currently using a shared 1Gb port group.  Can I add a 2nd NIC to the Linux proxy VMs to allow them to access the 10Gb port group?

The issue is that I do not currently have 10Gb switching capability at this datacenter, and I thought that if 10Gb host-to-host would work, I could use this methodology until 10Gb switching was available.

Userlevel 7
Badge +17

If your Storage & mgmt network is all the same subnet, you should be able to. 

Userlevel 7
Badge +21

Not a problem let us know how it works out. 

Userlevel 7
Badge +17

Sure...no problem. Keep us posted. 

Userlevel 7
Badge +6

Just so I'm following along: are the 10Gb NICs on each host direct-connected to each other, and you want to send backup traffic between the two?

If the proxies are in the same port group as the 10Gb uplinks, I don't see why they wouldn't be able to use the NICs in those port groups to transfer the data.  However, the backup server needs to be able to communicate with the proxies as well, so I'm guessing you'd need multiple NICs on the proxies: one to communicate with the VBR server and another for the data traffic to the repository.  Once that is done, you should be able to go into VBR, open the hamburger menu > Network Traffic Rules > Networks, and set your preferred networks on that screen so that Veeam prefers the subnet that is using the 10Gb links.  I haven't played with this sort of setup before, but it seems to make sense to me.

 

VBR Global Traffic Rules - Setting Preferred Networks for Backup and Replication Traffic

 

Userlevel 3

Thank you Derek, that is correct: the hosts are directly connected to each other on the 10Gb links, and each host has additional NICs directly connected to a firewall appliance serving up the additional VLANs.  Image attached to hopefully explain the topology.  The proxy servers on hosts A and B have a management IP on the same network as the ESXi host vmkernel, along with a 2nd 10Gb NIC on the layer 2 network (not routed through the firewall).  This 10Gb virtual switch is also connected to the NFS server as a 2nd 10Gb NIC.

 

Userlevel 3

I have added the preferred network for backup and am trying a test job now to see whether it is respected.  Thanks again everyone for your support with this configuration.  Hopefully this will be the ticket!

Userlevel 7
Badge +17

Are the Proxies on A & B assigned to the datastore storage on your NFS server? It appears they may not be and thus cannot find the VMs you’re trying to process.

Backup job logs can be found in C:\ProgramData\Veeam\Backup\<backup-job-name-folder> on the VBR server. See if a log in this folder shows anything on what Veeam is waiting on.

Userlevel 7
Badge +17

Nice! Keep us posted @beenydu .

Userlevel 7
Badge +21

@Chris, the ESXi hosts are using the 1Gb network adapters for the default vmkernel management.  I will definitely check this out again.  

Yeah, because the proxy still needs to connect to mgmt for reading data.

Userlevel 7
Badge +21

Thank you Chris, the proxy servers are currently using a shared 1Gb port group.  Can I add a 2nd NIC to the Linux proxy VMs to allow them to access the 10Gb port group?

The issue is that I do not currently have 10Gb switching capability at this datacenter, and I thought that if 10Gb host-to-host would work, I could use this methodology until 10Gb switching was available.

Unfortunately, it does not work the way you are thinking with host-to-host.  You can try the second NIC, but my guess is it won't work until you have switching.

Userlevel 3

Good morning all, I apologize for the delay in getting back to you with an update (too many projects lately!).

It was mentioned above that I could add the 10Gb NIC network as a Preferred Network within the Global Network Traffic Rules.  Unfortunately, this did not work for this situation, as it created a relationship between the VBR server (remote network) and the dedicated 10Gb backup network: using the Preferred Network caused the VBR server to attempt to connect to this dedicated network itself instead of leaving that to the backup proxy servers.  Essentially, my goal was for only the backup proxy servers to connect to the NFS exports on the dedicated 10Gb backup network.

After researching logs from the VBR server and the backup proxy servers (/var/log/Veeam), I was able to determine that the backup proxy servers could mount the NFS share on the dedicated 10Gb backup network; however, the backup file was 0 bytes in size.  I reviewed the NFS server to ensure the NFS export allowed read/write connections, and I found I needed to make a few tweaks to the /etc/exports file due to some constructs within the FreeBSD OS.  Unfortunately, this made no difference to the progress.
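A quick way to reproduce this symptom outside of Veeam is to mount the export from one of the Linux proxies and attempt a write by hand. The server IP, export path, and mount point below are placeholders:

```shell
# Read/write probe of the NFS export from a Linux proxy.
# 10.10.10.30, /backups, and /mnt/nfs-test are assumptions.
sudo mkdir -p /mnt/nfs-test
sudo mount -t nfs 10.10.10.30:/backups /mnt/nfs-test

# If the server is silently refusing writes (e.g. due to a failed
# reverse lookup of the client), the mount succeeds but this write fails.
echo test | sudo tee /mnt/nfs-test/write-probe

sudo umount /mnt/nfs-test
```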

The frustrating part was that the backup job on the VBR server appeared to stall at "Getting VM info from vSphere" and "Initializing storage".  I saw very clear evidence that the VBR server was able to connect to vSphere, download and parse the XML file, and disconnect successfully.

I FINALLY had a breakthrough while remembering an issue I ran into years ago with a much larger NFS network and VMware environment.  Everything was working beautifully and then, out of nowhere at 2pm on a Friday, BOOM!  All datastores in VMware became disconnected and all virtual machines went offline; just an instant hard STOP.  We researched for a few hours and finally determined that the NFS server was trying to do a reverse DNS lookup of the client IP addresses; because this was a dedicated 10Gb NIC network and none of its layer 2 IP addresses were in DNS, the NFS server suddenly started blocking the read/write functions of the NFS export.  Lightbulb!  As a last-ditch effort to make this work, since the IP addresses on the dedicated 10Gb backup network are layer 2, non-routed addresses, I added the IP address of the 10Gb network adapter of each backup proxy server to the /etc/hosts file on the NFS server and BAM!!  That finally flipped the switch, and I am now able to use that network.  The backup jobs are now streaming at 250+ MBps and backup time is significantly reduced.
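The fix boils down to static host entries like these on the NFS server, so its reverse lookup of the non-routed client IPs succeeds locally. Addresses and hostnames here are placeholders, not my actual values:

```shell
# /etc/hosts on the FreeBSD NFS server -- placeholder addresses/names.
# Static entries let the NFS daemons resolve the non-routed 10Gb client
# IPs, so the failed reverse-DNS lookup no longer blocks the export.
10.10.10.21  proxy-a-10g
10.10.10.22  proxy-b-10g
```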

So the order of operations on the Backup Job for this installation is:

-VBR connects to vSphere; the backup proxy performs a HotAdd of the virtual machine's VMDK files; the backup proxy connects to the NFS export and writes the backup data across.

It is my hope that this process helps someone else trying to figure out how this works.  I truly believe this would be helpful to an SMB shop with a 2-host ESXi environment, giving them solid, quick backup jobs secured behind a proxy server.  If I can help anyone with this process, please don't hesitate to reach out; you all have been wonderful in helping me along the way!  Thank you again for everyone's support with this design.

Userlevel 7
Badge +17

Great to hear you got it sorted @beenydu . Well done! 

Userlevel 7
Badge +6

This is very interesting; thank you for posting your findings.  I'm sure this will be useful to someone else down the road!

Userlevel 7
Badge +21

Great to hear you resolved the issue and gave the details for everyone to see.

Userlevel 7
Badge +6

Thanks for the visual.  Let me know if that works or we need to dig deeper.
