Question

Optimal use of server network cards


Userlevel 3

Hi,

 

Veeam newbie here looking for recommendation on how to make optimal use of our network cards.

 

Our servers are HP DL365s with 4 x 1Gbps NICs on a single module and 2 x 10Gbps NICs on a separate module. The plan is to follow the VMware Best Practices document and allocate 2 x 1Gbps for management and vMotion, 2 x 1Gbps for VM traffic (VLAN 111), and 1 x 10Gbps for storage traffic to our NetApp (from which we mount the VMDKs via NFS). Our Veeam BaR server and proxy are both VMs, not physical, and we have separate VLANs for backup and replication (VLAN 333) and for storage (VLAN 222).

 

A colleague says we are over-engineering this and that backup traffic should just flow through the same storage network. Any suggestions?


10 comments

Userlevel 7
Badge +20

You can have it flow the same path as the storage, or dedicate the other 10Gbps NIC to backup traffic. With 10Gbps you can use NBD (network) mode without issue. If you want direct-from-storage SAN mode, then you need a connection to the storage network.
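As a rough illustration of that rule of thumb, here's a minimal Python sketch; it's illustrative logic only, not any Veeam API, and the thresholds simply mirror the advice above:

```python
# Illustrative only: encodes the rule of thumb from the comment above,
# not any Veeam API. Speeds are in Gbps.

def pick_transport_mode(nic_speed_gbps: float, proxy_has_storage_network_access: bool) -> str:
    """Rough decision helper for a VMware backup proxy's transport mode."""
    if proxy_has_storage_network_access:
        # Direct storage access (Direct SAN / Direct NFS) reads straight
        # from the datastore's storage network, bypassing the host.
        return "Direct storage access (SAN/NFS)"
    if nic_speed_gbps >= 10:
        # NBD / network mode is fine once the network is 10Gbps+.
        return "Network (NBD)"
    # On 1Gbps links, hot-add avoids pushing reads over the slow network.
    return "Virtual appliance (Hot-Add)"

if __name__ == "__main__":
    print(pick_transport_mode(10, proxy_has_storage_network_access=False))  # Network (NBD)
    print(pick_transport_mode(1, proxy_has_storage_network_access=True))    # Direct storage access (SAN/NFS)
```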

Userlevel 7
Badge +20

Hi @afn24805!

 

You haven’t told us what storage you’re using for your backup storage yet. Are you presenting the storage to a VM natively via something such as iSCSI, or using a VMDK via NFS?

 

I would want to be sure that however you're planning on recovering these backups doesn't hold any nasty surprises, such as needing to rebuild the ESXi environment to get your backups functional again while having no surviving DNS servers, etc.

Which NFS mode are you using between ESXi and your NetApp?

 

Userlevel 3

Hi MicoolPaul,

 

Our backup storage is a local Synology running a VM that acts as immutable storage, plus a cloud repository from a Veeam service provider.

Userlevel 3

Hi Chris,

 

If I recall the documents correctly, for NFS, Veeam recommends Direct NFS. Is this what you mean by direct from storage SAN mode?

Userlevel 3

Is it advisable to just use the 2 x 10Gbps NICs for everything (with traffic still split into port groups / VLANs)? It would save us a ton of cabling...

Userlevel 7
Badge +20

Hi Chris,

 

If I recall the documents correctly, for NFS, Veeam recommends Direct NFS. Is this what you mean by direct from storage SAN mode?

Yes that is what I meant for NFS.

Userlevel 7
Badge +20

What NIC speed does the Synology have? Is it 1Gbps or 10Gbps?

 

Also, what speed are your switches? 1Gbps or 10Gbps?

 

So far my recommendation would be:

2x 10Gbps NICs used for storage, assuming you're using NFS 4.1 and the NetApp supports NFS multipathing. I tend to work with iSCSI so I can't confirm that feature exists on NetApp, but I'd be surprised if it didn't.

You haven't said what performance the NetApp is capable of delivering, but I'd certainly assume it can deliver more than 1Gbps worth of IO. I'd also recommend having the NetApp and ESXi hosts communicate over 2x dedicated storage switches if possible; this also makes it easier to just enable jumbo frames for your storage, and it prevents broadcast storms etc. from impacting the underlying storage. Failing that, I'd still want the 2x NICs on separate switches, to prevent a single switch failure causing a complete infrastructure outage.
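If you do go down the jumbo frames route, it's worth proving the large MTU actually works end-to-end before relying on it. One rough way is a don't-fragment ping with a jumbo-sized payload; the sketch below assumes a Linux machine on the storage VLAN (on an ESXi host the equivalent would be vmkping -d -s 8972 <target>), and the target addresses are just placeholders:

```python
# Rough sketch: verify jumbo frames work end-to-end by pinging with the
# "don't fragment" flag and a payload that just fits a 9000-byte MTU
# (9000 - 20 IP header - 8 ICMP header = 8972). Assumes a Linux machine
# on the storage VLAN; the target addresses below are placeholders only.
import subprocess

STORAGE_TARGETS = ["192.0.2.10", "192.0.2.11"]  # example addresses only

def jumbo_path_ok(target: str, payload: int = 8972) -> bool:
    """Return True if a non-fragmented jumbo ping to `target` succeeds."""
    result = subprocess.run(
        ["ping", "-M", "do", "-c", "3", "-s", str(payload), target],
        capture_output=True, text=True,
    )
    return result.returncode == 0

if __name__ == "__main__":
    for host in STORAGE_TARGETS:
        status = "OK" if jumbo_path_ok(host) else "FAILED (check MTU on every hop)"
        print(f"{host}: jumbo frames {status}")
```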

 

Moving away from this and towards the rest of the VMware platform:

Are you licensed for vSphere Enterprise Plus? If so, are you using vDS (vSphere Distributed Switches)? If that's the case, you can use Network I/O Control to ensure everything is well balanced between your management VMkernel, vMotion, and VM traffic.
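To make "well balanced" a bit more concrete: under contention, Network I/O Control divides a link's bandwidth in proportion to the shares of the traffic types that are actually busy. A quick sketch of that arithmetic; the share values below are made-up examples, not vSphere defaults:

```python
# Back-of-the-envelope view of how Network I/O Control shares divide a
# link under contention: each active traffic type gets
# link_speed * (its shares / sum of the active types' shares).
# Share values here are arbitrary examples, not vSphere defaults.

LINK_GBPS = 10.0
SHARES = {"management": 20, "vMotion": 50, "vm_traffic": 100}

def bandwidth_under_contention(active):
    """Split LINK_GBPS among the active traffic types, weighted by shares."""
    total = sum(SHARES[t] for t in active)
    return {t: round(LINK_GBPS * SHARES[t] / total, 2) for t in active}

if __name__ == "__main__":
    # Only vMotion and VM traffic are busy at once: ~3.33 / 6.67 Gbps
    print(bandwidth_under_contention(["vMotion", "vm_traffic"]))
    # Everything is busy at once: ~1.18 / 2.94 / 5.88 Gbps
    print(bandwidth_under_contention(["management", "vMotion", "vm_traffic"]))
```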

 

You'll want dedicated interfaces for vMotion, but can I also confirm you have a license that lets you vMotion a virtual machine while it's powered on? If not, there's not much point worrying about this, as a vMotion would just re-register the VMX on another host, with no memory transfer required. Assuming you are licensed for this:

I'd create a 2x 1Gbps vSwitch dedicated to vMotion, to avoid it creating chaos: in some of the more recent vSphere 7.0 releases (U1 or U2, I think) vMotion can easily saturate 25Gbps+ links if it has the capacity.
 

I'd then use the other 2x 1Gbps NICs to create a vSwitch that shares the management and VM port groups, configuring them to use opposite NICs by default, with the other NIC only for failover. Arguably you could make a single vSwitch that incorporates vMotion as well, laid out like the below (there's a quick sketch after the list that sanity-checks the idea):

ESXi Management uses NIC 0, failover to NIC 2
vMotion uses NIC 1, failover to NIC 3

VM Port Groups use NICs 2 & 3, failover to NIC 0
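A minimal way to sanity-check that layout (assuming NICs 0-3 are the four 1Gbps ports) is to encode it as data and confirm that no single NIC failure leaves a traffic class without an uplink:

```python
# Minimal sketch: encode the active/standby layout above and confirm that
# no single NIC failure leaves a traffic class with zero uplinks.
# NIC numbering follows the list above (NICs 0-3 = the 4x 1Gbps ports).

LAYOUT = {
    "ESXi Management": {"active": {0}, "standby": {2}},
    "vMotion":         {"active": {1}, "standby": {3}},
    "VM Port Groups":  {"active": {2, 3}, "standby": {0}},
}

ALL_NICS = {0, 1, 2, 3}

def survives_single_nic_failure(layout):
    """Return True if every traffic class keeps at least one uplink after any one NIC fails."""
    for failed in ALL_NICS:
        for name, uplinks in layout.items():
            remaining = (uplinks["active"] | uplinks["standby"]) - {failed}
            if not remaining:
                print(f"NIC {failed} failure would take down '{name}'")
                return False
    return True

if __name__ == "__main__":
    print("Layout survives any single NIC failure:",
          survives_single_nic_failure(LAYOUT))
```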

 

I'd really try to avoid relying on any bandwidth that vMotion could use, as it will consume everything available unless you have Network I/O Control to rein it in.

 

Finally onto Veeam.

 

I wouldn’t run a VM Proxy on this platform.

Your processing modes are:

Direct NFS

Virtual Appliance / Hot Add

Network

 

Network mode is only recommended with 10Gbps+; it does work with 1Gbps, but performance is limiting, and in your design the traffic would also be going over the shared NICs, so you'd see less than 1Gbps throughput. Virtual Appliance mode attaches the disks to the proxy VM via ESXi, so it can read the data from the storage at 10Gbps, but it still has to send that data to the repository over the shared VM-traffic NICs, again meaning less than 1Gbps throughput.

Direct NFS mode makes the most sense in my opinion, but I'd suggest putting your proxy onto the Synology as a separate VM. That way you can use its NICs in a dedicated capacity for maximum throughput, with no noisy neighbours on an ESXi host to worry about (bonus points if the Synology is 10Gbps), and the proxy can talk to the repository via the Synology's virtual switch to keep the network path efficient.
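To put rough numbers on why the proxy placement matters: each mode's effective throughput is capped by the slowest hop in its data path. A simple sketch using the link speeds discussed in this thread (protocol overhead ignored, and assuming the repository itself isn't the bottleneck):

```python
# Rough model: a backup job's throughput is limited by the slowest hop in
# its data path (the minimum of the link speeds, in Gbps). Speeds reflect
# the setup discussed in this thread; protocol overhead is ignored.

PATHS = {
    "Network (NBD): 10Gbps storage read, 1Gbps shared links to proxy and repo": [10, 1, 1],
    "Hot-Add: 10Gbps storage read, 1Gbps shared VM NICs to the repo":           [10, 1],
    "Direct NFS with the proxy on a 10Gbps Synology":                           [10, 10],
}

if __name__ == "__main__":
    for mode, links in PATHS.items():
        print(f"{mode}: ~{min(links)} Gbps ceiling")
```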

 

How does this sound so far?

Userlevel 7
Badge +6

Michael pretty much hit all the points, but I'll just state what I like to do. This is all assuming there are 10Gb capabilities end-to-end (servers, switching, storage devices).

If I have 10Gb links and enough of them to separate iSCSI traffic from VM traffic, I'll set up trunk ports on one set of 10Gb NICs for management, vMotion, and any VM networks on separate VLANs, and then use dedicated 10Gb ports for iSCSI. If I have separate 10Gb ports I can dedicate to iSCSI traffic, away from vMotion, management, and VM traffic, I'll do that; it sounds like in your case you can't, but that's okay. Of course, use jumbo frames if possible, but again, make sure they're enabled end-to-end: if you don't, you'll end up with packet fragmentation and it will hurt your performance more than you'll gain.

If you're using the NAS as an iSCSI device for the backup repo, I don't have an issue with it sitting on the same iSCSI network as your primary storage, but you CAN move it to a separate VLAN if you'd like. You'll just need separate VMkernel NICs to talk to that iSCSI device, and of course pay attention to how your VMkernel NICs are assigned to the physical ports so that you can use proper port binding if needed.

I will note that I REALLY like to use multi-NIC vMotion, and there's nothing more satisfying than watching 10Gb vMotion after suffering with 1Gb for years... unless, you know, you have 25Gb or 40Gb capabilities. I upgraded my primary datacenter to 25Gb earlier this year after suffering with 1Gb; 10Gb is a huge upgrade, and 25Gb is fun from a geek perspective but not as big a performance increase once things are already really fast.

Honestly, I don't have much use for those 1Gb NICs unless you want to separate management onto a different physical port. That isn't a bad thing, but I'd rather segment it onto another VLAN and get simpler cabling, fewer switch ports consumed, and fewer wires blocking heat/airflow.

Also, as Michael states, I do like direct storage access so that you don't need a proxy server for backups. That said, you may still need a proxy server for restores, depending on your VM configuration. Note that I've only been doing Direct SAN Access; the features and limitations are different for Direct NFS Access.

Userlevel 3

Wow! What an awesome community this is! Thanks to everyone who provided their inputs.

 

The amount of information is overwhelming (but definitely appreciated), so I had to take a moment to try and digest all of it. That, and our 10G switches recently had their 10G uplinks replaced with 2x 40G uplinks, so I got pretty busy.

 

First off, to clarify: our Synology is currently 1G, BUT we can add 10G cards to it, so let's just assume everything is 10G end-to-end (switch ports are 10G as well, just to be clear).

 

Secondly, looking at the monitoring tab in vSphere vCenter, our busiest physical servers (the one that hosts AD/DNS and the one that hosts a SQL VM and a web VM) have never exceeded 60Mbps combined TX+RX.

 

Third, we use vMotion (on running VMs) between physical hosts connected to the same NetApp, so we don't actually move the storage, just the compute resource. For those curious why: we have a second server room on the other end of the campus that contains the exact same physical setup under a different subnet, with "redundant" VMs running there (e.g. second domain controller, second SQL server, second web server, etc.). For the VMs that don't support active-active redundancy (we only have two), we would restore from the Veeam backup into the second server room, but we would need to change the IP and DNS entries. We're licensed for vSphere Standard, so no DRS.

 

Fourth, each server cabinet has two switches for redundancy, but all traffic passes through these two switches (even storage traffic/NFSv3).

 

So, if I’ve digested the extensive info that @MicoolPaul provided, ideally it’d be:

vSwitch0 (2 x 1Gbps) - management, vMotion

vSwitch1 (2 x 10Gbps) - storage (VLAN 222)

vSwitch2 (2 x 1Gbps) - VM traffic (including the Veeam BaR server)

Synology (2 x 1Gbps) - management / BaR traffic (VLAN 333)

Synology (2 x 10Gbps) - storage traffic (VLAN 222)

NetApp (2 x 10Gbps) - storage traffic (VLAN 222)

Run proxy VM (connected to VLAN 333 and 222) and repo VM (connected to VLAN 333) on the Synology.

 

If I want to reduce our cabling and just use the 2x10G NICs on the HPs and Synology, I could get away with:

vSwitch0 (2 x 10Gbps) - management, vMotion, storage (VLAN 222), VM traffic (including the Veeam BaR server)

Synology (2 x 10Gbps) - management / BaR traffic (VLAN 333), storage traffic (VLAN 222)

NetApp (2 x 10Gbps) - storage traffic (VLAN 222)

Run proxy VM (connected to VLAN 333 and 222) and repo VM (connected to VLAN 333) on the Synology.
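To sanity-check my reading of the reduced-cabling option, I sketched it as data below; the management and vMotion VLAN IDs are placeholders I haven't decided on yet, only 111/222/333 are real:

```python
# Quick self-check of the reduced-cabling option: every traffic type rides
# the 2x 10Gbps uplinks as its own port group/VLAN, and the Veeam proxy VM
# needs to sit on both the storage (222) and backup (333) VLANs.
# Management and vMotion VLAN IDs are placeholders, not decided yet.

TRUNK_VLANS = {10, 20, 111, 222, 333}    # VLANs allowed on the 2x 10G uplinks
PORT_GROUPS = {
    "Management": 10,                     # placeholder VLAN ID
    "vMotion": 20,                        # placeholder VLAN ID
    "VM traffic": 111,
    "Storage (NFS)": 222,
    "Backup & replication": 333,
}
PROXY_VLANS = {222, 333}                  # proxy VM on the Synology

if __name__ == "__main__":
    missing = set(PORT_GROUPS.values()) - TRUNK_VLANS
    print("All port-group VLANs are trunked on the 10G uplinks:", not missing)
    print("Proxy VM reaches both storage and backup VLANs:", PROXY_VLANS <= TRUNK_VLANS)
```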

Userlevel 7
Badge +8

I agree with everything said, but here is an interesting situation that can change it all.

 

BLADES!

 

I have a Cisco UCS chassis with 8 ESXi hosts that all use the Fabric Interconnects (2x 10G ports on each side).

 

I can give each ESXi host as many virtual ports as I want; they all use the FIs' 10G ports, and that bandwidth is shared between the 8 hosts.

 

Luckily, a lot of the traffic stays internal to the chassis and doesn't leave it, and I also have fibre for storage on different ports. That being said, Veeam is a pretty big data mover compared to anything else in the network.
