Question

Behaviour is different on each site with same config


Userlevel 4

Hi,

I have 2 locations, each with a Veeam server and a vCenter cluster with 2 hosts.

The configuration is identical on both servers.

At one location, when running a job with application awareness, inventorying the guest system and the other preparation steps take several seconds, but at the other location they take minutes.

It does not fail, but the time of processing is much longer.

The difference I see is that at the ‘slow’ location all communication runs via vCenter. At the fast one, everything goes directly from the B&R server to the VM.

The firewall rules are identical and I don’t see any blocked traffic, so it seems the slow one does not even try to connect directly to the VM.

Any ideas where to look?

The B&R server is in a separate VLAN with explicit firewall rules between them, in line with the manuals.

 

Thank you!


10 comments

Userlevel 7

The ‘prep/processing’ stage at the beginning of a job varies with VM type and transport mode, as well as with the time needed to collect CBT data. So, I assume the jobs use the same transport mode at both sites? (i.e. all using hotadd, DirectSAN, BfSS, or nbd at both locations)

To start with, you can look at the job logs for a given job in C:\ProgramData\Veeam\Backup, in the folder named after the backup job.

Userlevel 7

Without being able to ‘see’ your environment: when you say communication seems to go through vCenter at the ‘slow location’, to me that means the jobs on the VBR server at that location may not be using the correct transport mode and are using nbd instead. You can easily check this by opening (double-clicking) a backup job and looking at the transport mode (nbd = network). Open a backup job at the ‘fast location’ and see which mode is used there. If the slow location is using nbd and the fast one is using hotadd or ‘san’ (i.e. DirectSAN), then there is your answer as to why the speed/time is slower.
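
If you would rather check the logs than click through each job, here is a minimal Python sketch. It assumes the job log contains lines shaped like the one quoted further down in this thread ("Using backup proxy ... for disk Hard disk 1 [san]"), with the transport mode as the tag in square brackets. The function names and the "*.log" file pattern are my own assumptions, not anything Veeam documents, so adjust them to what you actually find in the job folder.

```python
import re
from collections import Counter
from pathlib import Path

# Matches lines shaped like the one quoted in this thread:
#   Using backup proxy VMware Backup Proxy for disk Hard disk 1 [san]
PROXY_LINE = re.compile(
    r"Using backup proxy (?P<proxy>.+?) for disk (?P<disk>.+?) \[(?P<mode>\w+)\]"
)


def transport_modes(lines):
    """Count the transport mode tags found in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = PROXY_LINE.search(line)
        if match:
            counts[match.group("mode").lower()] += 1
    return counts


def scan_job_folder(folder):
    """Scan every *.log file in a job's log folder (assumed file pattern)."""
    counts = Counter()
    for path in Path(folder).glob("*.log"):
        # errors="replace" keeps the scan going if a log has odd encoding.
        with open(path, encoding="utf-8", errors="replace") as handle:
            counts.update(transport_modes(handle))
    return counts


if __name__ == "__main__":
    sample = ["Using backup proxy VMware Backup Proxy for disk Hard disk 1 [san]  "]
    print(transport_modes(sample))
```

Run scan_job_folder against the job's log folder at each site and compare the counts. If one site shows mostly nbd and the other san or hotadd, the transport mode is the difference.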

Userlevel 7

Along with what Shane has mentioned, is the storage identical in both locations, and are you using FC or iSCSI for connections? Also, is everything virtual for VBR, or are there any physical elements?

Userlevel 7

At the slow site, could you please test guest processing via ‘Test Now’ and share the results here?

https://helpcenter.veeam.com/docs/backup/vsphere/backup_job_vss_vm.html?ver=120

Userlevel 4

Hi, both sites are connected via iSCSI to the shared storage device (Nimble), and the DirectSAN connection is functional at both sites. On both jobs:

Using backup proxy VMware Backup Proxy for disk Hard disk 1 [san]  
It is the guest processing itself that behaves differently.

The good site:

[Screenshot: Guest processing]

The bad one:

[Screenshot: Guest processing]

I’m diving into the logs now, and I will come back with more details.

Could it be that something is failing at site 1 (the fast one), so that it takes an alternative route which is much faster?

Thanks all already!

 

Userlevel 4

There is one difference in the hardware of the ESXi hosts themselves. The slow site runs on Dell servers, the fast one on HPE ProLiant. Maybe there is a difference in the ESXi config because it was installed from an OEM ISO?

Userlevel 7

Mmm... that shouldn’t be the case, @mike.beirens.

Userlevel 7

Hardware should not matter. More than likely, there is a configuration difference somewhere between the sites causing this.

Userlevel 4

Hi, because of another failure that I created a case for, they saw that VIX is used instead of RPC, and that would explain the delays. I modified the regkey from KB1230: Win32 error: The network path was not found. Code 53 (veeam.com) (set it to 0), and now the job runs very fast without delays.

The strangest thing is that the other server has the same setting as well and wasn’t causing the delays, probably because VIX was already failing there and falling back to RPC.

I will try it on the other server as well and see whether the errors in the logs are reduced.

Thanks all!
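
To make the reasoning above concrete (same setting at both sites, different behaviour because VIX fails at one site and falls back to RPC), here is a toy Python model. It is only a sketch and not Veeam code. The function names, the vix_first flag standing in for the KB1230 registry value, and all timings are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    works: bool      # does this protocol reach the guest at all?
    seconds: float   # time spent on the attempt, whether it succeeds or fails


def protocol_order(vix_first):
    """Hypothetical order implied by the setting: VIX first, else RPC first."""
    return ["VIX", "RPC"] if vix_first else ["RPC", "VIX"]


def guest_connect(order, attempts):
    """Try protocols in order; return (protocol used, total seconds spent)."""
    elapsed = 0.0
    for name in order:
        attempt = attempts[name]
        elapsed += attempt.seconds
        if attempt.works:
            return name, elapsed
    return None, elapsed


# Slow site: VIX works but is slow, so the job never falls back to RPC.
slow_site = {"VIX": Attempt(True, 120.0), "RPC": Attempt(True, 5.0)}
# Fast site: VIX fails quickly, so the job falls back to RPC.
fast_site = {"VIX": Attempt(False, 2.0), "RPC": Attempt(True, 5.0)}

print(guest_connect(protocol_order(True), slow_site))   # ('VIX', 120.0)
print(guest_connect(protocol_order(True), fast_site))   # ('RPC', 7.0)
print(guest_connect(protocol_order(False), slow_site))  # ('RPC', 5.0)
```

With VIX tried first, the site where VIX works slowly pays the full VIX cost, while the site where VIX fails fast ends up on RPC almost immediately. Trying RPC first removes the difference, which would match what was seen after changing the regkey.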

Userlevel 7

Try to force the transport mode and proxy. If it fails, troubleshoot it that way. Often, if a firewall rule or something on the network is not routed correctly, it will fail over to a slower method.
