Solved

Veeam Failback Takes More Time - Veeam Version 11


Userlevel 3

Hello Team,

We are running Veeam Backup & Replication version 11.

The goal is to replicate on-premises VMware-based VMs to a DR location (in this case, a VMware-based cloud).

On-premises we have the primary VBR instance, which mainly runs the backup jobs for the VMs and stores the backups locally.

In the cloud we have a secondary VBR instance (with the source proxy enabled on it), which handles replication of the VMs from on-premises to the cloud. We also use Veeam Enterprise Manager to manage the primary and secondary VBR instances.

We notice that failback takes a long time even when very little data has changed (for example, after a test failover lasting only a few minutes). The bandwidth between on-premises and the cloud is 100 Mbps over a VPN.

Example: if we fail over a test VM (100 GB) for a few minutes and then fail back, the failback takes more than an hour to complete.
We also noticed that during failback Veeam reads all the hard disks attached to the VM, and this takes a long time. Is there a way to speed this up so that the overall failback time is reduced?

Note: Failover is quick; the issue is only with failback.

Is there a way we can decrease the time taken to do the failback operation? 

Would increasing the bandwidth help?

Thank you in advance

~ Ravi Kumar S

Best answer by RaviKumarS 20 October 2022, 13:43

7 comments

Userlevel 7

Hi @RaviKumarS, based on a previous answer on the forums: during replica failback, Veeam B&R needs to calculate and synchronize the differences between the original VM and the replica VM. The time required to scan the VM image depends on its size and is comparable to the full backup time of that VM. It is not just a CBT-based pass over changed blocks, as with an incremental job.
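To make the "scan time depends on size" point concrete, here is a rough back-of-the-envelope sketch in Python. It is not a Veeam formula: the function name and the read rates are assumptions chosen for illustration, and real scan times also depend on proxies, storage, and transport mode.

```python
def full_scan_minutes(disk_gb: float, read_mb_per_s: float) -> float:
    """Minutes needed to read an entire disk once at a sustained read rate.

    Uses decimal units (1 GB = 1000 MB). The read rate is an assumed,
    measured-by-you value, not something Veeam reports in this form.
    """
    if read_mb_per_s <= 0:
        raise ValueError("read_mb_per_s must be positive")
    return disk_gb * 1000 / read_mb_per_s / 60


if __name__ == "__main__":
    # Hypothetical sustained read rates for a 100 GB VM.
    for rate in (30, 60, 120):
        print(f"{rate} MB/s -> {full_scan_minutes(100, rate):.0f} min")
```

The takeaway is that a full read of the disks costs the same whether 1 MB or 50 GB changed, which matches the behavior described in the question.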

Let’s see if someone on community side will add some more useful details!

Userlevel 2

 

Hi,

If you fail back to the original VM in the original location, you can enable Quick Rollback. This uses CBT to query for the changed blocks instead of reading all the disks.

 

https://helpcenter.veeam.com/docs/backup/vsphere/failback_quick_rollback.html?ver=110
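As a toy illustration of why a changed-block list is cheaper than a full comparison, the Python sketch below contrasts the two approaches on in-memory byte strings. This is a simplified model, not Veeam's or VMware's implementation: `BLOCK_SIZE`, `full_scan_diff`, and `tracked_diff` are hypothetical names, and the "tracked" set stands in for whatever CBT would report.

```python
import hashlib

BLOCK_SIZE = 4  # bytes; deliberately tiny so the example is easy to follow


def split_blocks(disk: bytes) -> list:
    """Split a disk image into fixed-size blocks."""
    return [disk[i:i + BLOCK_SIZE] for i in range(0, len(disk), BLOCK_SIZE)]


def full_scan_diff(original: bytes, replica: bytes):
    """Find differing blocks by reading and hashing every block on both sides.

    Assumes both images have the same length.
    Returns (changed block indices, number of blocks read).
    """
    orig_blocks = split_blocks(original)
    rep_blocks = split_blocks(replica)
    changed = [
        i for i, (a, b) in enumerate(zip(orig_blocks, rep_blocks))
        if hashlib.sha256(a).digest() != hashlib.sha256(b).digest()
    ]
    return changed, len(orig_blocks) + len(rep_blocks)


def tracked_diff(changed_indices: set):
    """Use a pre-recorded set of changed block indices; only those are read."""
    return sorted(changed_indices), len(changed_indices)


if __name__ == "__main__":
    original = bytes(40)            # 10 blocks of zeros
    replica = bytearray(original)
    replica[9] = 1                  # one byte changed, inside block index 2
    print(full_scan_diff(original, bytes(replica)))
    print(tracked_diff({2}))
```

Both approaches identify the same changed block, but the full scan touches every block on both sides while the tracked variant touches only the blocks already known to have changed.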

Userlevel 3

Thank you @ravatheodor @marcofabbri for your quick response!

@marcofabbri @ravatheodor - Tomorrow we will try the Quick Rollback option, because our failback always goes to the original location and the original VM. I will keep you posted on how it goes.

Could anyone also share some thoughts on bandwidth? Does it make sense to increase it from 100 Mbps to something higher? Would that help us in any way?

Userlevel 2

Bandwidth will impact your replication, failover and failback. How much depends on the amount of changed data that needs to be sent across. If you saturate the link, one option is to put WAN accelerators in place; after that, you could consider increasing the bandwidth. Also look at how the VPN terminators are performing.
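A quick way to judge whether the link is the limiting factor is to estimate how long the changed data alone would take to cross it. The Python sketch below does that arithmetic; the function name and the 0.8 efficiency factor (a guess at VPN and protocol overhead) are assumptions, not measured values.

```python
def transfer_minutes(data_gb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Minutes to move data_gb over a link of link_mbps.

    Uses decimal units (1 GB = 8000 megabits). `efficiency` is an assumed
    fraction of the nominal rate that is actually usable.
    """
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_mbps must be > 0 and efficiency in (0, 1]")
    usable_mbps = link_mbps * efficiency
    return data_gb * 8 * 1000 / usable_mbps / 60


if __name__ == "__main__":
    # Hypothetical: 2 GB of changed data on a 100 Mbps vs. a 1 Gbps link.
    for mbps in (100, 1000):
        print(f"{mbps} Mbps -> {transfer_minutes(2, mbps):.1f} min")
```

If the estimate for the changed data is a few minutes but the failback takes an hour, the extra time is being spent somewhere other than on the wire, and more bandwidth alone is unlikely to help much.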

Userlevel 7

Hi there!

In my personal experience, when this happened to us the cause was slow disks rather than bandwidth. Are you restoring or failing over to an SSD datastore, or to a datastore on mechanical disks?

The amount of changes matters too, but if there are not many changes it shouldn't take forever.

Keep in mind that the time depends on a lot of variables: network, availability of source and destination, Veeam B&R server resources, disk speed, etc.

So it would be good to give us some extra info and to run some tests yourself, for example replicating and then doing a planned failover, so you can estimate the time with "no changes at all", then with 12 or 24 hours of changes, and so on.

cheers.

Userlevel 3

@HunterLAFR Thank you -

  • It is failing back to normal SAN storage, not SSD.
  • There are very few or no changes, because we were only testing failover and failback.
  • I will first try the Quick Rollback option suggested by @ravatheodor to see if it makes any difference.
  • Later we will test the other factors one by one: the resources on the VBR instance, network connectivity and how the VPN handles traffic between source and destination in both directions, the underlying disks, etc.
Userlevel 3

Team,

Just to update: we have tested with the Quick Rollback option, and failback is now much faster.

Without Quick Rollback: failback of 2 VMs used to take 1 hour.

With Quick Rollback: failback of 2 VMs takes only about 30 minutes.

Thank you, everyone, for the quick responses.

~ Ravi Kumar S
