Do you actively use Veeam Replication?
If so, how long has it been since you implemented it? I don’t currently use it, but I’m in the process of getting it going again after about an 8-year hiatus. Not too much has changed since then, but there have been some enhancements you might not know about, and some behaviors you either never knew about or forgot. Below, I share some tidbits and behaviors I feel you should be aware of when making design decisions for a Veeam Replication implementation.
Before getting into the “Don’t Know/Forgot” items, let me briefly review how Replication works at a high level. First, Replication is job-driven: you set up a job and configure its various settings. Next, depending on the source data you want to replicate, you can replicate either directly from your production environment or from other locations. I’ll touch more on this later. Lastly, the first run of a Replication job creates a fully functional VM on the target, and subsequent runs create incremental Restore Points, which in reality are native VM snapshots. And for the most part, that’s it. Now let’s discuss what you may not have known, or just forgot, about Veeam Replication. (I will be specifically referencing Replication with vSphere, although it is supported with Hyper-V as well.)
- First, and this is probably one of the most important behaviors to pay attention to: if you modify the disk size of a source VM you’re replicating, all Restore Points of the replicated VM are removed during the next Replication job run. This behavior occurs due to VMware limitations, and it is important to know if you have aggressive data retention SLAs within your organization
- Assuming your source data comes directly from your production environment and no other Veeam jobs or tasks higher in the priority chain hinder your Replication job, the job can in theory run for up to 21 days to get data to the target. Who knew! (Um... I didn’t.) This could happen for a few reasons: the initial run of a Replication job has TBs or more of data; the link between the source and DR sites is slow; there are no WAN Accelerators to help speed the data transfer along; etc. If your job cannot finish within this timeframe, it will stop and return a Failed status
- Replication cannot use DirectSAN Transport Mode. If you use a physical machine as a Source Proxy, it will use Network Mode (nbd). Using a VM as a proxy results in the Virtual Appliance (hotadd) mode being used
- A source-side Repository is needed to store Replication job metadata, which contains information on the read data blocks, such as checksums and digests. This metadata is also used when restoring a VM from a Replica to the original location with the Quick Rollback feature, to detect changed blocks between the two states. Something you may not know: the source Repository cannot be a scale-out repository or a dedup appliance
- Since VM Replicas are actual VMs, they are stored in their native format. As such, they are not compressed. Compression does occur during Replication job runs, but happens on the Source Proxy, while the Target Proxy decompresses the data before storing it
- VMware allows a maximum of 32 snapshots in a snapshot tree. What you may not know is that, because Veeam can need up to 4 snapshots for various Replication tasks (e.g. creating helper snapshots), the maximum number of Restore Points allowed per Replica is 28
- You can use Veeam Replication not only to source directly from your production environment, but also to Replicate from Backups, specifically from a Veeam Repository where Backups (or Backup Copies) reside. This technology lessens the load on your vSphere environment, as well as on production VMs, since snapshots are not needed. Pretty cool stuff. But what you may not have known is that, to use this feature, the Backup Job needs to reside on the same VBR Server that runs your Replication jobs. This can be a problem because Veeam best practice is generally to use a separate VBR Server at your DR site to run your Replication jobs, so you have a VBR Server available during a prod-site disaster. This doesn’t necessarily mean you can’t use this feature, but to do so, you have to import the Backups from the Repository at the source-side location, create a new Backup Job on the Replication VBR Server, and map the imported backup to the Replication job, configuring the Get Seed From the Following Repository option. That makes this feature a little “clunky” to use. But if you do onsite Replication, where you use 1 VBR Server for both Backup and Replication, this is a more viable feature. Something else about this technology you may not have known: when you use Replica From Backup, the Restore Points (snapshots) created do not carry the timestamp of when the Replication job ran, but rather the timestamp of when the Restore Point in the Backup job was created. Keep this in mind when determining your Recovery Point needs
- Veeam can also lessen the load on your WAN using either Replica Seeding or Replica Mapping. Seeding is when you have copies of your Backups at your DR site and configure your job to use them to hydrate your replicas; Mapping is when you have VM copies and use those to hydrate your replicas instead. Pop quiz time! Since Seeding and Replica From Backup both use Backups as a source, what is the difference between the two? *honk* Time’s up. The difference is that with Seeding, only the first run of the Replication job uses the Backup location as a source, and subsequent runs use the production environment, whereas Replica From Backup always uses the Backup Repository as the job source. Pop quiz #2: Can a Replication job use both Seeding and Mapping? And can a job have some VMs you want Seeded or Mapped, and some not? Answer: yes to the first part, and no to the second. You can have both Seeding and Mapping in a Replication job. But if the job includes VMs you don’t want to use Seeding or Mapping for, those VMs will be skipped from processing
- Re-IP functionality is available only for Windows VMs. And for IP ranges, you have to use the asterisk (*) as the last octet, not .0, for example: 172.16.0.*
- If you configure a Replication job to run continuously and set the “Wait before each retry attempt” option to a given number of minutes, Veeam will ignore this setting and retry the job immediately, with no time interval between retries
- Oh look…another Pop Quiz! Did you know Replicas do not have 3 states, but rather 4? Name them – <waiting>
Ok, here they are. I’m sure you got 3 of them, but maybe didn’t know the 4th. They are:
> Ready to switch – this is the one you probably didn’t know. This status refers to the point of the Replica after Failover, but before committing a Failback operation
Did you get them all? Kudos if you did.
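Circling back to the Re-IP wildcard mentioned above, here is a minimal Python sketch of how an asterisk-style rule could be read. This is only an illustration of the idea, not Veeam's implementation: the function name `apply_reip_rule` and the target range `10.0.0.*` are made up for the example, and treating `.0` as a literal octet is an assumption of this sketch.

```python
from typing import Optional


def apply_reip_rule(ip: str, source_rule: str, target_rule: str) -> Optional[str]:
    """Remap `ip` if it matches `source_rule`; return None when it does not match.

    Hypothetical helper for illustration only. An asterisk in a rule means
    "any value in this octet"; any other value must match exactly.
    """
    ip_octets = ip.split(".")
    src_octets = source_rule.split(".")
    dst_octets = target_rule.split(".")
    if not (len(ip_octets) == len(src_octets) == len(dst_octets) == 4):
        raise ValueError("expected dotted IPv4 values with four octets")

    # Every non-wildcard octet in the source rule must equal the VM's octet.
    for have, want in zip(ip_octets, src_octets):
        if want != "*" and have != want:
            return None

    # Wildcard octets in the target rule keep the VM's original value.
    return ".".join(
        have if new == "*" else new for have, new in zip(ip_octets, dst_octets)
    )


# Asterisk as the last octet: the rule covers the whole range.
print(apply_reip_rule("172.16.0.25", "172.16.0.*", "10.0.0.*"))

# ".0" as the last octet: in this sketch it only matches the literal address.
print(apply_reip_rule("172.16.0.25", "172.16.0.0", "10.0.0.0"))
```

With these inputs, the first call returns the remapped address and the second returns `None`, which is the practical reason to write ranges with the asterisk rather than `.0`.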
- As stated above, Replica From Backup always uses data from a Backup as its source, correct? Wrong! Ha. OK, so it mostly does. The only time this is not true is when performing a Planned Failover. When this process starts, Veeam powers down the source VM and replicates incremental data from the production environment, not from the Backup source it normally uses, to bring the Replica to the most current state of the source VM
- Unlike Planned Failovers, where the source VM(s) get powered down and VMs are processed sequentially (one by one), in a Failover Plan the source VMs are not powered down, and VMs can be processed in groups. Keep this in mind (source VMs are not powered down) if your source VMs are still in production when initiating a Failover Plan. You can process 10 VMs at a time in a Failover Plan. But when performing an Undo Failover, you can only process up to 5 VMs at a time
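To pull the numeric limits from the list above into one place, here is a rough Python sanity-check sketch. The limits themselves (21 days, 32 snapshots minus up to 4 helper snapshots, 10 and 5 VMs at a time) come from this post; everything else is an assumption for illustration: the function names, the 70% link-efficiency factor, decimal TB, and the simplification that VM groups are processed one group after another.

```python
import math

# Limits as described in this post.
MAX_JOB_RUNTIME_DAYS = 21
VMWARE_SNAPSHOT_LIMIT = 32
VEEAM_HELPER_SNAPSHOTS = 4
MAX_RESTORE_POINTS = VMWARE_SNAPSHOT_LIMIT - VEEAM_HELPER_SNAPSHOTS  # 28
FAILOVER_PLAN_GROUP_SIZE = 10
UNDO_FAILOVER_GROUP_SIZE = 5


def initial_sync_days(data_tb: float, link_mbps: float, efficiency: float = 0.7) -> float:
    """Estimate days needed to move `data_tb` (decimal TB) over a `link_mbps` link.

    `efficiency` is an assumed fraction of the link actually usable by the job;
    compression and WAN acceleration are ignored in this rough model.
    """
    bits_to_move = data_tb * 8e12
    usable_bits_per_second = link_mbps * 1e6 * efficiency
    return bits_to_move / usable_bits_per_second / 86_400


def retention_fits(restore_points: int) -> bool:
    """True if the requested number of Restore Points is within the per-Replica cap."""
    return 1 <= restore_points <= MAX_RESTORE_POINTS


def groups_needed(vm_count: int, group_size: int) -> int:
    """Number of groups needed if VMs are handled `group_size` at a time."""
    return math.ceil(vm_count / group_size)


# Would a 50 TB initial run over a 200 Mbps link finish inside the 21-day window?
print(initial_sync_days(50, 200) <= MAX_JOB_RUNTIME_DAYS)

# Is a 30-point retention setting possible for a Replica?
print(retention_fits(30))

# How many groups for 23 VMs in a Failover Plan versus an Undo Failover?
print(groups_needed(23, FAILOVER_PLAN_GROUP_SIZE), groups_needed(23, UNDO_FAILOVER_GROUP_SIZE))
```

Treat the transfer estimate as a back-of-the-envelope figure only; real throughput depends on change rate, compression, proxies, and whether WAN Accelerators are in play.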
Well, that’s it. Was there anything that “got you”? When reviewing the Veeam documentation, there were certainly quite a few things I had forgotten... or, if I’m being honest, didn’t know. Hope you found this post useful.
Happy Veeaming, fellow community folks!