Solved

Does the Windows Data Deduplication role on the source VM impact Veeam compression?


Userlevel 7
Badge +9

Title says it all

 

Got a customer who advises they are running Windows dedupe on the local OS, and I can see the backup copy job is getting really poor compression on the VM. Anyone else have customers doing this and seeing similar results?

 

Example

Source VM data size = 868 GB 
Source VM VBK file = 863 GB
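
For anyone wanting to check the same thing on their side, here's a quick PowerShell look at how much the source volume is already deduplicated (the D: drive letter is only a placeholder for the customer's data volume, and this assumes the Data Deduplication feature is installed):

```powershell
# Show the dedup savings already achieved on the source volume.
# "D:" is a placeholder; point it at the deduplicated data volume.
Get-DedupVolume -Volume "D:" |
    Select-Object Volume, Capacity, SavedSpace, SavingsRate

# Optimization status for the volume: last run time, optimized file counts, etc.
Get-DedupStatus -Volume "D:"
```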

 


Best answer by MicoolPaul 10 April 2024, 17:51


5 comments

Userlevel 7
Badge +20

You will see very poor data efficiency improvements because of the deduplication that has already taken place; compression will do something, but not much. The other headache with Windows Data Deduplication is that whenever its schedule runs you'll have a HUGE amount of changed blocks, so your backup times will be long. And every changed block needs to be backed up, so you'll typically see a large increase in storage consumption compared to the same source data stored normally.
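
If you want to see how often that churn will hit you, the deduplication job schedules on the source VM are visible with the cmdlet below (assuming the dedup PowerShell module is present on that server):

```powershell
# List all Data Deduplication job schedules (optimization, garbage
# collection, scrubbing) so you can see when the block churn will occur.
Get-DedupSchedule | Format-List *
```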

Userlevel 7
Badge +9

Thanks Michael... that confirms our fears. It also hinders our ability to accurately size repos; there's no way to predict the impact on the compression rate :(

Userlevel 7
Badge +20

The best suggestion I can give is to get them to set the data deduplication schedule to weekly, and align that with an active/synthetic full backup in Veeam. That way you'll only have a massive block-generation event once a week, and you can size it as if it were one active full per week, without any synthetic full + fast clone benefits. That'll get you near enough accurate. Yes, it will consume more space vs fast clone, but it behaves more like a traditional weekly full than daily change-rate chaos.
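
As a rough sketch of what that could look like on the source VM (the schedule name, day and time below are only examples; check Get-DedupSchedule first to see what's actually configured):

```powershell
# Stop the default background optimization from rewriting blocks all week.
# "BackgroundOptimization" is the usual default schedule name, but verify it.
Set-DedupSchedule -Name "BackgroundOptimization" -Enabled $false

# Optimize once a week instead, a few hours before the weekly
# active/synthetic full runs in Veeam (day and time are examples only).
New-DedupSchedule -Name "WeeklyOptimization" -Type Optimization `
    -Days Saturday -Start "20:00" -DurationHours 6
```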

Userlevel 7
Badge +9

Is the Backup Copy Job also affected by the Windows dedupe?? Can't see why, as the blocks are already processed by the initial Backup Job... or is it having to “re-hydrate” the blocks to then process them again? Brain frazzled by all this.

 

Userlevel 7
Badge +20

Hi Craig, the BCJ will have the same storage capacity penalty, because the blocks are being changed at the source VM. So the primary backup job sees, say, an 80% block change, and then the BCJ needs to copy those 80% changed blocks.

 

So, to confirm: the BCJ shouldn't have any extra penalty over the primary backup, though that still depends on the chain types being used.
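
For the repo-sizing question, a back-of-the-envelope calculation along the lines above might look like this. Every value here is an assumption to be replaced with the customer's real retention and change rate:

```powershell
# Rough repository sizing under the "treat the weekly dedup optimization
# like an active full" approach discussed above. All values are examples.
$sourceSizeGB    = 868    # VM size from the original post
$weeklyFulls     = 4      # weekly fulls kept by retention
$dailyChangeRate = 0.05   # assumed change rate between optimization runs
$incrementals    = 6      # incrementals kept per weekly chain

$fullsGB = $sourceSizeGB * $weeklyFulls
$incsGB  = $sourceSizeGB * $dailyChangeRate * $incrementals

# Little to no compression is assumed, since the source data is already
# deduplicated (as seen in the original post's 868 GB -> 863 GB example).
"Estimated repo consumption: {0:N0} GB" -f ($fullsGB + $incsGB)
```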
