Hello to all,
While configuring a test job with multiple VMs running the same OS, I can see there is no difference in the deduplication rate in the report whether I run a VM on its own or as part of the group.
One more question:
The VMs in the Backup Copy Job all start at the same time when the job starts.
Is this how it should be, or should they run one at a time?
Hi,
Is your backup repository configured to use ‘Per-machine backup files’? If so, deduplication is only performed within each individual server’s data; it won’t span multiple servers with identical data (such as OS binaries), because they must be stored as separate chains. If you’ve got a new VBR deployment it’s quite possible you’re seeing this behaviour, as per-machine backup files are the new default from v12.
The number of VMs copied simultaneously in a BCJ will depend on the number of simultaneous tasks your source & destination backup repositories are configured for, unless you’re also using WAN accelerators. So provided those are sized appropriately for your backup repository resources, I wouldn’t worry if it’s copying many at once. There’s no impact to production from a snapshot or disk IO perspective, purely your backup network & backup infrastructure IO.
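To picture the scoping difference (a toy model only, nothing to do with Veeam’s actual engine; the block names and VMs are made up), a few lines of Python show why identical OS blocks are stored once in a shared chain but once per chain when every machine gets its own files:

# Toy model: each VM's disk is a set of block hashes; the OS blocks are identical.
vm_a = {"os_blk1", "os_blk2", "os_blk3", "data_a1", "data_a2"}
vm_b = {"os_blk1", "os_blk2", "os_blk3", "data_b1", "data_b2"}

# One shared chain (single backup file for the whole job): common blocks stored once.
shared_chain = len(vm_a | vm_b)        # 7 unique blocks stored

# Separate per-machine chains: each chain keeps its own copy of the OS blocks.
per_machine = len(vm_a) + len(vm_b)    # 10 blocks stored

print(shared_chain, per_machine)       # 7 10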
Hello
It was not enabled; I’m using v11 for testing at the moment.
I just enabled per-machine to check what the differences are.
We don’t have many VMs to back up, 13 in total. So does grouping VMs with the same OS have any benefit for us, or not?
Hello Again,
Now with “per-machine” the deduplication was still the same.
The amounts of data read and transferred were also the same.
So in our situation, I guess it makes no difference whether we use backup jobs with multiple VMs or a single VM.
It also depends on the OS and file system of your repository. If you use XFS or ReFS you will see savings, and a dedupe appliance helps as well.
In the Veeam console, under Backups > Disk, right-click the job and open Properties.
What does it say for Deduplication and Compression?
The VBK is larger and will usually have more dedupe.
Per-VM jobs deduplicate blocks from each VM individually; per-job does it over the whole job. Per-VM is usually the better option due to the performance gain from parallel processing. It also makes the VMs more portable, and restores become faster, especially when you push your archives elsewhere (I can restore a single VM from tape rather than an entire job).
Depending on the size of your VM, the OS, the type of data, etc., dedupe and compression can vary. For example, I’m looking at a file server right now that gets 2.5x dedupe and 1.0x compression on the full backup. This is because the disks still have lots of empty space and are provisioned as thick in VMware. When I look at the incrementals on this VM, they’re often around 2.8x and 2.3x, decent results as well. Depending on what changes, sometimes I’ll see 5x, other times next to nothing.
I’m looking at one of my large video servers right now, and at 51 TB it is only getting 1.2x dedupe and 1.0x compression. The sad thing is that the dedupe is only due to the empty space on the disks.
Dedupe appliances can help with this, especially when you run a per-VM job.
Grouping has benefits with either type of job, whether for speed and parallelism or for dedupe. Go per-VM and don’t worry about it. One job for 13 VMs is nothing; I have many jobs with 50+.
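If you want to sanity-check the numbers shown in those Properties, a quick back-of-the-envelope Python sketch turns the reported ratios into an estimated on-disk size. My assumption here is simply that the two reduction factors multiply, and the 10 TB figure is invented for illustration:

def estimated_on_disk_tb(data_read_tb, dedupe_ratio, compression_ratio):
    # Assumption: dedupe and compression reduction factors multiply.
    return data_read_tb / (dedupe_ratio * compression_ratio)

# File server example above: 2.5x dedupe, 1.0x compression (10 TB read is made up).
print(estimated_on_disk_tb(10, 2.5, 1.0))   # ~4.0 TB written
# Large video server above: 51 TB read at 1.2x dedupe, 1.0x compression.
print(estimated_on_disk_tb(51, 1.2, 1.0))   # ~42.5 TB written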
Hello Scot,
Currently I have per-VM enabled.
The jobs are configured, and today is the first day of running them after the reconfiguration.
I grouped the backup jobs per OS.
SPCL100 (7 VMs), SPVM101 (4 VMs) and SPVM202 (4 VMs) are different Hyper-V hosts.
I have 3 more single-VM jobs because those VMs are larger in size.
I think this approach is okay.
The job start times are set like 21:00, 21:01, 21:02 for the largest ones and then the others,
so that it starts with the largest and most critical ones first.
It’s synthetic fulls to a Windows Server repository connected directly via iSCSI to a Synology NAS (ReFS, 64 KB).
Having the start times this close should be OK if your proxy servers can handle the load and jobs don’t end up waiting for a task slot. Otherwise, I would suggest spacing the times out much more.
Hello Chris,
I’m using only the dedicated on-host proxies.
For now I haven’t implemented any off-host proxy.
That could be an option for our largest VM, which is our file server.
For the backup jobs we are using only one repo, and it’s configured for a maximum of 4 concurrent tasks. So I guess it will start only the first 4 by their scheduled times, and only when one finishes will it continue with the next one waiting in line?
Yes, that is the case with the repo too, so that may be the bottleneck, or possibly the proxy. You’ll just need to keep an eye on things and tune as necessary.
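As a rough illustration of that queueing behaviour (a toy model only, not the real Veeam scheduler; the VM names and durations are made up), here’s a short Python sketch of a 4-slot repository where the next VM task starts as soon as a slot frees up:

import heapq

def schedule(vms, slots=4):
    """Greedy slot-limited queue: returns (vm, start, finish) in minutes."""
    running, now, timeline = [], 0, []   # running = min-heap of finish times
    for name, duration in vms:
        if len(running) == slots:        # all slots busy: wait for the earliest finish
            now = max(now, heapq.heappop(running))
        heapq.heappush(running, now + duration)
        timeline.append((name, now, now + duration))
    return timeline

vms = [("FileServer", 120), ("SQL01", 60), ("App01", 45), ("App02", 45),
       ("Web01", 30), ("Web02", 30), ("DC01", 20)]
for name, start, finish in schedule(vms):
    print(f"{name}: starts at t={start} min, finishes at t={finish} min")

The first 4 VMs start immediately and the rest only pick up a slot as earlier tasks complete, which matches what you described.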
Yes, I will see how they perform, and then I can fine-tune things with the statistics I get after the first chain:
first run (active full) + incrementals, and then synthetic fulls (fast clone).
So is it also recommended to add a proxy that will offload the processing from the Windows Server acting as the repo?
All of our backup infrastructure for the backup jobs is located in the same site,
so I don’t know whether this small environment with 10+ VMs needs any off-host proxies.
Generally, the on-host proxies on the Hyper-V hosts should be enough for the VMs.
No, the proxy does the processing and the repo stores the data, but the repo can only process 4 tasks at a time. They work together but are not specifically related.
So with more proxies, more processing can be done (more backup jobs),
but it also depends on the repo I/O and how much it can handle at one time?
Thanks for the help.
Exactly! That is the idea.
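One way to picture the combined limit (the numbers here are just examples, and in practice repo I/O, network and other limits also play a part): whichever side has the fewest task slots caps how many tasks actually run at the same time.

def effective_concurrency(proxy_task_slots, repo_task_slots):
    # The smaller of the two totals caps how many tasks run at once.
    return min(sum(proxy_task_slots), repo_task_slots)

print(effective_concurrency(proxy_task_slots=[4, 4], repo_task_slots=4))  # 4
print(effective_concurrency(proxy_task_slots=[2], repo_task_slots=4))     # 2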
It’s clear to me now, thank you.
I’ll wait for the backups to run, then check the statistics and see what I can do better.