
Hi all, 


I hope you’re well.

I require assistance in sizing the backup repository storage capacity for a VM workload running on vSphere with approximately 14 TB of source data. The backup repository will be hosted on a deduplication appliance, either Dell EMC Data Domain or ExaGrid.

The backup policy includes:
7 daily incrementals
5 weekly full backups 
6 monthly full backups 
With offloading to Azure Blob Storage (Cool tier) and subsequently the Azure Archive tier.


Could you help calculate the actual storage capacity required to support this retention policy, considering deduplication and compression ratio best practices?

You can calculate this yourself on the Veeam calculator site - https://www.veeam.com/calculators

 


And also considering immutability (7 or 14 days)?



Yes, you can add that on this site. Check it out.


@Chris.Childerhose Thank you for your help!

 

I’m trying to better understand how the Veeam sizing calculator determines the required storage capacity. In my use case, I’m dealing with a small site hosting 6 VMs, with a total of 14 TB currently used on the datastore. However, when I run the calculator, it outputs a requirement of 148 TB.

 

If I plan to integrate my Veeam server with deduplication, immutability, and compression, does that mean I need a model like the ExaGrid EX54 or Dell Data Domain DD6400–DD6900 (64–96 TB usable)? Just trying to reconcile the actual data size with the projected sizing.


That would depend on what you need based on the calculations, etc. I cannot say which model you need.


Hi Chris,

 

Thanks again for your input.

 

Just to clarify my use case and question a bit more:

 

I’m sizing storage for a small site using the Veeam Official Calculator with the following parameters:

 

  • Source VM data: 14 TB
  • Daily change rate: 5%
  • Retention: 7 daily + 5 weekly + 6 monthly (GFS)
  • Veeam compression: 50%

 

 

The calculator outputs a requirement of 148.4 TB (no compression), or 80.8 TB with 50% Veeam-native compression.

 

Now here’s where I get confused:

Even assuming 50% Veeam compression and average 10:1 deduplication at the storage appliance level (ExaGrid or DD Boost), the effective stored data could potentially be ~7–8 TB.
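To spell out the arithmetic behind that estimate (the 10:1 ratio is just an assumed appliance average on my part, not a measured value):

```python
# Rough back-of-envelope for what would physically land on the appliance,
# assuming the calculator's ~148 TB pre-reduction figure, 50% Veeam compression
# and an assumed average 10:1 appliance dedupe ratio.
pre_reduction_tb = 148.4                      # calculator output, no compression
after_compression = pre_reduction_tb * 0.5    # 50% Veeam compression -> ~74 TB
after_dedupe = after_compression / 10         # assumed 10:1 dedupe -> ~7.4 TB
print(f"Estimated physical data stored: ~{after_dedupe:.1f} TB")
```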

 

So why is the calculator recommending such a high number (80–148 TB)?

And is it still justified to use large appliances like the ExaGrid EX54 or DD6410 (64–96 TB usable)?

 

I’m trying to reconcile:

 

  1. What I think I’ll store (post-compression/dedupe)
  2. What I need to size for upfront to ensure safe ingest, GFS, workspace, etc.

 

 

Could you help clarify what’s typically used as the sizing baseline — is it pre-dedup/post-compression? Or something else?

 

Thanks in advance!


It is the pre-dedup figure that is typically used as the baseline. It also factors in working space too.
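As a very rough illustration of why that baseline balloons so far beyond the 14 TB source (this is not the calculator's exact formula, just the retention-counting logic, and it ignores growth and working space):

```python
# Rough logical capacity for 14 TB source with 7 daily + 5 weekly + 6 monthly
# retention and a 5% daily change rate. Without fast clone, every retained
# full backup counts at roughly full source size.
source_tb = 14
daily_change = 0.05

active_full = source_tb                            # 14 TB
daily_incrementals = 7 * source_tb * daily_change  # ~4.9 TB
weekly_fulls = 5 * source_tb                       # 70 TB
monthly_fulls = 6 * source_tb                      # 84 TB

pre_reduction = active_full + daily_incrementals + weekly_fulls + monthly_fulls
post_compression = pre_reduction * 0.5             # with 50% Veeam compression

print(f"~{pre_reduction:.0f} TB before any reduction")      # ~173 TB
print(f"~{post_compression:.0f} TB with 50% compression")    # ~86 TB
```

The calculator's own model differs in how it overlaps those points and adds working space, which is how it lands at 148.4 / 80.8 TB, but the order of magnitude comes from counting every retained full at close to full source size; appliance-side dedupe is not assumed in that baseline.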


Check out this section of the community for more help too - https://community.veeam.com/groups/calculator-commons-151

 


Perhaps you are not using the calculator correctly? I ran the numbers from the above in the calculator, and came up with 31.2 TB required. Please do read the calculator notes carefully. Remember that these numbers, and the calculator in general, are “conservative”, meaning the required backup capacity is overestimated.

14 TB source data (I call it the “data footprint”)

GFS of 7D/5W/6M

7 days immutability

1 year forecast period

5% daily change rate

5% growth rate

50% data reduction

Block Generation period of 10 days (the correct value for XFS/ReFS repos)
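For what it's worth, here is how I think of the immutability input (an approximation on my part, not the calculator's internal model): restore points inside the immutable window cannot be merged or deleted, so roughly one extra incremental per immutability day stays on disk.

```python
# Rough extra capacity held back by a 7-day immutability window (approximation,
# not the calculator's internal formula).
source_tb = 14
daily_change = 0.05
data_reduction = 0.5          # 50% data reduction
immutability_days = 7

incremental_tb = source_tb * daily_change * data_reduction   # ~0.35 TB per point
extra_capacity = immutability_days * incremental_tb
print(f"~{extra_capacity:.1f} TB extra held back by immutability")  # ~2.5 TB
```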


@abdelali.elkouri, as ​@randyweis said about XFS (fast clone).

Make sure that you are considering this:

 



Ah yes, forgot about this little checkbox and how that plays into the calculations. Good call out, @wolff.mateus


I need to play with that calculator more myself. I tend to only do so when studying for exams, or a little bit during greenfield install in my env 😬


Thank you all for the replies.

Even if the local repository storage will be a deduplication appliance like Data Domain or ExaGrid, should XFS / ReFS (fast clone) still be enabled?


AFAIK, the Block or Fast Cloning capability has no negative effect on deduplication results when using a purpose-built appliance. It would result in less data to store, because there will no longer be full backups on disk in an XFS repository. Deduplication or space-savings statistics will drop on the Data Domain, but net space utilization should be lower using block cloning than if NTFS is being used.



For the avoidance of doubt, Data Domain allows for Virtual Synthetic Fulls, which are effectively the same as fast clone, but using the Data Domain deduplication engine. Give the description a read and you will find it’s quite similar to the description of XFS/ReFS fast clone; it just works with the Data Domain DD Boost library to minimize the data transferred to the DD.

So the DD will do its thing as best it can while also reducing data over the wire between the gateway and the DD itself. 

ExaGrid and Quantum work a bit differently but achieve the same results.

 

As for actually calculating the space savings, unfortunately only real-world testing will tell, as it’s heavily dependent on the workload. Do _not_ use Veeam backup encryption on your jobs; instead, enable the storage’s native at-rest encryption, as Veeam’s encryption will not be deduplicable (the dedupe engine will not be able to decrypt the data blocks, and thus each block will be “unique” to the dedupe engine).

My experience is that you can usually trust the projected space savings from the storage vendors for their devices, but depending on your workload some of the data may not be dedup friendly. 

Review our KB here on deduplication appliances + Veeam; it’s quite comprehensive and includes whitepapers made by the storage vendors + Veeam: https://www.veeam.com/kb1745

And remember to size the storage for growth; you have 14 TiB now, but that can grow over time. 
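A quick way to project that (a sketch assuming simple compound growth; pick the rate and horizon that match your environment):

```python
# Project the current source footprint forward with assumed compound growth
# (5% per year over 3 years here; adjust both values to your environment).
current_tib = 14
annual_growth = 0.05
years = 3

projected = current_tib * (1 + annual_growth) ** years
print(f"~{projected:.1f} TiB source footprint after {years} years")  # ~16.2 TiB
```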


Appreciate the additional info/explanation David! 👍🏻

