Solved

Why is my backup repository folder size much greater than disk capacity used?

  • 31 October 2023
  • 7 comments
  • 491 views

Userlevel 2
Badge
  • Not a newbie anymore
  • 2 comments

Hi, I am new to Veeam Backup. On a Dell PowerEdge R7515 running Windows Server 2022 Standard, I have a 67 terabyte ReFS volume where I store Veeam backups under a folder called “vault” (see screenshot below). In Windows Explorer, when I check the size of the “vault” folder (by right-clicking the folder name and opening Properties), it says the “Size on disk” is more than 67.6 terabytes (see below). The folder contains one .vbm file and a few .vbk and .vib files. However, when I check the properties of the physical volume in the Disk Management utility (diskmgmt.msc), it says I still have about 31.9 terabytes free, meaning that ONLY about 35 terabytes is actually used, with about 48% free (see below). I also do not have the Data Deduplication role installed on this server. Could someone shed some light on this extremely large discrepancy between the disk volume usage and the folder size? Thank you!

 

Content of the “vault” folder: [screenshot]

Disk Management says: [screenshot]


Best answer by Chris.Childerhose 31 October 2023, 14:07


7 comments

Userlevel 7
Badge +20

It is due to the ReFS block cloning feature. That is what saves space when using this type of repository with Veeam.

Userlevel 7
Badge +20

See here - Block cloning on ReFS | Microsoft Learn

Userlevel 7
Badge +7

It seems like the disk has unpartitioned space. Can you show the disk status? It is at the bottom of Disk Management. If there is unallocated space, you can extend the volume into it.

Userlevel 7
Badge +6

ReFS enables Veeam’s Spaceless Full Backup technology: a storage-efficiency feature that lets multiple full backup files residing on the same ReFS volume share the same physical data blocks, reducing backup storage consumption to a degree that rivals deduplication technologies.

 

For more details: availability-suite-advanced-refs-integration_wpp.pdf (veeam.com)
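
If you want to see the two numbers side by side yourself, you can compare the sum of the file sizes inside the folder against what the volume itself reports as used; the gap between them is the block-cloning savings. A rough sketch of that comparison (the D:\vault path is only an assumption, point it at wherever the repository folder actually lives):

    import os
    import shutil

    VAULT = r"D:\vault"  # assumed repository folder path

    # Sum the logical length of every file under the folder.
    # This is essentially what Explorer's folder Properties adds up.
    logical_bytes = 0
    for root, _dirs, files in os.walk(VAULT):
        for name in files:
            logical_bytes += os.path.getsize(os.path.join(root, name))

    # Ask the volume how much space is actually allocated.
    # This matches what Disk Management shows for the volume.
    usage = shutil.disk_usage(VAULT)

    TB = 1024 ** 4  # binary terabytes, the way Windows labels capacities
    print(f"Sum of file sizes: {logical_bytes / TB:.1f} TB")
    print(f"Volume used:       {usage.used / TB:.1f} TB of {usage.total / TB:.1f} TB")
    print(f"Volume free:       {usage.free / TB:.1f} TB")

On a block-cloned repository the first number can legitimately be far larger than the second, which is exactly what your screenshots show.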

Userlevel 7
Badge +6

As others noted, you’re using the ReFS file system, which supports block cloning. In very simple terms: if you have a full backup and then create another full backup, you end up with two backup files that are very similar. Block cloning looks at the blocks the two files have in common and, rather than storing two copies of each shared block, keeps a single physical copy that both files reference. As a result, the reported folder size is far greater than the disk space actually consumed. On the NTFS file system the reported size would match actual usage, because each file would carry its own copies of those blocks. Note that if you ever move these files off this volume to an NTFS volume, or copy them with a tool that is not block-clone aware, the data will swell as the duplicate blocks are recreated on the destination. For the same reason, if you ever move data from one ReFS volume to another ReFS volume or to an XFS volume (the same idea as ReFS, but on Linux repositories), it is recommended to use the VeeaMover feature within Veeam, which is block-clone aware and will move the data without re-expanding it to its full logical size.
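
To make the arithmetic concrete, here is a tiny toy model of the accounting (an illustration only; this is not how Veeam or ReFS lay anything out on disk, and the block count and 64 KiB cluster size are just assumptions):

    BLOCK_SIZE = 64 * 1024  # 64 KiB clusters assumed here

    # Represent each "full backup" as the list of block IDs it references.
    full_backup_1 = [f"block-{i}" for i in range(100_000)]
    # The second full shares 99% of its blocks and adds 1% new ones.
    full_backup_2 = full_backup_1[:99_000] + [f"new-{i}" for i in range(1_000)]

    def logical_size(files):
        # What folder Properties adds up: every file's full length.
        return sum(len(f) for f in files) * BLOCK_SIZE

    def physical_size(files):
        # What the volume actually stores: each unique block only once.
        unique = set()
        for f in files:
            unique.update(f)
        return len(unique) * BLOCK_SIZE

    files = [full_backup_1, full_backup_2]
    print(f"Folder Properties would report: {logical_size(files) / 2**30:.1f} GiB")
    print(f"Volume actually consumes:       {physical_size(files) / 2**30:.1f} GiB")

The first number is also roughly what the data would grow back to if you copied these files to NTFS or moved them with a tool that is not block-clone aware.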

Userlevel 4
Badge +1

As per above, the storage savings can be quite significant. It isn’t just regular backup jobs that benefit from ReFS/XFS; it also works extremely well with Backup Copy jobs, since wherever duplicate data ends up on the volume it can be mapped to blocks that are already there.

But definitely keep in mind that when moving between file systems the data will rehydrate and inflate, and if you don’t have the space available for that, it could be disastrous.

I use both ReFS and XFS in prod now. For one customer we saw >1.8 PB of backups on 230 TB of raw ReFS storage. So it is actually really quite cool.
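
To put that in perspective, that works out to roughly an 8:1 ratio of backup data to raw capacity (quick back-of-the-envelope arithmetic, decimal units):

    logical_pb = 1.8   # backup data as reported by Veeam
    raw_tb = 230       # raw ReFS capacity it fits on
    print(f"~{logical_pb * 1000 / raw_tb:.1f} : 1")  # about 7.8 : 1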

Userlevel 7
Badge +6


Great explanation, @dloseke 
