Inherency:
Since we bill our clients using CloudConnect by actual disk usage, the existing reports are no use for reporting and billing (they report pre-deduplication/reflink data). It took a while to figure out, but there is a way to calculate the actual disk space used per directory on an immutable repository.
Solution:
I have adapted a script that gives the disk usage of each folder (i.e. each client) on an immutable repo. This isn't "Data Used"; that is what Veeam reports. This is "Disk Used": the actual size on disk after reflinks (duplicated data is only counted once).
A note about this script: it appears that the original blog entry is wrong about the size of a block. It uses 4096, which is true on disk, but the utility used (xfs_bmap) explicitly reports its figures in units of 512 bytes:
https://linux.die.net/man/8/xfs_bmap
"units of 512-byte blocks"
To use this script, we run it from a cron task and pipe the output to a mail client on the repo itself (e.g. script.bash 2>&1 | mail -s "Immutable storage report for $HOSTNAME" someemail@email.place)
#!/bin/bash
# Report actual disk usage (after reflink deduplication) for each client directory.
for clientDir in $(find /backups/disk-01/backups/ -mindepth 1 -maxdepth 1 -type d)
do
echo "$clientDir"
# List every extent with its length, keep each unique physical extent once, then sum the lengths.
clientSpaceUsed=$(find "$clientDir"/*/* -xdev -type f -exec xfs_bmap -l {} + | awk '{ print $3 " " $4 }' | sort -k 1 | uniq | awk '{ print $2 }' | grep -Eo '[0-9]+' | paste -sd+ | bc | awk '{print $1*512/1024/1024/1024}')
#block sizes of 512 bytes. Divided by 1024 for KB. Divided by 1024 for MB. Divided by 1024 for GB.
echo "$clientSpaceUsed GB"
done
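As a quick sanity check on the unit conversion, here is a minimal sketch that converts a count of 512-byte units to GB (1024-based, as in the script). The function name blocks_to_gb and the sample count are made up for illustration; they are not part of the script above.

```shell
#!/bin/bash
# Hypothetical helper: convert a count of 512-byte units to GB (1024-based).
blocks_to_gb() {
    LC_ALL=C awk -v blocks="$1" 'BEGIN { printf "%.2f\n", blocks * 512 / 1024 / 1024 / 1024 }'
}

# 2097152 units * 512 bytes = 1073741824 bytes, which is exactly 1 GB.
blocks_to_gb 2097152    # prints 1.00
```

Treating the same count as 4096-byte blocks would multiply the result by 8, which is why the block size matters for billing.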
To break down how this works:
For each client directory in “/backups/disk-01/backups/”
output the directory being reported on
run xfs_bmap -l on every file (this lists each extent of the file along with its length in blocks)
Take columns 3 and 4 (now becomes column 1 and 2, the rest are discarded)
sort by column 1
remove duplicate rows of data (reflinks for fast cloning; keeps a single copy of the data for counting purposes)
Select only column 2 (now becomes column 1)
remove anything other than numbers
Add those numbers together
multiply by block size (512)
divide by 1024 (now KB)
divide by 1024 (now MB)
divide by 1024 (now GB)
output text
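To see the de-duplication step in isolation, here is a small sketch that runs the same idea over a hard-coded sample instead of a real repository. The sample text only imitates the shape of xfs_bmap -l output; the paths, block ranges, and the function name unique_blocks are invented for illustration. It also sums with awk rather than paste and bc, which is a substitution made to keep the example dependency-free, not what the script above does.

```shell
#!/bin/bash
# Made-up sample shaped like `xfs_bmap -l` output: two files that share
# one reflinked extent (physical blocks 96..103).
sample_bmap='/backups/disk-01/backups/clientA/job1/full.vbk:
	0: [0..7]: 96..103 8 blocks
	1: [8..23]: 200..215 16 blocks
/backups/disk-01/backups/clientA/job1/incr.vib:
	0: [0..7]: 96..103 8 blocks
	1: [8..15]: 300..307 8 blocks'

# Sum the lengths of unique physical extents, in 512-byte units.
unique_blocks() {
    # Keep "physical-range length", drop duplicate extents, add up the lengths.
    awk '{ print $3 " " $4 }' | sort -u |
        awk '$2 ~ /^[0-9]+$/ { sum += $2 } END { print sum + 0 }'
}

blocks=$(printf '%s\n' "$sample_bmap" | unique_blocks)
echo "unique 512-byte blocks: $blocks"
```

The shared extent is counted once, so the sample yields 32 units rather than the 40 you would get by adding every line.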