Fast Clone Savings: Call for community testing


Userlevel 2
Badge +1

Hi all,

 

Recently I started experimenting with XFS savings calculation. I’ve seen some good work from the community, and therefore I want to say thanks to everyone who played with xfs_bmap and filefrag, and especially to mweissen13 for some good ideas on how to find the overlapping intervals.

I’m sharing here what I believe is a working implementation:

https://github.com/tdewin/ttozz/tree/main/sortedfragments

So first of all, I’ve split the work into two parts:

  • Creating frag files: these files are basically filefrag output, but reordered by volume-level offset. This is required to find the overlapping intervals (shared blocks). Notice that filefrag orders the fragments by file-level offset. You can cat the files if you are curious. They are basically in YAML format, so you should be able to parse them easily with libraries or just by reading them line by line as a stream.
  • Analyzing the frag files: open the files and read them in an ordered fashion. If two files report similar intervals (the starting block of one file falls inside an interval of another file), the intervals are merged. Two stats are kept: Real and Total. Total is the sum of all fragments in all files and corresponds to the sum of all file sizes that you see when you use “ls” or du, for example. Real is the sum of all unique intervals with all overlapping fragments removed (see the sketch just below the savings formula).

The savings are then expressed as Total - Real = savings_gb, or as Total / Real = ratio, which is similar to x:1 compression (e.g. the typical 2:1 for our backups).
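
To make the merging step concrete, here is a minimal sketch of the analysis stage. It is not the repo’s actual code or file format; it just assumes the extents of all frag files have already been collected as (start_block, end_block) pairs and sorted by volume-level start offset, and the names are illustrative.

# Minimal sketch of the analysis stage, NOT the repo's actual code or file format.
# It assumes the extents of all frag files have already been collected as
# (start_block, end_block) pairs and sorted by volume-level start offset.
def total_and_real(sorted_extents):
    total = 0                      # sum of all fragments in all files (matches du / ls)
    real = 0                       # sum of unique intervals, overlaps merged away
    cur_start = cur_end = None     # the interval currently being merged
    for start, end in sorted_extents:
        total += end - start
        if cur_start is None or start > cur_end:
            # no overlap with the running interval: close it and start a new one
            if cur_start is not None:
                real += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # the start falls inside the running interval: merge, extending if needed
            cur_end = max(cur_end, end)
    if cur_start is not None:
        real += cur_end - cur_start
    return total, real

# With an assumed 4096-byte block size:
#   savings_gb = (total - real) * 4096 / 1024**3
#   ratio      = total / real        # e.g. 2.73 -> a "2.73:1" saving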

Now the reasoning for some decisions taken:

  • Filefrag is used because I was getting inconsistent results with xfs_bmap (ironically, it underreports). I don’t know whether this means that Linux is wrong or that xfs_bmap is wrong, but with filefrag I can get the total to match up with du; with xfs_bmap some extents seem to go missing. (A rough sketch of this frag-creation stage follows this list.)
  • The two-stage approach should allow you to buffer or cache the sorting step. I expect sorting the filefrag output of a 100 TB vbk to take some time, but imagine you only have to do it for new files (unless you defragment). Do keep in mind that you should also remove frag files if the original file no longer exists.
  • You can run the analysis separately from the frag file creation. E.g., I experimented with using inotify to run a shell script as soon as a file is locked. Ironically, with a Linux hardened repository this inotify event (ATTRIB change) is the perfect moment to do the analytics: your file is ready and we can be 100% sure it is no longer being modified, since it is, well, immutable. (A trigger sketch follows this list as well.)
  • You can analyze the frag files on a different system.
  • Python 3 was chosen, not because I’m the biggest fan of Python or because it is supposed to be super fast, but because it is interpreted, so you can read and modify the code as you like. On Ubuntu 22.04 it should be installed by default, so there is no huge dependency tree.
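
For reference, a minimal sketch of the frag-file creation stage, assuming the filefrag that ships with e2fsprogs. The real sortedfrag.py writes YAML frag files; this only illustrates the core idea of running filefrag -v and reordering the extents by physical (volume-level) offset, and the regex is just an approximation of the filefrag -v column layout.

#!/usr/bin/env python3
# Minimal sketch of the frag-file creation stage (not the repo's sortedfrag.py).
import re
import subprocess
import sys

# extent lines look roughly like:
#    0:        0..    1023:      98304..     99327:   1024:             shared
EXTENT_RE = re.compile(r"^\s*\d+:\s+\d+\.\.\s*\d+:\s+(\d+)\.\.\s*(\d+):\s+(\d+):")

def physical_extents(path):
    out = subprocess.run(["filefrag", "-v", path],
                         capture_output=True, text=True, check=True).stdout
    extents = []
    for line in out.splitlines():
        m = EXTENT_RE.match(line)
        if m:
            phys_start, phys_end, length = map(int, m.groups())
            extents.append((phys_start, phys_end, length))
    # sort on volume-level (physical) offset so overlapping intervals across
    # files can later be found with a single linear merge pass
    return sorted(extents)

if __name__ == "__main__":
    for start, end, length in physical_extents(sys.argv[1]):
        print(f"- start: {start}\n  end: {end}\n  length: {length}")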
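
And a rough sketch of the inotify-triggered workflow mentioned above, not part of the repo. It shells out to inotifywait (from the inotify-tools package, which you would have to install) and reuses the sortedfrag / sortedfrag-analyze invocations shown further down in this post; the paths are just examples.

#!/usr/bin/env python3
# Rough sketch: react to ATTRIB events on a hardened repo (e.g. the immutable
# flag being set) by regenerating the frag file and rerunning the analysis.
import os
import subprocess

SRCDIR = "/mnt/backup/Backup Job 1/"
TGTCACHE = "/tmp/fragcache/"

# watch the repo recursively for ATTRIB events
watch = subprocess.Popen(
    ["inotifywait", "-m", "-r", "-e", "attrib", "--format", "%w%f", SRCDIR],
    stdout=subprocess.PIPE, text=True)

for line in watch.stdout:
    path = line.strip()
    if not path.lower().endswith((".vbk", ".vib")):
        continue
    # regenerate the frag file for just this backup file
    fragfile = os.path.join(TGTCACHE, path.lstrip("/")) + ".frag"
    os.makedirs(os.path.dirname(fragfile), exist_ok=True)
    with open(fragfile, "w") as out:
        subprocess.run(["/usr/bin/sortedfrag", "-f", path], stdout=out, check=True)
    # then rerun the analysis over the whole job directory
    subprocess.run(["/usr/bin/sortedfrag-analyze", "-d",
                    os.path.join(TGTCACHE, SRCDIR.lstrip("/")), "-b", "4096"])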

Ok, so how do you use this code? Well, at this point I would only focus on a hardened repository since, well, your files should be locked, so they should be safe. If not, I’ve created the perfect hacking tool ☠. But regardless, use this code only on test systems. After all, it is experimental and MIT licensed 🙈

 

I just don’t have a huge home lab to make sure the results hold up at scale (and to see whether separation of duties makes sense). It is also not packaged very conveniently yet, at least until more people say “yeah, it works”.

To get started, clone the code

DLDIR=/tmp/sortedfragments
mkdir $DLDIR
git clone https://github.com/tdewin/ttozz $DLDIR
sudo cp $DLDIR/sortedfragments/sortedfrag.py /usr/bin/sortedfrag
sudo cp $DLDIR/sortedfragments/sortedfrag-analyze.py /usr/bin/sortedfrag-analyze
sudo chmod +x /usr/bin/sortedfrag /usr/bin/sortedfrag-analyze

#you might want to remove DLDIR
#rm -rf /tmp/sortedfragments

Ok, now you need some vbk and vib frag files:

SRCDIR="/mnt/backup/Backup Job 1/"
TGTCACHE="/tmp/fragcache/"

mkdir -p "$TGTCACHE/$SRCDIR"
find "$SRCDIR" -iname "*.v[bi][kb]" | while read f ; do /usr/bin/sortedfrag -f "$f" > "$TGTCACHE/$f.frag" ; done

If everything went smoothly, you should have some frag files in /tmp/fragcache/

find /tmp/fragcache

For me, this yielded:

/tmp/fragcache
/tmp/fragcache/mnt
/tmp/fragcache/mnt/backup
/tmp/fragcache/mnt/backup/Backup Job 1
/tmp/fragcache/mnt/backup/Backup Job 1/forbackup.51D2023-06-14T143013_0B41.vib.frag
/tmp/fragcache/mnt/backup/Backup Job 1/forbackup.51D2023-06-13T220008_1AF7.vib.frag
/tmp/fragcache/mnt/backup/Backup Job 1/forbackup.51D2023-06-14T160733_650C.vib.frag
/tmp/fragcache/mnt/backup/Backup Job 1/forbackup.51D2023-06-09T104205_FFBE.vbk.frag
/tmp/fragcache/mnt/backup/Backup Job 1/forbackup.51D2023-06-14T151123_9DB6.vib.frag
/tmp/fragcache/mnt/backup/Backup Job 1/forbackup.51D2023-05-15T144450_E57C.vbk.frag
/tmp/fragcache/mnt/backup/Backup Job 1/forbackup.51D2023-06-14T151837_C80B.vib.frag
/tmp/fragcache/mnt/backup/Backup Job 1/forbackup.51D2023-06-14T145551_460C.vib.frag
/tmp/fragcache/mnt/backup/Backup Job 1/forbackup.51D2023-06-14T144041_DB7A.vib.frag
/tmp/fragcache/mnt/backup/Backup Job 1/forbackup.51D2023-06-14T145207_CD36.vib.frag
/tmp/fragcache/mnt/backup/Backup Job 1/forbackup.51D2023-06-11T220021_DD47.vib.frag
/tmp/fragcache/mnt/backup/Backup Job 1/forbackup.51D2023-06-14T151458_B06B.vib.frag
/tmp/fragcache/mnt/backup/Backup Job 1/forbackup.51D2023-05-15T220007_5A06.vib.frag
/tmp/fragcache/mnt/backup/Backup Job 1/forbackup.51D2023-06-14T154406_7489.vib.frag
/tmp/fragcache/mnt/backup/Backup Job 1/forbackup.51D2023-06-14T153410_E7EB.vib.frag
/tmp/fragcache/mnt/backup/Backup Job 1/forbackup.51D2023-06-14T155530_0DF8.vib.frag
/tmp/fragcache/mnt/backup/Backup Job 1/forbackup.51D2023-06-14T142757_F916.vib.frag
/tmp/fragcache/mnt/backup/Backup Job 1/forbackup.51D2023-06-09T135947_4C96.vib.frag
/tmp/fragcache/mnt/backup/Backup Job 1/forbackup.51D2023-06-09T103701_F73B.vib.frag
/tmp/fragcache/mnt/backup/Backup Job 1/forbackup.51D2023-06-09T115813_C9E3.vib.frag
/tmp/fragcache/mnt/backup/Backup Job 1/forbackup.51D2023-06-14T153057_DDBC.vib.frag
/tmp/fragcache/mnt/backup/Backup Job 1/forbackup.51D2023-06-11T141117_4CCC.vib.frag
/tmp/fragcache/mnt/backup/Backup Job 1/forbackup.51D2023-06-14T142516_9E22.vbk.frag
/tmp/fragcache/mnt/backup/Backup Job 1/forbackup.51D2023-06-13T212047_D67A.vbk.frag
/tmp/fragcache/mnt/backup/Backup Job 1/forbackup.51D2023-06-11T102338_FAD9.vbk.frag
/tmp/fragcache/mnt/backup/Backup Job 1/forbackup.51D2023-06-14T144905_32AF.vib.frag
/tmp/fragcache/mnt/backup/Backup Job 1/forbackup.51D2023-06-14T154718_1CD7.vib.frag

Ok, let’s analyze them:

# Should be the same as previous run:
SRCDIR="/mnt/backup/Backup Job 1/"
TGTCACHE="/tmp/fragcache/"

/usr/bin/sortedfrag-analyze -d "$TGTCACHE/$SRCDIR" -b 4096

This outputs

---
basedir: /tmp/fragcache///mnt/backup/Backup Job 1/
frag_files_count: 27
frag_files_processed:
- forbackup.51D2023-06-09T115813_C9E3.vib.frag
- forbackup.51D2023-06-09T135947_4C96.vib.frag
- forbackup.51D2023-06-11T141117_4CCC.vib.frag
- forbackup.51D2023-06-11T220021_DD47.vib.frag
- forbackup.51D2023-05-15T144450_E57C.vbk.frag
- forbackup.51D2023-05-15T220007_5A06.vib.frag
- forbackup.51D2023-06-09T103701_F73B.vib.frag
- forbackup.51D2023-06-09T104205_FFBE.vbk.frag
- forbackup.51D2023-06-11T102338_FAD9.vbk.frag
- forbackup.51D2023-06-13T212047_D67A.vbk.frag
- forbackup.51D2023-06-13T220008_1AF7.vib.frag
- forbackup.51D2023-06-14T142516_9E22.vbk.frag
- forbackup.51D2023-06-14T142757_F916.vib.frag
- forbackup.51D2023-06-14T143013_0B41.vib.frag
- forbackup.51D2023-06-14T144041_DB7A.vib.frag
- forbackup.51D2023-06-14T144905_32AF.vib.frag
- forbackup.51D2023-06-14T145207_CD36.vib.frag
- forbackup.51D2023-06-14T145551_460C.vib.frag
- forbackup.51D2023-06-14T151123_9DB6.vib.frag
- forbackup.51D2023-06-14T151458_B06B.vib.frag
- forbackup.51D2023-06-14T151837_C80B.vib.frag
- forbackup.51D2023-06-14T153057_DDBC.vib.frag
- forbackup.51D2023-06-14T153410_E7EB.vib.frag
- forbackup.51D2023-06-14T154406_7489.vib.frag
- forbackup.51D2023-06-14T154718_1CD7.vib.frag
- forbackup.51D2023-06-14T155530_0DF8.vib.frag
- forbackup.51D2023-06-14T160733_650C.vib.frag
real_clusters: 1654455
real_kb: 6617820
real_gb: 6.25
total_clusters: 4531430
total_kb: 18125720
total_gb: 17.25
savings_rat: 2.7389261116198385
savings_gb: 10.9375

The output is in YAML-like syntax (it passes yamllint). So my savings are (the GB rounding might be a bit off, so you might want to compare the kb values):

18125720 - 6617820 = 11507900 kb, or 11507900 / 1024 / 1024 = 10.97 GB

The ratio being:

18125720 / 6617820 = 2.73

So you have a 2.73:1 savings ratio.

Ok, but how does it stack up? “du” should pretty much match total_kb:

$ du "/mnt/backup/Backup Job 1/"
18126000 /mnt/backup/Backup Job 1/

That is virtually the same, which is expected since there are still some metadata files like vbm and lock files in that directory.

“df” should compare with real if there is only a single chain on the disk (which is the case for me). However, “df” should report a bit more, since metadata also consumes space; e.g. an empty XFS volume still uses some bytes.

$ df /mnt/backup
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sdb1 104805380 7381852 97423528 8% /mnt/backup

So the used “7381852” is in line with “6617820”. For example, I have a completely empty volume called /mnt/backup02. You can see there is some usage, and that is without any fast clone going on at all.

$ find /mnt/backup02
/mnt/backup02
$ df /mnt/backup02
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sdc1 52402180 398400 52003780 1% /mnt/backup02

As a final note, one could imagine creating a w32filefrag for Windows ReFS to create frag files 😀


6 comments

Userlevel 7
Badge +20

Was reading this over on the Forums and will be watching how you get along with this stuff for sure.  It will be nice to have something that can tell you the ReFS space usage with Block Clone savings.

Userlevel 7
Badge +22

I think this one is only for XFS? Either way this could massively help with billing issues when dealing with customer backup data, unless of course you just charge them the reported amount without savings :)

Userlevel 4
Badge +1

I don’t want to steal the show from Timo, but if you look at the root of the repository…

https://github.com/tdewin/ttozz

“Additionally the output is https://github.com/tdewin/w32filefrag , so you should be able to use sortedfrag-analyze.(py|ps1) to analyze ReFS frags created with w32filefrag. sf-sample.ps1 is provided for such a setup.”

Userlevel 2
Badge +1

Yup, there is a pre-release uploaded there. So if you have a TEST ReFS system with a huge, long-running chain, I would really enjoy feedback on speed here. Does w32filefrag struggle with performance like blockstat does? (I would assume not, but it still needs to sort through the whole ordeal.)

Userlevel 7
Badge +17

As I will (probably) be implementing XFS Repos with new arrays I just received, I’m intrigued about this. Thanks for sharing!

Userlevel 7
Badge +17

Thanks for sharing, and great that you are researching ReFS, too. We spoke about the lacking performance of blockstat some months ago 👍🏼

I will play around with both (XFS and ReFS) for sure 😎
