Experience with XFS


Userlevel 7
Badge +13

Because of the new v11 feature of immutable backups on the XFS filesystem, I am looking for real-world experiences with XFS as a repository.

I just read through this forum entry:

https://forums.veeam.com/veeam-backup-replication-f2/v10-xfs-all-there-is-to-know-t65222-30.html

and the closing experience of user ferrus:

https://forums.veeam.com/veeam-backup-replication-f2/v10-xfs-all-there-is-to-know-t65222-30.html#p388579

 

Sounds very good to me so far. I remember the first implementations of ReFS block cloning; let's say it did not work that well at the beginning.

Would you share some of your experiences with XFS as Veeam Repository? I would be interested in:

  • stability 
  • performance (over time)
  • needs for troubleshooting
  • administrative effort

Thanks!
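For context, the feature in question relies on a reflink-enabled XFS filesystem. A sketch of how such a volume is typically created, following commonly cited guidance (a sparse file stands in for a real device such as /dev/sdb1, so the demo touches nothing real; check the current Veeam documentation for the authoritative options):

```shell
# Sketch: create a reflink-enabled XFS filesystem as used for fast-clone
# repositories. The sparse file is a stand-in for a real block device.
if command -v mkfs.xfs >/dev/null 2>&1; then
    truncate -s 512M /tmp/xfs-demo.img
    # reflink=1 enables block cloning; crc=1 is required for reflink
    mkfs.xfs -f -q -b size=4096 -m reflink=1,crc=1 /tmp/xfs-demo.img
    echo "filesystem created"
else
    echo "mkfs.xfs not installed (xfsprogs package)"
fi
```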


38 comments

Userlevel 7
Badge +17

Would like to warm up an old topic here again 😁

Some time ago I had to delete some huge backup files on different ReFS volumes. During the deletion the Windows server hung completely, and the delete itself took quite some time. Today I did something similar on a RHEL server with an XFS volume. To keep it short: no server hang, and the duration felt like deleting files without block-clone pointers.

Great work, XFS!

 

OK, I have had no problems with this up to now. It takes some time to delete a big file with many pointers, but I did not have a server hang.

A comparison between ReFS and XFS would be interesting… Maybe this summer 😎

Userlevel 7
Badge +20

Would like to warm up an old topic here again 😁

Some time ago I had to delete some huge backup files on different ReFS volumes. During the deletion the Windows server hung completely, and the delete itself took quite some time. Today I did something similar on a RHEL server with an XFS volume. To keep it short: no server hang, and the duration felt like deleting files without block-clone pointers.

Great work, XFS!

 

Yeah deleting on XFS even with block cloning seems way faster than wonderful Windows. 😋😂

I know Microsoft was bragging a lot about WS 2022 being up to 50x faster for ReFS operations such as this, so it'd be good to see how that translates to the real world. I feel a heavy emphasis on “up to” is necessary!

Userlevel 7
Badge +8

Good to know @vNote42, thanks! How fast was the storage space reclamation? Was it instant when doing a “df -h”?

Yes, “df -h” showed the reclaimed space immediately afterwards.

Amazing! It was such a pain with our dedup appliance, waiting for recycle jobs to make the space available.
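The behavior described above is easy to demonstrate with a small sketch (paths are examples; /tmp stands in for a repository mount point):

```shell
# Demo: freed space is visible in df immediately after the delete.
mkdir -p /tmp/repo-demo
dd if=/dev/zero of=/tmp/repo-demo/big.vbk bs=1M count=64 status=none
df -h /tmp/repo-demo    # usage with the file in place
rm /tmp/repo-demo/big.vbk
df -h /tmp/repo-demo    # on XFS the freed space shows up right away
```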

Userlevel 7
Badge +13

Good to know @vNote42, thanks! How fast was the storage space reclamation? Was it instant when doing a “df -h”?

Yes, “df -h” showed the reclaimed space immediately afterwards.

Userlevel 7
Badge +22

I’ve done both XFS and EXT4 with a Hardened Linux Repo; both worked fine. If you’re using a physical repo server with JBOD storage, I would probably lean toward XFS so you can get the block clone features, especially since you have to use forward incremental with periodic fulls with a HLR. I was using SAN storage with really good built-in native dedup, so in that case I decided to stick with EXT4 since I didn’t really need the block clone benefits.

@jdw , thanks for your feedback!

Keep in mind that block clone also saves time (because it dramatically reduces the required I/Os) when merging an increment into the full backup file.

Do you have any good test results showing a real-world comparison? This test is on my to-do list for the blog, but I have not had the time yet; it would be great to see other comparisons if they exist. My theory is that it will be far more important for low-end storage, but I’m not sure it will have as much of an advantage on the QLC flash systems I primarily work with.

Block cloning when merging increments into fulls also matters for large installations. It depends on the environment, of course, but I know of backups merging in minutes with block cloning compared to hours without.

Yes, this is for certain. One fun way to find out is to accidentally add an NTFS repo to your ReFS SOBR :). When a new backup lands on the NTFS extent because it has more free space, you can watch synthetic operations take at least 10 times longer :). This happened once by accident, so I witnessed it personally.
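The speed difference comes from the fact that a block-cloned ("fast clone") operation only writes metadata pointers instead of copying data. The same mechanism can be seen with a plain reflink copy on an XFS volume; the sketch below uses --reflink=auto so it falls back to a normal copy on filesystems without reflink support (paths are examples):

```shell
# On reflink-enabled XFS this copy returns almost instantly regardless of
# file size, because no data blocks are duplicated; --reflink=auto falls
# back to a regular copy elsewhere so the demo runs anywhere.
dd if=/dev/urandom of=/tmp/demo.vbk bs=1M count=8 status=none
cp --reflink=auto /tmp/demo.vbk /tmp/demo-clone.vbk
cmp /tmp/demo.vbk /tmp/demo-clone.vbk && echo "clone matches original"
```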

Userlevel 7
Badge +20

Most definitely you can mix them to migrate, I would assume; we just had issues using them together. Adding XFS and then sealing the ReFS extents would be the best solution, as noted.

Userlevel 7
Badge +13

Hi @StefanZi - the issue we saw was when a job started on XFS and then for some reason decided to write over to ReFS; it seemed to cause major issues on the VCC server and repos. I cannot explain it fully, but once we separated XFS from ReFS, everything worked great.

I know the documentation says you can mix them, but the people I spoke with at Veeam recommended against it. So that is the approach we have been taking lately.

XFS has been great, and I am looking forward to immutable storage on it as well as the Linux proxies. :sunglasses:

Chris, thanks for your details!

So I think it should be safe to migrate ReFS to XFS in the way @StefanZi outlined.

Userlevel 6
Badge +3

I have installed a SOBR with three XFS extents in our largest VCC datacenter. It works extremely well and we have not had issues. We did mix it with ReFS, but that was a bad idea, so we created a new SOBR for XFS alone, with no mixing.

We also try to keep SOBRs at 2-3 extents maximum and then create new ones as needed.

Thanks Chris!

What were the problems when mixing XFS with ReFS in a SOBR?

One thing to note: Veeam recommends a 64K block size for ReFS and 4K for XFS, so I believe that alone would be a problem when trying to mix them in one SOBR.

Continuing this - I’d really also like to know which issues you saw when mixing XFS and ReFS in one SOBR?! @Chris.Childerhose 
I can’t think of any, tbh.
@vNote42 The block cloning block size is completely irrelevant in SOBRs - it’s just how the file is stored by the OS. You can also mix a dedup appliance and a ReFS repo, and that works fine.
When moving files from one extent to another you’ll lose the block cloning anyway.

So @HenkRuis: As said, I don’t see a reason not to mix XFS and ReFS repos, so I’d add XFS extents to your SOBR and seal the ReFS extents, so that the backups age out and you do not lose your ReFS block cloning space savings by evacuating the extent.
You’ll have to create a new active full to start with block cloning on XFS, but that’s it.
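If you go this route, it may be worth sanity-checking that the new extent actually has reflink enabled before running the active full. A quick check (the mount point is an example path):

```shell
# Check whether an XFS mount was created with reflink=1, which Veeam's
# fast clone requires. /mnt/xfs-extent is an example path.
if command -v xfs_info >/dev/null 2>&1; then
    xfs_info /mnt/xfs-extent 2>/dev/null | grep -q 'reflink=1' \
        && echo "fast clone ready" \
        || echo "reflink not enabled (or not an XFS mount)"
else
    echo "xfs_info not available (xfsprogs package)"
fi
```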

Userlevel 2

Just for my interpretation: if you want XFS and have a ReFS environment, you have to go greenfield for that, right? How do existing customers with SOBRs handle this (or what is Veeam’s point of view on this), especially in a Windows-only environment?

Userlevel 7
Badge +13

I have installed a SOBR with three XFS extents in our largest VCC datacenter. It works extremely well and we have not had issues. We did mix it with ReFS, but that was a bad idea, so we created a new SOBR for XFS alone, with no mixing.

We also try to keep SOBRs at 2-3 extents maximum and then create new ones as needed.

Thanks Chris!

What were the problems when mixing XFS with ReFS in a SOBR?

One thing to note: Veeam recommends a 64K block size for ReFS and 4K for XFS, so I believe that alone would be a problem when trying to mix them in one SOBR.

Also keep in mind that Microsoft only supports Trim/UNMAP for ReFS on Storage Spaces. ReFS shouldn’t be used with SAN storage or you could get some really weird space accounting. Anton also calls this out in the R&D Forum. 

 

From my perspective, Trim/UNMAP is not necessary for backup repositories; at least it does not hurt if this feature is missing.

But I am not sure the advice not to use ReFS on SAN volumes is still true. I remember Microsoft added this limitation to their website when we faced massive performance problems within the first year after Windows 2016 and Veeam block cloning became available.

Userlevel 7
Badge +13

I have two XFS based repos each with a ~445TB extent. Excellent performance and stability. This is my preferred filesystem.

Thanks for your answer! It is good to know that big extents do not cause problems!

Userlevel 7
Badge +13

For those, interested in experiences with XFS, check this new post:

https://forums.veeam.com/veeam-backup-replication-f2/we-did-some-refs-vs-xfs-tests-t72011.html#p400824

Good quality and performance of XFS in a real-world deployment.

Userlevel 7
Badge +13

I have installed a SOBR with three XFS extents in our largest VCC datacenter. It works extremely well and we have not had issues. We did mix it with ReFS, but that was a bad idea, so we created a new SOBR for XFS alone, with no mixing.

We also try to keep SOBRs at 2-3 extents maximum and then create new ones as needed.

Thanks Chris!

What were the problems when mixing XFS with ReFS in a SOBR?

Just that XFS and ReFS did not play well together, so it is best to keep them separate. It has been rock solid since.

Thanks @Chris.Childerhose ! So not mixing ReFS and XFS in a SOBR will be part of our best practices!

 

Userlevel 7
Badge +13

We’ve had ~30 Ubuntu 20.04 repos in production since the v10 launch in 2020, all with reflink enabled; the majority are at client sites, plus a few ~500TB extents in SOBRs at our data centers. They have all been rock solid, which is to be expected from Linux.

  • stability - no issues at all 
  • performance (over time) - hasn’t changed in a year; only limited by the disk IOPS your hardware can deliver
  • needs for troubleshooting - no troubleshooting over the past year
  • administrative effort - Linux experience or being a fast learner is important for the initial design and deployment; hardware monitoring/testing to determine the number of concurrent jobs your hardware can handle requires some effort in the beginning, but that applies to any OS; after that, cron jobs take care of updates, and we schedule a manual reboot of the repos whenever an update notifies us that it requires one
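The cron-driven update routine in the last bullet could look roughly like this (Ubuntu assumed; the script path and schedule are made-up examples, not a Veeam recommendation):

```shell
# Sketch of a weekly update job for Ubuntu repo servers. The apt commands
# are shown as comments; only the reboot check below actually runs here.
#
#   crontab entry:  0 3 * * 0  /usr/local/sbin/repo-update.sh
#   repo-update.sh: apt-get update && \
#                   DEBIAN_FRONTEND=noninteractive apt-get -y dist-upgrade
#
# Ubuntu drops a marker file when an update requires a reboot - the signal
# mentioned above for scheduling a manual reboot of the repo:
if [ -f /var/run/reboot-required ]; then
    echo "reboot required on $(hostname)"
else
    echo "no reboot required"
fi
```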

Thank you @gtelnet for your very detailed answer! I am very relieved that the feedback about XFS is so positive. So v11 with hardened repos can come! 
