ReFS issues with latest Windows Server Updates (KB5009624, KB5009557, KB5009555)


Userlevel 7
Badge +13

Just a quick post: the latest Windows Server updates for 2012R2, 2019 and 2022 (haven’t seen 2016) can cause ReFS issues. After the installation, ReFS volumes are shown as RAW and are no longer accessible, so if you’re using ReFS for your repositories, be aware of this when installing the update. Besides that, those updates can also cause boot loops on domain controllers and break Hyper-V.

Depending on your Windows Server version one of the following updates could have caused the issue: KB5009624, KB5009557, KB5009555

If you’re affected, removing the mentioned update should solve the issue. Don’t (!) try to repair the ReFS volume, because this could cause data loss.
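
To check whether a server has one of these updates, a small script can scan a pasted hotfix listing for the three KB numbers. This is only a sketch: the function name and the sample listing (host name, dates, and the extra KB) are made up for illustration, and the listing is assumed to be text copied from something like PowerShell's Get-HotFix.

```python
import re
from typing import List

# KB numbers named in this post; a snapshot from the thread, not an authoritative catalogue.
PROBLEM_KBS = {"KB5009624", "KB5009557", "KB5009555"}


def find_problem_updates(hotfix_listing: str) -> List[str]:
    """Return the problem KBs that appear in a pasted hotfix listing, sorted."""
    found = re.findall(r"KB\d{6,7}", hotfix_listing, flags=re.IGNORECASE)
    installed = {kb.upper() for kb in found}
    return sorted(installed & PROBLEM_KBS)


# Illustrative listing only (hypothetical host, dates and second KB).
sample = """
Source   Description      HotFixID   InstalledOn
SRV01    Security Update  KB5009557  1/12/2022
SRV01    Update           KB1234567  1/12/2022
"""

if __name__ == "__main__":
    print(find_problem_updates(sample))
```

An empty result only means none of the three KBs appear in the pasted text; it says nothing about later updates mentioned further down the thread.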

This should remind everyone why 3-2-1 for Backups is so important. :wink:

Further information:

https://www.bleepingcomputer.com/news/microsoft/new-windows-server-updates-cause-dc-boot-loops-break-hyper-v/

https://forums.veeam.com/veeam-backup-replication-f2/beware-possible-raw-refs-volumes-after-installing-january-updates-t78634.html

Update #1:

Microsoft has pulled the updates, thanks @Mildur 

https://www.bleepingcomputer.com/news/microsoft/microsoft-pulls-new-windows-server-updates-due-to-critical-bugs/amp/

Update #2:

Microsoft has released the updates again and it looks like they didn't fix them. (Thanks @MicoolPaul )

https://www.bleepingcomputer.com/news/microsoft/microsoft-resumes-rollout-of-january-windows-server-updates/

Update #3:

Microsoft has released an out-of-band update, which can possibly resolve the issues:

https://www.bleepingcomputer.com/news/microsoft/microsoft-releases-emergency-fixes-for-windows-server-vpn-bugs/

 


106 comments

Userlevel 7
Badge +13

FYI the out-of-band update, KB5010794, is still breaking ReFS for 2012R2.

You probably need to install both updates, one after the other, although it’s strange that the out-of-band update also introduces the issue…

By the way, just for myself, why did you decide to go with ReFS on 2012R2?

A year or so ago we were in the process of rebuilding our storage infrastructure and creating a new scale-out repository in prep for Wasabi’s immutable functionality. We figured we would take advantage of the features of ReFS while we were at it.

If you were rebuilding the backup repositories, did you think about upgrading to 2016 or 2019? This would have given you the advanced ReFS integration.

Userlevel 7
Badge +13

Thanks @MicoolPaul and @marcofabbri ! For me, Windows Server 2016 was the first contact with ReFS, as Veeam started to integrate it. With 2012R2 I felt that it wasn’t mature enough for production. You’re right that the integrity part is/was an advantage, but without Storage Spaces you only had the corruption detection.

It was the same for me @regnor 

But I hope that @Asmith can resolve his issue..

Userlevel 7
Badge +20

Yeah, we never looked at it until 2016 either, since it was still new in 2012R2. Also, by the time they came out with 2016 the product had matured and was much better. I did not like anything below 2016, other than maybe 2008R2. :joy:

Userlevel 5
Badge

Latest update from MS Support: case is being archived with no resolution at the moment. Advice currently is, if the issue occurs after installing the cumulative patch, you have two options:

- uninstall the patch and copy the data to an NTFS volume and re-install the patch

- refrain from installing the patch entirely

The MS Support engineer even made a personal note:

In any case if you use NTFS I personally recommend using NTFS instead of ReFS, as ReFS is still immature as a file system as has several bugs.


I did let them know my frustration about this issue, since they broke it and still don't have a solution or even a cause after several weeks.

Userlevel 7
Badge +20

Wow that is quite the response.  Passing the buck.  Sheesh :rolling_eyes:

Userlevel 7
Badge +17

This response is… wow :scream: . It leaves a very uncomfortable feeling at the moment.

In this case they should drop ReFS completely.

Userlevel 7
Badge +13

“In any case if you use NTFS I personally recommend using NTFS instead of ReFS, as ReFS is still immature as a file system as has several bugs.”

Wow.

So @Franc, if you uninstall that update, can you access the data again?

Userlevel 5
Badge

Yep, uninstalling the patch restores access. That specific machine is running Windows Server 2022.

Userlevel 7
Badge +20

Agreed, this engineer is definitely overstepping by presenting their own personal views in such a way. It's likely out of frustration, as Microsoft will have been the ones telling people to just uninstall the patch.

 

When Microsoft promised us single “cumulative” updates to improve their QA, this is the trade-off we accepted: you get all or nothing with their patches. The choice is now between production-breaking updates and exposure to zero-day exploits.

 

The ReFS comment makes me feel uneasy also, but this is why we need to plan for file system issues in our 3-2-1-1-0, as this patch, whilst bad, has the ability to be reversed. What happens next time when the patch corrupts the partitions beyond repair?

 

We need to treat this as an important lesson, and be glad that Veeam have been helping us break free of Microsoft’s chains with the use of Linux operating systems.

 

If I’m allowed to dream, I’d love to see XFS on What nodes, giving Microsoft some actual competition.

Userlevel 2

Can people on here please confirm whether they are using ReFS on VMware VMs (where the hotplug feature makes the drives appear as removable)?
The VMware fix to overcome that is https://kb.vmware.com/s/article/1012225?lang=en_us

Or are some of your systems really using removable drives?

For WS2012R2 using ReFS v1, the OOB fix does not, and never will, overcome the issue on removable drives.
The Microsoft OOB fix is only for ReFS 3.x on removable drives in WS2016 and later.
Note: some systems running later OSes may be using old disks formatted with ReFS v1, so those ReFS v1 disks will be affected if they are considered removable.

fsutil fsinfo refsinfo x:   will show the ReFS version,
although of course you can't run that if the volume is already showing RAW!

KB5010691: ReFS-formatted removable media may fail to mount or mounts as RAW after installing the January 11, 2022 Windows updates (microsoft.com)
It mentions the OOB fixes but unfortunately doesn't link to them.
It says it is applicable to ReFS v2 (aka 3.x).
Unfortunately, it doesn't mention the VMware aspect or the 2012R2/ReFS v1 aspect.

Userlevel 5
Badge

We are running vSphere 7 and have the issue on two VMs. This drive is shown as fixed. The VM runs Windows server 2022.

Userlevel 2

And still having the issue after applying the OOB update as well ?

Userlevel 5
Badge

Yep

Userlevel 2

Our 2012r2 machine I’ve been referencing is on VMware. Host is running 6.7.

Userlevel 7
Badge +13

Nice sharing of info, guys. This is the power of this community.

Userlevel 2

The only solution for 2012R2 (running on VMware) is to set
devices.hotplug to a value of false,

as mentioned in Disabling the HotAdd/HotPlug capability in virtual machines (1012225) (vmware.com)
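
As a rough illustration of what that setting looks like in a VM's configuration file, the sketch below adds or replaces the devices.hotplug entry in .vmx-style text. It is an assumption-laden example: the helper name and the sample file contents are hypothetical, key names are compared case-insensitively only as a precaution, and the VMware KB itself should be followed for the supported procedure (as far as I know, the change is made with the VM powered off).

```python
def set_vmx_option(vmx_text: str, key: str, value: str) -> str:
    """Return vmx_text with `key = "value"` set, replacing an existing entry or appending one."""
    new_line = f'{key} = "{value}"'
    out = []
    replaced = False
    for line in vmx_text.splitlines():
        name = line.split("=", 1)[0].strip()
        if name.lower() == key.lower():
            # Keep a single entry for the key; drop any later duplicates.
            if not replaced:
                out.append(new_line)
                replaced = True
        else:
            out.append(line)
    if not replaced:
        out.append(new_line)
    return "\n".join(out) + "\n"


# Hypothetical .vmx fragment, for illustration only.
sample_vmx = 'displayName = "backup01"\nscsi0:1.present = "TRUE"\n'

if __name__ == "__main__":
    print(set_vmx_option(sample_vmx, "devices.hotplug", "false"))
```

Working on a copy of the text rather than the live file keeps the change reviewable before anything is applied to a real VM.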

 

Userlevel 7
Badge +13

Tonight Microsoft started releasing a “fix for everything”:


https://www.catalog.update.microsoft.com/Search.aspx?q=%092022-01%20Cumulative%20Update%20Preview

Userlevel 7
Badge +13

Thanks @stephc_msft for jumping into the discussion. I’m wondering if disabling hotadd fixes the issue. Does this mean for a virtualized Windows it looks like the disk is external, with hotadd enabled?

@Everyone: If you set this configuration on a virtual backup server, hotadd transport mode will probably no longer work?

Userlevel 5
Badge

The only solution for 2012R2 (running on VMware) is to set
devices.hotplug to a value of false,

as mentioned in Disabling the HotAdd/HotPlug capability in virtual machines (1012225) (vmware.com)

 

This setting indeed solves the issue for us on a Windows Server 2019 machine. Now the ReFS volume is accessible again after installing the updates. However, other VMs which still have accessible ReFS volumes after installing the patches don't have this configuration setting and they also have the eject option on the taskbar for the virtual disk. So, although the devices.hotplug setting fixes the issue, it's not the whole story.

 

Thanks @stephc_msft ! Questionable though why MS Support isn't aware of this setting.

Userlevel 5
Badge

@stephc_msft  When running the fsutil command on our affected server it returns:

 

REFS Volume Serial Number :       0xb660b96e60b935c9
REFS Version   :                  1.2
Number Sectors :                  0x0000000004fe0000
Total Clusters :                  0x000000000009fc00
Free Clusters  :                  0x0000000000049906
Total Reserved :                  0x0000000000000000
Bytes Per Sector  :               512
Bytes Per Physical Sector :       512
Bytes Per Cluster :               65536
Checksum Type:                    CHECKSUM_TYPE_NONE

So it looks like we have the old ReFS version although the system is running Windows Server 2019. Why is this volume still on the old version, shouldn't it be upgraded automatically? Other Windows 2019 machines which don't have the issue have ReFS version 3.4.
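
For anyone checking several volumes, the fsutil output above is easy to parse. The following is a minimal sketch, assuming the output has been captured as text in the "Name : Value" layout shown in this post; the function names are made up for the example.

```python
def parse_refsinfo(output: str) -> dict:
    """Parse 'Name : Value' lines from `fsutil fsinfo refsinfo` output into a dict."""
    info = {}
    for line in output.splitlines():
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        info[key.strip()] = value.strip()
    return info


def refs_major_version(info: dict) -> int:
    """Return the major ReFS version, e.g. 1 for '1.2' or 3 for '3.4'."""
    return int(info["REFS Version"].split(".")[0])


# Output as posted above.
sample_output = """\
REFS Volume Serial Number :       0xb660b96e60b935c9
REFS Version   :                  1.2
Number Sectors :                  0x0000000004fe0000
Total Clusters :                  0x000000000009fc00
Free Clusters  :                  0x0000000000049906
Total Reserved :                  0x0000000000000000
Bytes Per Sector  :               512
Bytes Per Physical Sector :       512
Bytes Per Cluster :               65536
Checksum Type:                    CHECKSUM_TYPE_NONE
"""

if __name__ == "__main__":
    info = parse_refsinfo(sample_output)
    print(info["REFS Version"], refs_major_version(info))
```

As noted earlier in the thread, fsutil cannot be run against a volume that is already showing as RAW, so this only helps before patching or after the update has been removed.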

Userlevel 7
Badge +13

It should be at least 3.4 according to

https://gist.github.com/0xbadfca11/da0598e47dd643d933dc

Userlevel 5
Badge

Yep, but the question is why isn't it upgraded to the latest version automatically? I didn't format it with the ReFSv1 parameter as specified in the GitHub article. Is there a way to manually upgrade it to the latest version?

Userlevel 7
Badge +13

Version 1.2 is the default if the volume was formatted by Windows 8.1, Windows 10 up to v1607, or Windows Server 2012 R2, and on Windows Server 2016 only if ReFSv1 was specified. I think that was the case here.

In theory, once a ReFS volume is mounted on a system that supports a newer ReFS version, it is automatically upgraded to the latest possible version.

WARNING: after update, you can’t go back.

Userlevel 5
Badge

Yeah, but we for sure didn't format it with ReFSv1. How can I convert it to a higher version so that we don't have the issue any more after installing the patch?

 

I'm going to detach the disk from this specific Win 2019 server and attach it to another Win 2019 server and see if it's upgraded.

 

EDIT: sigh, attaching the disk to another Windows 2019 VM doesn't work. The volume isn't automatically upgraded to a ReFS version higher than 1.2. Adding the devices.hotplug parameter is the only solution for now.

Userlevel 2

My understanding is that only older v3.x volumes are upgraded to 3.y, e.g. 3.1 to 3.4, or even to 3.7 now.
New volumes get the relevant 3.y version.
Older disks that were set up when originally attached to WS2012R2 will remain v1.2 and can still be used in the later OS (the OS has a separate ReFS v1 driver to handle them).
But some customers say they are not using old/existing disks and yet they are v1.2 somehow, and I'm not sure how that can be.
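
Pulling together what has been said in this thread, the reasoning can be written down as a small decision helper. This is a sketch of the thread's current understanding, not official guidance: the function name and return strings are invented, and at least one report above (a fixed disk on Windows Server 2022 still failing after the OOB update) does not fit this simple picture.

```python
def triage(refs_version: str, seen_as_removable: bool) -> str:
    """Suggest a next step based on the ReFS version and whether the OS treats the disk as removable."""
    major = int(refs_version.split(".")[0])
    if not seen_as_removable:
        # The thread has reports of fixed disks being affected too, so this is not conclusive.
        return "not the removable-media case discussed here; investigate further"
    if major == 1:
        # Per the thread: the OOB fix does not cover ReFS v1 on removable drives.
        return "ReFS v1: OOB update will not help; consider devices.hotplug = false on VMware"
    # Per the thread: the OOB fix targets ReFS 3.x on removable drives (WS2016 and later).
    return "ReFS 3.x: the OOB update is meant to cover this case"


if __name__ == "__main__":
    print(triage("1.2", True))
    print(triage("3.4", True))
```

The version string can come from fsutil fsinfo refsinfo, as long as the volume is still mountable.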


 
