ReFS issues with latest Windows Server Updates (KB5009624, KB5009557, KB5009555)


Userlevel 7
Badge +6

Just a quick post: the latest Windows Server updates for 2012 R2, 2019 and 2022 (I haven’t seen reports for 2016) can cause ReFS issues. After the installation, ReFS volumes are shown as RAW and are no longer accessible, so if you’re using ReFS for your repositories, be aware of this when installing the update. Besides that, those updates can also cause boot loops on domain controllers and break Hyper-V.

Depending on your Windows Server version, one of the following updates could have caused the issue: KB5009624 (2012 R2), KB5009557 (2019), KB5009555 (2022)

If you’re affected, removing the mentioned update should solve the issue. Don’t (!) try to repair the ReFS volume, because this could cause data loss.
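For anyone scripting the removal, here is a minimal Python sketch that builds the uninstall command per server version. The helper name `uninstall_command` and the `FAULTY_UPDATES` table are made up for illustration, and the sketch assumes that `wusa.exe` with its `/uninstall`, `/kb:`, `/quiet` and `/norestart` switches is the removal route you want; try it on a test box first.

```python
# Hypothetical lookup table: Windows Server release -> January 2022 update
# named in this post. Adjust it if your servers report something different.
FAULTY_UPDATES = {
    "2012 R2": "KB5009624",
    "2019": "KB5009557",
    "2022": "KB5009555",
}


def uninstall_command(server_version):
    """Build the wusa.exe argument list that removes the faulty update."""
    kb = FAULTY_UPDATES[server_version]  # raises KeyError for unknown versions
    kb_number = kb[2:]                   # wusa expects the number without "KB"
    return ["wusa.exe", "/uninstall", "/kb:" + kb_number, "/quiet", "/norestart"]


if __name__ == "__main__":
    # Only prints the commands; nothing is executed here.
    for version in FAULTY_UPDATES:
        print(version, "->", " ".join(uninstall_command(version)))
```

The returned list can be handed to `subprocess.run` from an elevated session. Because of `/norestart`, the reboot is left to you.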

This should remind everyone why 3-2-1 for Backups is so important. :wink:

Further information:

https://www.bleepingcomputer.com/news/microsoft/new-windows-server-updates-cause-dc-boot-loops-break-hyper-v/

https://forums.veeam.com/veeam-backup-replication-f2/beware-possible-raw-refs-volumes-after-installing-january-updates-t78634.html

Update #1:

Microsoft has pulled the updates, thanks @Mildur 

https://www.bleepingcomputer.com/news/microsoft/microsoft-pulls-new-windows-server-updates-due-to-critical-bugs/amp/

Update #2:

Microsoft has released the updates again and it looks like they didn't fix them. (Thanks @MicoolPaul )

https://www.bleepingcomputer.com/news/microsoft/microsoft-resumes-rollout-of-january-windows-server-updates/

Update #3:

Microsoft has released an out-of-band update, which can possibly resolve the issues:

https://www.bleepingcomputer.com/news/microsoft/microsoft-releases-emergency-fixes-for-windows-server-vpn-bugs/

 


94 comments

Userlevel 7
Badge +5

ReFS issues, DC boot loops and broken Hyper-V

We were just talking about this a few minutes ago.

More info here: https://www.bleepingcomputer.com/news/microsoft/new-windows-server-updates-cause-dc-boot-loops-break-hyper-v/

 

Userlevel 7
Badge +4

Microsoft has now pulled the updates from the update servers:

https://www.bleepingcomputer.com/news/microsoft/microsoft-pulls-new-windows-server-updates-due-to-critical-bugs/amp/

Userlevel 4
Badge

We decided to decline the updates from our WSUS after seeing the thread on Reddit before MS pulled them. 
https://www.reddit.com/r/sysadmin/comments/s1jcue/patch_tuesday_megathread_20220112/
 

The servers that got patched don’t show any errors.

Userlevel 7
Badge +4

It’s a good start to 2022. Too many things go sideways. :)

Userlevel 7
Badge +4

@Mildur But so far we haven’t experienced any good things since 12/31/2021 from them; unless you count everything which is not broken or doesn’t have issues as positive :wink:

 

True. I was talking more generally. Not only in 2022, and not only from Microsoft.

Bad things are more discussed than good things :)

 

Userlevel 7
Badge +8

Bad things are more discussed than good things :)

 

Nope that’s it, Microsoft took it too far this morning! Decided my headset and webcam were untrustworthy (or maybe they meant the person on them, aka me!) 😆

Userlevel 7
Badge +8

Think it’s for the best… 😬

Userlevel 4

KB5009555 for Windows 2022 doesn’t seem to have been pulled. My test server still downloads this update from Windows Update. However, this is the faulty update which causes the raw ReFS issue.

Userlevel 7
Badge +8

All they’ve done is add the problems as known issues… oh Microsoft…

Userlevel 2

FYI the out-of-band update, KB5010794, is still breaking ReFS for 2012 R2.

Userlevel 2

Unfortunately, we're just finding out about this. The entire ReFS volume appears to be shot even after uninstalling the update. Has anyone else found that uninstalling doesn't resolve the issue?

 

***Update - after uninstalling BOTH KB5009624 and KB5009595, our ReFS volumes are back. We're running Server 2012 R2 Datacenter.

Userlevel 2

So possibly it’s the combination of installing the past ‘bad’ update and KB5010794 together that is needed? We blocked the original so we had KB5010794 install by itself. From your link:
 

So, I went through with it today (I read up on it and held off for a long time).
Thank you to everyone here who helped share information.
The new patches were used: KB5010791 (Server 2019), KB5010790 (Server 2016) and KB5010794 (Server 2012 R2)
1x 2012 R2 server (DC)
4x 2019 servers (DCs)
4x 2016 servers (RODCs)
All servers were patched up to the December Patch Tuesday updates.
First the 4x 2019 servers with KB5010791 (because it's a cumulative update, pulled from the Update Catalog), then the 2012 R2 via Windows Update; here I selected the 'bad' January update from 01/11/22 TOGETHER with the optional KB5010794 and installed both at once.
Finally the 4x 2016 servers with KB5010790 from the Update Catalog.
Everything is running - lucky :)

Userlevel 7
Badge +8

Latest update from MS Support: case is being archived with no resolution at the moment. Advice currently is, if the issue occurs after installing the cumulative patch, you have two options:

- uninstall the patch and copy the data to an NTFS volume and re-install the patch

- refrain from installing the patch entirely

The MS Support engineer even made a personal note:

In any case if you use NTFS I personally recommend using NTFS instead of ReFS, as ReFS is still immature as a file system as has several bugs.


I did let them know my frustration about this issue, since they broke it and still don't have a solution or even a cause after several weeks.

This response is… wow :scream: . It leaves a very uncomfortable feeling at the moment.

In this case they should drop ReFS completely.

Agreed, this engineer is definitely overstepping by presenting their own personal views in such a way. Likely out of frustration as Microsoft will have been the ones stating to just uninstall the patch.

 

When Microsoft promised us singular “cumulative” updates to improve their QA, this is the trade-off we are now experiencing: you get all or nothing with their patches. Production-breaking updates vs. exposure to zero-day exploits is now the choice to make.

 

The ReFS comment makes me feel uneasy also, but this is why we need to plan for file system issues in our 3-2-1-1-0, as this patch, whilst bad, has the ability to be reversed. What happens next time when the patch corrupts the partitions beyond repair?

 

We need to treat this as an important lesson, and be glad that Veeam have been helping us break free of Microsoft’s chains with the use of Linux operating systems.

 

If I’m allowed to dream, I’d love to see XFS on What nodes, giving Microsoft some actual competition.

Userlevel 2

Can people on here please confirm whether they are using ReFS on VMware VMs (where the hotplug feature makes the drives appear as removable)?
The VMware fix to overcome that is https://kb.vmware.com/s/article/1012225?lang=en_us

Or are some of your systems really using removable drives?

For WS2012R2 using ReFS v1, the OOB fix does not and never will overcome the issue on removable drives.
The Microsoft OOB fix is only for ReFS 3.x on removable drives in WS2016 and later.
Note: some systems running later OSes may be using old disks formatted with ReFS v1, so those ReFS v1 disks will be affected if they are considered removable.

fsutil fsinfo refsinfo x:   will show the ReFS version,
although of course you can't run that if the volume is already showing RAW!

KB5010691: ReFS-formatted removable media may fail to mount or mounts as RAW after installing the January 11, 2022 Windows updates (microsoft.com)
It mentions the OOB fixes but unfortunately doesn't link to them.
It mentions that it applies to ReFS v2 (aka 3.x).
Unfortunately, it doesn't mention the VMware aspect or the 2012 R2/ReFS v1 aspect.

Userlevel 2

The only solution for 2012 R2 (running on VMware) is to set
devices.hotplug to a value of false,

as described in Disabling the HotAdd/HotPlug capability in virtual machines (1012225) (vmware.com)
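To check a batch of VMs for this setting, a small Python sketch like the following could scan the text of a .vmx file. It assumes the usual `key = "value"` line layout of .vmx files, and the function name `hotplug_disabled` is invented for this example; it only reads text and changes nothing.

```python
def hotplug_disabled(vmx_text):
    """Return True if the given .vmx text sets devices.hotplug to false.

    Assumes one 'key = "value"' pair per line. The first devices.hotplug
    line found decides the result; a missing key counts as not disabled.
    """
    for raw_line in vmx_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key/value line
        if key.strip().lower() == "devices.hotplug":
            return value.strip().strip('"').lower() == "false"
    return False


if __name__ == "__main__":
    # Illustrative snippet, not taken from a real VM.
    sample = 'memsize = "4096"\ndevices.hotplug = "FALSE"\n'
    print(hotplug_disabled(sample))
```

A VM where this returns False still has hot-add/hot-plug enabled, so its virtual disks can show up as removable inside the guest.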

 

Userlevel 4


This setting indeed solves the issue for us on a Windows Server 2019 machine. Now the ReFS volume is accessible again after installing the updates. However, other VMs which still have accessible ReFS volumes after installing the patches don't have this configuration setting and they also have the eject option on the taskbar for the virtual disk. So, although the devices.hotplug setting fixes the issue, it's not the whole story.

 

Thanks @stephc_msft ! Questionable though why MS Support isn't aware of this setting.

Userlevel 7
Badge +5

Microsoft must change devs; they've been doing everything wrong lately.

Userlevel 4


@stephc_msft  When running the fsutil command on our affected server it returns:

 

REFS Volume Serial Number :       0xb660b96e60b935c9
REFS Version   :                  1.2
Number Sectors :                  0x0000000004fe0000
Total Clusters :                  0x000000000009fc00
Free Clusters  :                  0x0000000000049906
Total Reserved :                  0x0000000000000000
Bytes Per Sector  :               512
Bytes Per Physical Sector :       512
Bytes Per Cluster :               65536
Checksum Type:                    CHECKSUM_TYPE_NONE

So it looks like we have the old ReFS version although the system is running Windows Server 2019. Why is this volume still on the old version, shouldn't it be upgraded automatically? Other Windows 2019 machines which don't have the issue have ReFS version 3.4.
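If you want to check many volumes, the fsutil output above is easy to parse. The following Python sketch uses the output posted here as sample input; the function names are made up for illustration, and the "3.x or later" rule simply encodes the claim made earlier in this thread that the OOB fix only covers ReFS 3.x.

```python
SAMPLE_OUTPUT = """\
REFS Volume Serial Number :       0xb660b96e60b935c9
REFS Version   :                  1.2
Number Sectors :                  0x0000000004fe0000
Total Clusters :                  0x000000000009fc00
Free Clusters  :                  0x0000000000049906
Total Reserved :                  0x0000000000000000
Bytes Per Sector  :               512
Bytes Per Physical Sector :       512
Bytes Per Cluster :               65536
Checksum Type:                    CHECKSUM_TYPE_NONE
"""


def parse_refsinfo(output):
    """Turn 'name : value' lines into a dict of stripped strings."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def oob_fix_applies(fields):
    """True for ReFS 3.x and later, per the claim made in this thread."""
    major = int(fields["REFS Version"].split(".")[0])
    return major >= 3


def volume_bytes(fields):
    """Volume size: total clusters (hex) times bytes per cluster (decimal)."""
    return int(fields["Total Clusters"], 16) * int(fields["Bytes Per Cluster"])


if __name__ == "__main__":
    info = parse_refsinfo(SAMPLE_OUTPUT)
    print(info["REFS Version"], oob_fix_applies(info), volume_bytes(info))
```

For the volume above this reports version 1.2, which is the old on-disk format the OOB fix is said not to cover.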

Userlevel 7
Badge +7

Thanks for sharing @regnor! Really bad bugs that shouldn't exist! 

The other day a customer told me he was suffering from regular reboots of his new 2022 test server. The guess was that it had to do with the trial version. No, it's a feature … eh … bug

Userlevel 7
Badge +6

Bad things are more discussed than good things :)

 

You’re definitely right about that :wink:

Nope that’s it, Microsoft took it too far this morning! Decided my headset and webcam were untrustworthy (or maybe they meant the person on them, aka me!) 😆

Better check with a different person, just to be sure :wink:

But perhaps it’s a security feature? Microphones and webcams can spy on you, so better throw it all away :laughing:

@Iams3le I’m sure this won’t be the last time we see such issues; just hope it won’t be in January…

@vNote42 Reboots tend to solve problems and make the system more stable; so you’re spot on with “it’s a feature” :nerd:

Userlevel 4

Okay, so the only way to get to the new ReFS version is to create a new disk and copy the data from the 1.2 to the 3.x volume? There’s no way to manually upgrade to the new ReFS version?

Userlevel 7
Badge +4

aww bad patch MS

//VPN /IPSec Issue: Certain IPSEC connections might fail

 

//Domain Controller rebooting issue

 

//Hyper V issue

Userlevel 1

I read that ReFS volumes can fall over when they fill up. Is that true, or still true?

Never heard of this, let’s see what the community has to say about it.

Yeah I just re-found the article. ServerFault. May 2019. “When ReFS volumes get filled up to 60-80% they tend to lock up. If you have ReFS user data hashing enabled situation turns even worse.”

So I guess people have gone over that with no issues … ?

Userlevel 7
Badge +8

KB5009555 for Windows 2022 doesn’t seem to have been pulled. My test server still downloads this update from Windows Update. However, this is the faulty update which causes the raw ReFS issue.

That’s because Microsoft released them AGAIN on a FRIDAY 🤯

 

https://www.bleepingcomputer.com/news/microsoft/microsoft-resumes-rollout-of-january-windows-server-updates/

Userlevel 7
Badge +8

All they’ve done is add the problems as known issues… oh Microsoft…

What else should they do...fix the update?🤣

That’s crazy talk! What next, QA testing?
