ReFS issues with latest Windows Server Updates (KB5009624, KB5009557, KB5009555)




What OS?
If it is 2012 R2 on a platform like VMware, where the disks have the characteristics of removable drives, you will always have issues.
With the patches removed and the volume visible, can you check the ReFS version number:
fsutil fsinfo refsinfo x:
If it's v1.2, that is the old version, which will never work on ‘removable’ drives now.
v3.x should be OK with the latest updates (since Feb).

If this is a VMware system with ReFS v1.2 disks, then the only option is a VMware configuration change to disable hotplug:

Disabling the HotAdd/HotPlug capability in virtual machines (vmware.com)


If none of the above applies, please see whether the ReFS event log (under Applications and Services Logs) says anything, e.g. about the version number.
E.g. if the v3.x ReFS disks have somehow been attached to a later OS and updated to a later v3.x version, they will no longer be readable if you put them back on the earlier OS.
 

 

Funny that this popped up again.  A couple days ago I ran across another 2012 R2 machine that was showing the REFS volumes as RAW and disabling hotplug worked perfectly without having to uninstall the offending KB that was applied to the server.  In the past, I had a client that this happened to a couple of times and I did both at the time so wasn’t sure which fixed it.  Can confirm disabling hotplug worked perfectly!


Microsoft must change devs; they've been doing everything wrong lately.

Not everything. Problem is, everybody talks only about the bad things :)

 


@Mildur But so far we haven’t experienced any good things from them since 12/31/2021, unless you count everything which is not broken or doesn’t have issues as positive :wink:

 

True. I was speaking more generally. Not only in 2022, and not only about Microsoft.

Bad things are more discussed than good things :)

 

Nope that’s it, Microsoft took it too far this morning! Decided my headset and webcam were untrustworthy (or maybe they meant the person on them, aka me!) 😆

haha :D


I can confirm that, at the time of writing, Microsoft hasn't recognised this vulnerability.


Can people on here please confirm whether they are using ReFS on VMware VMs (where the hotplug feature makes the drives appear as removable)?
The VMware fix to overcome that is https://kb.vmware.com/s/article/1012225?lang=en_us

Or are some of your systems really using removable drives?

For WS2012R2 using ReFS v1, the OOB fix does not, and never will, overcome the issue on removable drives.
The Microsoft OOB fix is only for ReFS 3.x on removable drives in WS2016 and later.
Note: some systems running later OSes may be using old disks formatted with ReFS v1, so those ReFS v1 disks will be affected if they are considered removable.
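
For anyone who wants the rule above as a quick check, here is a minimal Python sketch. It encodes only what is stated in this post; the function name, the return strings, and the use of a release year to identify the OS are illustrative assumptions, not anything from Microsoft.

```python
def expected_outcome(refs_major: int, server_year: int, removable: bool) -> str:
    """Classify a volume according to the rule described in the post above.

    refs_major  -- major ReFS version of the volume (1 or 3)
    server_year -- Windows Server release year, e.g. 2012, 2016, 2019
    removable   -- True if the OS sees the disk as removable (e.g. hotplug on VMware)
    """
    if not removable:
        # The rule in the post only talks about removable drives.
        return "not covered by this rule"
    if refs_major == 3 and server_year >= 2016:
        # "The Microsoft OOB fix is only for ReFS 3.x ... in WS2016 and later"
        return "OOB fix applies"
    if refs_major == 1:
        # ReFS v1 on a removable drive stays affected, regardless of OS version.
        return "OOB fix does not help"
    return "unknown"


if __name__ == "__main__":
    # A WS2019 host with an old v1.2 volume that appears removable.
    print(expected_outcome(1, 2019, True))
```

In the "OOB fix does not help" case, the remaining option mentioned in this thread is to disable hotplug so the disk no longer appears as removable.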

 

@stephc_msft  When running the fsutil command on our affected server it returns:

 

REFS Volume Serial Number :       0xb660b96e60b935c9
REFS Version   :                  1.2
Number Sectors :                  0x0000000004fe0000
Total Clusters :                  0x000000000009fc00
Free Clusters  :                  0x0000000000049906
Total Reserved :                  0x0000000000000000
Bytes Per Sector  :               512
Bytes Per Physical Sector :       512
Bytes Per Cluster :               65536
Checksum Type:                    CHECKSUM_TYPE_NONE

So it looks like we have the old ReFS version although the system is running Windows Server 2019. Why is this volume still on the old version, shouldn't it be upgraded automatically? Other Windows 2019 machines which don't have the issue have ReFS version 3.4.
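
If you need to check many servers, the version line can be pulled out of the `fsutil fsinfo refsinfo` output with a few lines of Python. This is only a sketch that assumes the output looks like the sample above; the function name `refs_version` is made up for illustration.

```python
import re
from typing import Optional, Tuple

# Excerpt of the fsutil output quoted above, used here as sample input.
SAMPLE = """REFS Volume Serial Number :       0xb660b96e60b935c9
REFS Version   :                  1.2
Bytes Per Cluster :               65536
"""


def refs_version(fsutil_output: str) -> Optional[Tuple[int, int]]:
    """Return (major, minor) from a 'REFS Version' line, or None if absent."""
    match = re.search(
        r"^\s*REFS Version\s*:\s*(\d+)\.(\d+)",
        fsutil_output,
        re.MULTILINE | re.IGNORECASE,
    )
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    version = refs_version(SAMPLE)
    if version is not None and version[0] == 1:
        print("ReFS v1 volume: affected if the disk appears removable")
```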

It should be at least 3.4 according to

https://gist.github.com/0xbadfca11/da0598e47dd643d933dc


 

@dloseke Are those virtual disks or physical volumes?

 

Excellent question!  In both cases, these are RDM disks and the underlying storage is an iSCSI volume presented to the ESXi hosts from a Synology NAS.


Yep, but the question is why it isn't upgraded to the latest version automatically. I didn't format it with the ReFSv1 parameter as specified in the GitHub article. Is there a way to manually upgrade it to the latest version?

1.2 is the default version if the volume was formatted by Windows 8.1, Windows 10 up to v1607, or Windows Server 2012 R2, and on Windows Server 2016 only if ReFSv1 was specified. I think that was the case.

In theory, once a ReFS volume is mounted on a system that supports a newer ReFS version, it is automatically upgraded to the latest possible version.

WARNING: after the upgrade, you can’t go back.


Yeah, but we for sure didn't format it with ReFSv1. How can I convert it to a higher version so that we don't have the issue any more after installing the patch?

 

I'm going to detach the disk from this specific Win 2019 server and attach it to another Win 2019 server and see if it's upgraded.

 

EDIT: sigh, attaching the disk to another Windows 2019 VM doesn't work. The volume isn't automatically upgraded to a ReFS version higher than 1.2. Adding the devices.hotplug parameter is the only solution for now.
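
To verify that the `devices.hotplug` setting is really in place on a VM, something like the following Python sketch could be run against the text of its .vmx file. It assumes the usual `key = "value"` line layout of .vmx files and the value `false` described in the VMware KB linked earlier in this thread; the function name is hypothetical.

```python
def hotplug_disabled(vmx_text: str) -> bool:
    """Return True if the text contains devices.hotplug set to false.

    Assumes each setting is on its own line in the form: key = "value".
    """
    for line in vmx_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == "devices.hotplug":
            return value.strip().strip('"').lower() == "false"
    # Setting not present, so hotplug is left at its default.
    return False


if __name__ == "__main__":
    example = 'displayName = "repo01"\ndevices.hotplug = "false"\n'
    print(hotplug_disabled(example))
```

The VM name `repo01` in the example is made up; only the `devices.hotplug` line matters.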


Okay, so the only way to get to the new ReFS version is to create a new disk and copy the data from the 1.2 to the 3.x volume? There’s no way to manually upgrade to the new ReFS version?

When you copy the data from volume to volume you will lose all block cloning savings….

True, but I can’t see other solutions...


I read that ReFS volumes can fall over when their used space gets high. Is that true, or still true?

Never heard of this, let’s see what the community has to say about it.


From Windows Server 2016 on, ReFS is recommended. Normally there are no issues with it. Before W2016, yes indeed. The advantages of ReFS over NTFS are big: you can use synthetic full backups, which lets you keep many more restore points on the same amount of storage, and they are much faster because pointers are used for identical blocks that are already on the storage. I would never go back to NTFS, except when using rotating USB disks; there I recommend NTFS over ReFS because GFS is not possible with the rotated-drive option.


Try to always connect iSCSI volumes directly inside the Windows VM rather than presenting them to the ESXi host and then using RDM disks. In my experience it works better and is less dependent on things like VMware.

 

I suppose that’s true.  I’ve been using RDMs because I have better multipathing capabilities, but I suppose that could be worked out with multiple NICs on the VM tied to certain port groups that are dedicated to certain physical NICs, maybe?  I guess I haven’t given it much consideration.  I will say that I hate the iSCSI initiator in Windows and the MPIO driver, but that’s just personal preference... the VMware iSCSI is easier to set up IMO.  But that doesn’t make it better either….


Apparently, this is still an issue. I had 6 updates automatically applied (KB5011564, KB5011560, KB5010462, KB5010419, KB5010395, KB5009595). I thought I had figured out which one was the culprit, but I was mistaken and did not have the time to narrow it down, so I removed all 6, rebooted, and the ReFS drive no longer showed as RAW. None of the KBs I listed above match the others in this thread. Has anyone else experienced this and found out which one it was?


We have not applied any of these updates to our ReFS servers due to this problem.  I am surprised MS has not addressed it yet and we cannot have our Veeam services down.


We have been employing the same approach as you and these shouldn’t have been applied but someone else had a different idea. I will need to isolate which update caused it, thanks for the response.


No problem - we use a patch management system so we don’t approve those updates for the specific servers so they never get installed.

We do too, but another engineer decided to release them. It is what it is and won’t happen again if I have anything to say about it!


I haven't heard any bad news for some time, so it will be interesting to see which update is causing your problem.


Welcome to the Veeam Community @ianv. The community is open for everyone, even if you're not a Veeam user :wink:

It’s hard to say what happened in your case and whether you experienced something different. I think the BSODs weren’t caused by this update; “only” the ReFS volumes turned inaccessible (RAW).

‘strongly suggest’ I contact Microsoft Premier Support Services.

I really love their support for such suggestions. They introduce an issue but won’t help you in any way or admit that it could be a bug unless you go through their Premier Support, which, by the way, is not easily accessible at all.


But no mention of the ReFS issues. Sigh….


So possibly it’s the combination of installing the past ‘bad’ update and KB5010794 together that is needed? We blocked the original so we had KB5010794 install by itself. From your link:
 

So, I pulled it through today (I informed myself and resisted for a long time).
Thank you to everyone here who helped share information.
The new patches were used: KB5010791 (2019 server), KB5010790 (2016 server) and KB5010794 (server 2012R2)
1x 2012R2 server (DC)
4x 2019 server (DCs)
4x 2016 server (RODCs)
All servers were at the latest patch level as of the December Patch Tuesday updates.
First the 4x 2019 servers with KB5010791 (because it's a cumulative update, pulled from the update catalog), then the 2012R2 via Windows Update, here I selected the 'bad' January update from 01/11/22 TOGETHER with the optional KB5010794 and installed both at once.
Finally the 4x 2016s with KB5010790 from the Update Catalog.
Everything is running - lucky :)

Maybe that’s it :thumbsup_tone1:


Thanks for the feedback @Asmith; great that you were able to solve it.

@Bommer I haven't heard of this one so far. You could try to uninstall the updates to see if they were the cause.

Has anyone found a solution for this on Windows 2012 R2, with the ReFS volume showing as RAW? I tried removing the bad patch and even installed the out-of-band patch, still no luck.


If it is still RAW after removing the Jan patch (patches?), that suggests there is some real ReFS corruption.
If on VMware, try the disable-hotplug method as a double check.
And/or try attaching the disk to an unpatched system (any OS version) as another check.

If it is real corruption, it is *probably* a coincidence that it happened at the time of patching.
There are some third-party tools that can scan disks/volumes and understand ReFS,
e.g.

https://www.r-studio.com/

https://www.isobuster.com/

https://www.diskgenius.com/

For WS2019 there is a built-in refsutil tool (which can also be copied to and run on WS2016, but I doubt it will work on WS2012R2).

Note: only attempt the recovery as a last resort, after confirming you are not hitting the known issues that make the volume appear RAW even though there is no corruption.

 


Thanks @MicoolPaul and @marcofabbri ! For me Windows Server 2016 was the first contact with ReFS, as Veeam started to integrate it. With 2012R2 I felt that it wasn’t mature enough for production. You’re right that the integrity part is/was an advantage, but without Storage Spaces you only had the corruption detection.

It was the same for me @regnor 

But I hope that @Asmith can resolve his issue.


Just a quick post: the latest Windows Server updates for 2012R2, 2019 and 2022 (haven’t seen 2016) can cause ReFS issues. After the installation, ReFS volumes are shown as RAW and are no longer accessible, so if you’re using ReFS for your repositories, be aware of this when installing the update. Besides that, those updates can also cause boot loops for domain controllers and break Hyper-V.

Depending on your Windows Server version one of the following updates could have caused the issue: KB5009624, KB5009557, KB5009555

If you’re affected, removing the mentioned update should solve the issue. Don’t (!) try to repair the ReFS volume, because this could cause data loss.
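
As a rough aid, here is a small Python sketch that flags the three updates named above in a list of installed KB IDs. How you obtain that list (patch management export, update history, etc.) is up to you; the function name and the sample input are made up for illustration.

```python
# The three January updates named in this post.
SUSPECT_KBS = {"KB5009624", "KB5009557", "KB5009555"}


def suspect_updates(installed_kbs):
    """Return the sorted subset of installed KB IDs that this post names."""
    normalized = {kb.strip().upper() for kb in installed_kbs}
    return sorted(normalized & SUSPECT_KBS)


if __name__ == "__main__":
    # Hypothetical list of installed updates on one server.
    installed = ["KB5008218", "kb5009557", " KB5010791 "]
    for kb in suspect_updates(installed):
        print(f"{kb} is installed: check whether ReFS volumes show as RAW")
```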

This should remind everyone why 3-2-1 for Backups is so important. :wink:

Further information:

https://www.bleepingcomputer.com/news/microsoft/new-windows-server-updates-cause-dc-boot-loops-break-hyper-v/

https://forums.veeam.com/veeam-backup-replication-f2/beware-possible-raw-refs-volumes-after-installing-january-updates-t78634.html

Update #1:

Microsoft has pulled the updates, thanks @Mildur 

https://www.bleepingcomputer.com/news/microsoft/microsoft-pulls-new-windows-server-updates-due-to-critical-bugs/amp/

Update #2:

Microsoft has released the updates again and it looks like they didn't fix them. (Thanks @MicoolPaul )

https://www.bleepingcomputer.com/news/microsoft/microsoft-resumes-rollout-of-january-windows-server-updates/

Update #3:

Microsoft has released an out-of-band update, which can possibly resolve the issues:

https://www.bleepingcomputer.com/news/microsoft/microsoft-releases-emergency-fixes-for-windows-server-vpn-bugs/

 

Thanks for your update again.
