Solved

"Failed to pre-process the job" // backup fails after data deletion from ReFS storage


Userlevel 2

Hello everyone,

Some backup files were manually deleted from the D:\ storage (ReFS), and
now the backup jobs don’t work.

Failed to pre-process the job Error: A data integrity checksum error occurred. Data in the file stream is corrupt.
Failed to read data from the file [D:\[...].8D2023-07-26T204743_9167.vbk].
--tr:Error code: 0x00000143
--tr:Failed to parse metadata stored in slot [528384]
--tr:Unable to load metadata snapshot. Slot: [528384].
--tr:Failed to load metastore
--tr:Failed to load metadata partition. Delayed loading: [0]
Failed to open storage for read/write access.

We tried to create a new repository (on the same storage) - it didn’t help.
The ReFS partition was formatted (yes, the backup data is lost) - it didn’t help.
The LUN on the QNAP was re-created, and we are getting the same error.
Yesterday I tried to uninstall Veeam and made a fresh install on the VM.
It didn’t help either (note: I removed the %programdata% files, etc.).

It seems some old parts of the backup chains still exist, which is surprising...
I didn’t find any solution, so maybe you could help? Version: 11.0.1.1261

// I saw a topic where someone at the end suggested damaged RAM:
https://forums.veeam.com/microsoft-hyper-v-f25/case-05295340-refs-corruption-issues-t79430.html


Best answer by Chris.Childerhose 27 July 2023, 14:17


10 comments

Userlevel 2

(repository went to RAW once again, formatted it one more time...)

Userlevel 7
Badge +20

The RAW issue is caused by a Microsoft update. If you deleted files manually from the repository, then you will need to run an active full to start a new chain. Deleting files manually is a no-no with Veeam, as it breaks backup chains. You need to use the console to do this stuff and delete retention points.
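
If you prefer to script it, something like this works on v11 via PowerShell. Treat it as a rough sketch: the module and cmdlet names are from the v11 PowerShell reference, the job name is a placeholder, and you should test on a single job first.

# Veeam B&R v11 ships a PowerShell module (older versions used a snap-in)
Import-Module Veeam.Backup.PowerShell

# Equivalent of “Delete from disk”: removes the broken chain from the config DB and the repository
Get-VBRBackup -Name "YourJobName" | Remove-VBRBackup -FromDisk -Confirm:$false

# Start an active full so the job begins a brand-new chain
Get-VBRJob -Name "YourJobName" | Start-VBRJob -FullBackup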

Userlevel 7
Badge +17

Yes, even though you performed all those troubleshooting steps, the problem lies with the repository itself, specifically the file system on the volume, not necessarily with Veeam.

As Chris stated, this is more a problem with Microsoft and their updates. There was a pretty extensive post here on this topic. There was also a reference in a forum thread about how someone was able to recover their data with the Windows Server 2019 refsutil tool. That person wrote a decent blog about it and how it helped them recover their data, using this article for assistance.
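
For anyone who finds this later, the recovery flow in that blog is based on refsutil’s salvage mode. It goes roughly like below, from memory of the Server 2019 tooling; verify the flags with refsutil salvage /? on your build, and note the paths (D: as the damaged volume, the C:\ folders) are placeholders.

# Quick automatic mode: scan the volume and copy out whatever can be salvaged
refsutil salvage -QA D: C:\refs-workdir C:\refs-recovered

# Full automatic mode: a slower, deeper scan that can find files the quick pass misses
refsutil salvage -FA D: C:\refs-workdir C:\refs-recovered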

Also, as Chris stated, manually deleting your files does indeed cause all kinds of problems and makes recoverability from the given backup job null and void. I find it odd, though, that you’re still having this issue after recreating/reformatting the volume. Personally, I’m not familiar with QNAP, but I did notice from the forum post I shared above that using LACP instead of MPIO on the volume can lead to this type of behavior as well. Did you enable MPIO on the volume in Windows? You may first need to add the ‘Feature’ in Windows and then configure it.

Userlevel 7
Badge +6

(repository went to RAW once again, formatted it one more time...)

 

Are you on a virtual machine, using ESXi as your hypervisor? There are a few Windows updates that can cause this to happen, but uninstalling the patch only gets you back from RAW to ReFS temporarily, until the patch is reinstalled down the road. The solid solution is to disable hot-add/hot-plug on that VM.

https://kb.vmware.com/s/article/1012225
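
If you don’t want to hand-edit the .vmx, PowerCLI can push the setting for you. A rough sketch: it assumes PowerCLI is installed and connected to your vCenter, the server and VM names are placeholders, and the VM needs a power cycle for the value to take effect.

# Add devices.hotplug = "false" to the repository VM's configuration
Connect-VIServer -Server 'vcenter.example.local'
Get-VM -Name 'VEEAM-REPO-01' |
    New-AdvancedSetting -Name 'devices.hotplug' -Value 'false' -Confirm:$false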

 

Userlevel 2

@Chris.Childerhose 
Correct, deleting Veeam files manually is just bad. I understand the chains became broken, but how do they still exist after ReFS (the repository) was formatted and Veeam was reinstalled? (Maybe some configuration files on the C:\ drive stayed, or the QNAP with its RAID is somehow responsible for this.)
I tried to use “Delete from disk” for the failed job and start the backup over, but it didn’t help. The other strange thing is that in the end 2 of 3 backups completed… the first one got the same error.

You need to use the console to do this stuff and delete retention points.

You meant I can do this in the GUI/app = console, right? [screen above]
Also, set up “active full” here [screen below] and start a new chain, yes?

 

@coolsport00 
Yes, the blog describes exactly what I got: “The volume repair was not successful.” The information you provided about possible recovery is really interesting and very helpful. The recreate/reformat part is also odd/strange to me; it seems like some files somehow persisted. The MPIO feature isn’t there, but we use VMware’s iSCSI, so I guess there’s no need to configure this?
 

@dloseke
Yes, I am on a VM and we use ESXi as the hypervisor. I’ve checked some VMs and their *.vmx files; there is no such line as

devices.hotplug = "false"

so I think it’s enabled by default. Please note that we are currently using 77 Veeam servers (with QNAPs). If our backups are at risk (because of this ReFS issue) we must implement this on every VM, but so far we got RAW unexpectedly in only one location.
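
If we do end up rolling this out everywhere, I figured something like the following PowerCLI sketch could at least report which VMs already have the setting. This is untested on my side and assumes a connected vCenter session; the “not set” label is just my own wording.

# List the devices.hotplug value (if present) for every VM in vCenter
Get-VM | ForEach-Object {
    $setting = Get-AdvancedSetting -Entity $_ -Name 'devices.hotplug'
    $value = if ($setting) { $setting.Value } else { 'not set (hotplug enabled by default)' }
    [pscustomobject]@{ VM = $_.Name; DevicesHotplug = $value }
}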

All of you have already given me a lot of information; the most important part was actually about the ReFS problem…

Userlevel 7
Badge +20

@superfraktal wrote:
“I understand the chains became broken, but how do they still exist after ReFS (the repository) was formatted and Veeam was reinstalled? [...] I tried to use ‘Delete from disk’ for the failed job and start the backup over, but it didn’t help.”

They probably still exist in the configuration database as entries in a table. If you use the “Delete from disk” option, it removes them both from the console and from the DB table.

Userlevel 7
Badge +17

@superfraktal yes...from the Console = the UI.

MPIO is available if you install it from Programs and Features in Windows (Control Panel). If you use iSCSI, then yes, you should indeed use this feature so that all/both paths are used.
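
From an elevated PowerShell prompt on Windows Server it is roughly the following. A sketch only, so double-check on your build; both steps typically want a reboot.

# Install the MPIO feature (same as adding it via Programs and Features)
Install-WindowsFeature -Name 'Multipath-IO' -IncludeManagementTools

# Tell the Microsoft DSM to automatically claim iSCSI paths
Enable-MSDSMAutomaticClaim -BusType 'iSCSI'

# Afterwards, verify the claimed paths with: mpclaim -s -d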

Userlevel 7
Badge +6

 

@superfraktal wrote:
“Please note that we are currently using 77 Veeam servers (with QNAPs). If our backups are at risk (because of this ReFS issue) we must implement this on every VM [...]”

This would only need to be done on the repository servers. If you have 77 repo servers, I feel for you! I would tend to recommend Linux repos with XFS now, given the issues Windows has had with ReFS at times, though with newer versions of ReFS it seems to be getting better.
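
For anyone weighing the Linux option: the main win there is XFS reflink-based fast clone, which requires the volume to be formatted with reflinks enabled, along the lines of the command below (a sketch; /dev/sdX is a placeholder, so check Veeam’s current docs for the recommended options).

# Format the repository volume with reflink support for Veeam fast clone
mkfs.xfs -b size=4096 -m reflink=1,crc=1 /dev/sdX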

Userlevel 7
Badge +6

@coolsport00 wrote:
“MPIO is available if you install it from Programs and Features in Windows (Control Panel). If you use iSCSI, then yes, you should indeed use this feature so that all/both paths are used.”

That would be the case if you’re connecting the iSCSI array to Windows directly via the iSCSI initiator. It sounds like he’s mounting to ESXi and maybe using an RDM, or mounting it as a datastore and using a VMFS volume (hopefully the RDM).

Userlevel 2

@coolsport00
Yes, I installed the MPIO feature and restarted the VM right after your post, and when I went to configure it, I found some more info regarding iSCSI/MPIO.

I asked what you meant by the “Console” because I’m getting a bit confused by this situation, and I thought there might be some more advanced option I’m not aware of, a place like a command-line interpreter/shell in Windows (these are sometimes called a console or terminal) to do such things, because I had already tried “Delete from disk” and it didn’t help.


I also tried to start a new “Active full”, but it seems some old chains/entries are still present. If that is the way to create a new chain [screen below], then it doesn’t work, and I’m still getting this error: “A data integrity checksum error occurred. Data in the file stream is corrupt. Failed to write data to the file [D:\Backup\ATVIE02 - Wien - Backup_1\[...].vbk]. Failed to backup text locally.”

(BTW, the backup folder in the repository is now a bit different: it is named Backup_1. The suffix was probably added because of the “Active full”, i.e. the new chain.)

 

I’m going to wait until my colleague is back from holiday, and then we will open a case with Veeam, as I don’t have any of these roles: License/Case Admin, Support Partner. I really hope there is a solution and that what I wrote at the very beginning (“I saw a topic where someone at the end suggested damaged RAM”) is not the case here. If/when the issue is solved, I will update this post.

Thank you guys for all your input.
