Question

How to fix a VSS: Backup job failed. [Microsoft Exchange Writer]


Userlevel 2
Badge

We are consistently getting “VSS: Backup job failed. Cannot notify writers about the ‘BACKUP FINISH’ event. A VSS critical writer has failed. Writer name: [Microsoft Exchange Writer]. Class ID: [{long cryptic txt}]. Instance ID: [{long cryptic txt}]. Writer’s State: [VSS_WS_FAILED_AT_BACKUP_COMPLETE]. Error code: [0x800423f3].”

This is happening on our Exchange 2016 Mailbox servers. After the error above, when processing finishes, the job is listed as a “Warning”.

Did the job “Fail”, or is this just a Warning?

How do I fix this Error/Warning?

I will sincerely appreciate any advice I can get!


13 comments

Userlevel 7
Badge +20

It seems that Veeam could not assess the Exchange VSS Writer at the end of the job which completed with a warning as noted.

I would run vssadmin list writers on your Exchange server to see the state of the writer, but the only way to fix that writer is to either restart the Exchange service or reboot.
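If you have several servers to check, the vssadmin output can be filtered with a small script. This is only a rough sketch: it assumes the English-language layout of vssadmin list writers (a "Writer name:" line followed by "State:" and "Last error:" lines), and the function names and the SAMPLE text are made up for illustration. The live call works only on Windows from an elevated prompt.

```python
import re
import subprocess

# Illustrative sample only, not output captured from a real server.
SAMPLE = """\
Writer name: 'System Writer'
   Writer Id: {long cryptic txt}
   State: [1] Stable
   Last error: No error

Writer name: 'Microsoft Exchange Writer'
   Writer Id: {long cryptic txt}
   State: [7] Failed
   Last error: Retryable error
"""


def parse_writers(output):
    """Split `vssadmin list writers` text into one dict per writer."""
    writers = []
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        match = re.match(r"Writer name: '(.*)'", line)
        if match:
            current = {"name": match.group(1), "state": None, "last_error": None}
            writers.append(current)
        elif current is not None and line.startswith("State:"):
            current["state"] = line.split(":", 1)[1].strip()
        elif current is not None and line.startswith("Last error:"):
            current["last_error"] = line.split(":", 1)[1].strip()
    return writers


def unhealthy(writers):
    """Return writers that are not Stable or that report a last error."""
    return [
        w for w in writers
        if w["state"] is None
        or "Stable" not in w["state"]
        or w["last_error"] != "No error"
    ]


def list_writers_live():
    """Run vssadmin on the local machine (Windows, elevated prompt)."""
    result = subprocess.run(
        ["vssadmin", "list", "writers"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    for writer in unhealthy(parse_writers(SAMPLE)):
        print(writer["name"], writer["state"], writer["last_error"])
```

On a real server you would pass list_writers_live() to parse_writers instead of SAMPLE. Keep in mind that a writer can report Stable again some time after a failed job, so check as soon as possible after the warning appears.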

Userlevel 7
Badge +17

There are several threads about similar issues in the R&D forums, e.g.

https://forums.veeam.com/veeam-backup-replication-f2/warning-when-backing-up-exchange-2016-t44810.html

Please have a look to see whether any of them helps with your specific problem. Otherwise, you should open a support case with Veeam.

Userlevel 7
Badge +7

It’d be interesting to see the output of vssadmin list writers.

Userlevel 2
Badge

The listed writers all show as running.

Userlevel 2
Badge

To be more precise about the vssadmin result, I should have said: State: [1] Stable, Last error: No error

Userlevel 7
Badge +7

Are there any in error state?

Userlevel 7
Badge +12

Since you mention that you have multiple mailbox servers: are they running in a DAG? And if so, is it possible that the DAG members get backed up at the same time? That could be the cause of your VSS errors. Try backing up only one member/node and see if the error still occurs.

Userlevel 7
Badge +7


This may help too: https://www.veeam.com/blog/how-to-backup-exchange-database-availability-groups-dags-with-veeam-backup-replication.html

Userlevel 7
Badge +6

Yeah, typically this is a writer issue. There are commands you can run to emulate the steps Veeam takes when creating and committing a VSS snapshot, so that a case can be opened with Microsoft to troubleshoot. Aside from the aforementioned reading, though, a Veeam ticket is going to be the best place to start.

Userlevel 2
Badge

I split the 4 mailbox servers (2 DAGs) into 4 separate backup jobs. The VSS errors are gone, but the jobs now take much longer to complete. Is there anything inherently wrong with backing up only one of the mailbox servers in a DAG that has only 2 mailbox servers?

Userlevel 7
Badge +12

I would say it depends (love that statement since last week 😅).

If you only want to back up your Exchange data (mailboxes), then it should be enough to back up only one member of each DAG.

If you want to be able to restore each DAG member in case something happens to it, you would still have to back up all of them. That said, restoring an Exchange server needs some planning, and especially in a DAG I would probably just set up a new server.

Normally I back up all DAG servers and try to distribute them across the jobs: for example, one server at the beginning of a job and the other one at the end, or in chained jobs.
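To illustrate the chained-jobs idea, here is a minimal sketch of how the servers could be grouped so that no job contains two members of the same DAG. The DAG and server names are hypothetical, and this only plans the grouping. It does not create anything in Veeam.

```python
from itertools import zip_longest


def one_member_per_dag_per_job(dags):
    """Group servers so each job holds at most one member of every DAG.

    `dags` maps a DAG name to the list of its mailbox servers.
    Job 1 gets the first member of each DAG, job 2 the second, and so on.
    """
    jobs = []
    for members in zip_longest(*dags.values()):
        # zip_longest pads shorter DAGs with None; drop the padding.
        jobs.append([m for m in members if m is not None])
    return jobs


if __name__ == "__main__":
    # Hypothetical layout: 2 DAGs with 2 mailbox servers each.
    plan = one_member_per_dag_per_job({
        "DAG1": ["MBX1", "MBX2"],
        "DAG2": ["MBX3", "MBX4"],
    })
    for number, servers in enumerate(plan, start=1):
        print(f"Job {number}: {', '.join(servers)}")
```

With two DAGs of two servers each, this gives two jobs instead of four. If the jobs are chained so they run one after the other, members of the same DAG are never processed at the same time.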

By the way, here is the Exchange compendium from Andreas Neufert. It covers Exchange virtualization and backup in detail.

https://andyandthevms.com/exchange-dag-vmware-backups-updated-list-of-tips-and-tricks-for-veeam-backup-replication/

Userlevel 3
Badge

Does this error occur all the time or only during specific time periods? If it only happens at specific times, it would make sense to check whether the Exchange server is running any internal backup operations of its own to create a local backup copy.

And yes, the writer should be listed as active with no error on it. If it says locked, unavailable, or stopped, then some other application is probably holding a lock on it.

In that case, restart this service.

 
