Solved

Running a backup with very little space left renders QNAP NAS unresponsive


Userlevel 2

Hi,

Has anybody encountered the following odd situation?

 

Situation (a): QNAP NAS full/over quota. When running the Veeam backup, it fails as expected reporting that the target directory is full.

Situation (b): QNAP NAS with very little space (32MB left of 2TB). When running the Veeam backup, it hoses the NAS ....

 

In the second situation, the QNAP NAS can no longer be accessed. Any already attached SSH session becomes ‘stuck’ as soon as it tries to access the disk. Also, the number of smb processes on the QNAP NAS keeps increasing over time.

 

I have a Windows 10 machine with Veeam 11.0.1.1261 running backups remotely to an SMB share on a QNAP with QTS 5.0.1.2194.

 

Is there anything that I can do to prevent this from happening (i.e. actually get an error from Veeam in the second situation rather than an unresponsive NAS)?

Thanks.


Best answer by MicoolPaul 20 December 2022, 22:47


17 comments

Userlevel 7
Badge +20

I would put a quota on the SMB share you are mapping to the VBR server, so that when the quota is hit, backups just fail instead of trying to fill the NAS. I think this is the best way to handle it, since Veeam will use whatever space it sees when backing up.

Alternatively, if the SMB share were part of a SOBR, you could offload data to a Capacity Tier (S3), which would keep the main repo space down as well.

Userlevel 7
Badge +12

> Situation (b): QNAP NAS with very little space (32MB left of 2TB). When running the Veeam backup, it hoses the NAS ....


As already commented in the forum, only 32MB of free space cannot work. Please make sure that you have at least 5-10% unused storage left.
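To illustrate the 5-10% rule, here is a minimal sketch of a free-space check that could run before a backup job. It is not a Veeam feature: the function names and the 10% default are hypothetical, and it assumes the share reports quota-aware free space to the client.

```python
import shutil


def free_fraction(total_bytes: int, free_bytes: int) -> float:
    """Return the fraction of the volume that is still free (0.0 to 1.0)."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return free_bytes / total_bytes


def enough_space(path: str, min_free_fraction: float = 0.10) -> bool:
    """Check whether `path` has at least `min_free_fraction` of its space free.

    `path` can be a local directory or a UNC path to the share.
    The 10% default is an assumption taken from the 5-10% advice above.
    """
    usage = shutil.disk_usage(path)
    return free_fraction(usage.total, usage.free) >= min_free_fraction
```

With the numbers from this thread, `free_fraction(2 * 10**12, 32 * 10**6)` is far below 0.05, so a check like this would refuse to start the job. How to wire it into a job (for example as a pre-job script) depends on your setup.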


Thanks

Fabian

 

Userlevel 2

> I would think to put a quota on the SMB share you are mapping to the VBR server this way when it hits that backups just fail instead of trying to fill the NAS. 

That is already the case. It is only the quota that is reached (there are at least 2 more TB for the QNAP OS to use).

Userlevel 7
Badge +6

I’ve got nothing on this other than recommending a support case with QNAP.  Strange that filling the quota would take it offline.  This sort of action reminds me of an old Lenovo/EMC/Iomega NAS that I had that would take itself offline and become inaccessible if I pushed too much IO to it.

Userlevel 2

My main concern is whether this is an issue with the NAS or with Veeam. Either way, the failure mode is very bad: tracking down what the problem was (i.e. Veeam filling up its quota and killing the NAS in the process) was very hard, as the symptom was an unresponsive NAS with no clear connection to the backups. I would have expected the same failure mode in the 2 situations I described (simply a report that the backup failed).

Userlevel 7
Badge +6

Yes, the backup failing would be typical. Let me ask this: when you look at the repository under Backup Infrastructure, does it show low free space there? Any warnings about low space in the backup job logs? If space gets too low, the backup job should fail, as Veeam warns about space being too low. Also, how is the NAS connected to the repo server? SMB? NFS? iSCSI?

Userlevel 2

> I’ve got nothing on this other than recommending a support case with QNAP.

I already did, and we spent more than 6 weeks troubleshooting unrelated problems until they asked me to reset the NAS. While I was recording the configuration, I noticed the quota for Veeam was reached. After that, a couple of simple tests showed that the backups were the trigger. When I reported this to them, they pointed me in the direction of Veeam … Since I am new to Veeam, I am also trying to make sure I am not missing anything obvious in my (trivial) Veeam configuration (at the moment a single PC with a simple volume-level backup of the OS disk in workstation/managed-by-agent mode). The destination is connected as a “Veeam backup repository”.

Userlevel 7
Badge +20

Your problem here is definitely a QNAP bug. The NAS uses a software RAID, which unfortunately can be a bit flaky.

 

I’ve seen this behaviour with Linux software RAIDs before exactly as you described. The IO stream effectively hangs.

 

Interestingly your QNAP version included changes to quota handling so you’re likely dealing with a bug caused by these changes. You’ll need to file a bug report like @dloseke suggested.

Userlevel 7
Badge +20

Here is a simple test to see if Veeam is overwhelming the NAS: when this happens, disconnect your gateway server from the network. Does service resume?

 

If the quota is reached, the QNAP should just return an error code, and Veeam would then terminate its processing attempt. If no error is returned, Veeam will continue to attempt to write. But the behaviour you describe is the QNAP not knowing how to deal with this scenario and locking its IO stream or something similar.
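As a rough illustration of that expected behaviour (this is not Veeam's code, and `write_blocks` is a hypothetical helper), a well-behaved target returns a "no space" or "quota exceeded" error and the writer stops cleanly:

```python
import errno

# EDQUOT is not defined on every platform, so fall back to ENOSPC alone.
FULL_ERRNOS = {errno.ENOSPC, getattr(errno, "EDQUOT", errno.ENOSPC)}


def write_blocks(out, blocks):
    """Write blocks to `out`, stopping at the first disk-full or quota error.

    Returns (bytes_written, error_message_or_None). Any other OSError
    is re-raised so that unrelated failures are not hidden.
    """
    written = 0
    for block in blocks:
        try:
            out.write(block)
        except OSError as exc:
            if exc.errno in FULL_ERRNOS:
                return written, f"target is full: {exc.strerror}"
            raise
        written += len(block)
    return written, None
```

The failure described in this thread is different: the write never returns at all, so no error handling on the client side can catch it until a timeout fires.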

Userlevel 7
Badge +6

One thing you can check, if you haven’t already, is that the firmware on the NAS is up to date. Aside from possible bug fixes, NASes (especially QNAP, it seems) are often targeted by malware, because they are often used as backup repositories and typically have their fair share of bugs to exploit.

Userlevel 2

I do not have the original error reports. Today I was able to reproduce the error mode (low space left in the quota, running the backup, hosing the NAS) and the error is:

2:39:37 PM    Preparing for backup     00:05
2:39:43 PM    Required backup infrastructure resources have been assigned     00:00
2:39:50 PM    Creating VSS snapshot     01:06
2:40:59 PM    Calculating digests     00:07
2:41:06 PM    Getting list of local users     00:00
2:41:13 PM    Recovery partition (disk 2) (529.0 MB) 284.0 MB read at 670 KB/s    07:14
2:48:45 PM    Finalizing     01:01
2:49:47 PM    Error: The semaphore timeout period has expired. Unable to reset file size: trunk operation has failed. File: [\\Trouville\WindowsBackups\VeeamAgentUserbb213aee-8f1f-486f-5dcb-3c7c3f508075\Agent_Backup_Policy_1_-_Honfleur\Agent Back2022-12-20T143937.vbk]. Failed to backup text locally. Backup: [VBK: 'veeamfs:0:6745a759-2205-4cd2-b172-8ec8f7e60ef8 (bb213aee-8f1f-486f-5dcb-3c7c3f508075)\summary.xml@RelativePath://Agent_Backup_Policy_1_-_Honfleur\Agent Back2022-12-20T143937.vbk']. Agent failed to process me     
2:52:42 PM    Unable to perform threshold check for repository "Trouville": failed to query backup repository disk space     
2:52:42 PM    Error: Failed to call RPC function 'FcIsExists': The network path was not found.     
2:52:44 PM    Processing finished with errors at 12/20/2022 2:52:44 PM     

 

The NAS was rebooted around 2:48/49.


The backup repository view seems to report the correct information (e.g. 1.6 MB Free Space right now).
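To line up the error times with events on the NAS (such as the reboot), a small parser for job log lines like the ones pasted above might help. This is only a sketch: the line layout (time, message, optional trailing duration) is assumed from the excerpt, and `parse_line` and `find_errors` are hypothetical names.

```python
import re
from datetime import datetime

# Assumed layout: "<h:mm:ss AM/PM>  <message>  [<mm:ss duration>]"
LINE_RE = re.compile(
    r"^(\d{1,2}:\d{2}:\d{2} [AP]M)\s+(.*?)(?:\s+(\d{2}:\d{2}))?\s*$"
)


def parse_line(line):
    """Return (time, message, duration_or_None), or None if the line does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    when = datetime.strptime(match.group(1), "%I:%M:%S %p").time()
    return when, match.group(2), match.group(3)


def find_errors(lines):
    """Return (time, message) for every line whose message starts with 'Error:'."""
    found = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None and parsed[1].startswith("Error:"):
            found.append((parsed[0], parsed[1]))
    return found
```

Run on the log above, this would pick out the two `Error:` entries at 2:49:47 PM and 2:52:42 PM, which can then be compared with the time of the NAS reboot.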

 

Userlevel 2

> One thing you can check if you haven’t already is to make sure that your firmware is up to date on the NAS. 

It is :(

Userlevel 7
Badge +20

There are minor newer builds available for the QNAP that you could try; I don’t know of any supported downgrade branches.

 

But a semaphore timeout indicates a lack of response from the device. Veeam can’t be causing this, because Veeam simply stops hearing from the QNAP.

 

I know you don’t want to hear this, but it’s a QNAP problem. Veeam is only sending packets.

Userlevel 2

> Simple test to see if Veeam is overwhelming the NAS. When this happens, disconnect your gateway server from the network, does service resume?

Stopping the backup, disconnecting the Windows machine from the network or even rebooting that machine has no effect on the NAS (technically the NAS is hosed at least in part due to a smb process on the NAS that is unkillable)

Userlevel 2

> I know you don’t want to hear this but it’s a QNAP problem.

Well, it is good to hear actually :). I would not mind also hearing whether any other QNAP + Veeam users are able to reproduce the problem and/or have seen it before (i.e. I can use all the ammunition I can get to convince them to actually look at the underlying issue :) ).

 

 

> But a semiphore timeout is stating a lack of response from the device.

I did not measure precisely enough but the timing of the timeout is in the area of the time I rebooted the NAS.

Userlevel 2

One more piece of information. That same NAS also has a single disk mounted by itself (not part of the RAID). I set up the backup to write to this disk. At the start of the backup the user still had 2GB of quota left out of 2TB. After a while the backup filled up the quota and:
(1) hosed the disk on the NAS side (but not the whole NAS, I assume because it is not the system disk)
(2) the Veeam backup failed with following messages:

 

4:01:23 PM    Getting list of local users     00:00
4:01:30 PM    Recovery partition (disk 2) (529.0 MB) Task failed unexpectedly    06:26
4:07:56 PM    Task failed unexpectedly     
4:07:57 PM    Network traffic verification detected no corrupted blocks     
4:07:57 PM    Processing finished with errors at 12/20/2022 4:07:57 PM     

 

Not much information there, except that it does not seem related to the RAID (or is it?) and that this happens even if the initial free space is not small (albeit still not enough to complete).

 

 

Userlevel 7
Badge +13

It’s a known QNAP bug that I’ve encountered too: when it’s full, it doesn’t respond on the eth interface. You need to hard reboot and free some space.
