Question

Veeam Backup Service completly stuck randomly



Show first post

39 comments

Userlevel 4
Badge

Since I installed the new version 12.0.0.1420, I get this error message: “Processing xyserveryx Error: Transmission pipeline hanged, aborting process”. This message appears almost 5 hours after the process hung up… So is there at least a possibility to shorten the time until the defective job aborts and falls into retry? I cant find any setting options...

Userlevel 4
Badge

Unfortunatly I cant close this topic yet. I updated Veeam to version 12.0.0.1420 recently, because I read that the new version solves some issues. Well, it didnt solve mine… I still have the problem that Veeam backup jobs gets stuck completely randomly. The new version seems to make the problem even worse, because the stuck job doesnt fall into retry. I really optimized the whole infrastructure. So I upgraded the memory of the backup server and host to be backed up. The backup job is executed at night (1:30 am) so nobody interferes the job. Still it gets stuck… Are there any good advices out there? The Veeam support doesnt do anything and closes all cases immediatly...

Userlevel 4
Badge

In retry3 the backup job now got stuck in exactly the same completion position as in retry2 (processed 573,3 GB). Thats exactly the same processed data as in retry2. Could that mean that the data I try to backup is kind of corrupted or something like that? As I mentioned above, yesterday and the day before all backups ran smooth and successfull….

Userlevel 4
Badge

After two days of all successfull backup jobs, tonight one backup job got stuck at 20%, then at 21% in retry1 and now at 24% in retry2. All three atempts together took 12 hours so far… The job itself is an incremental backup job of a fileserver with an SQL server installed (nothing more). Data on the server have been almost unchanged from yesterday, so the throughput shows almost read only at an average 114 MB/s. Then througput speed suddenly drops from 185 MB/s to 0KB/s… I am a bit clueless...

Userlevel 4
Badge

Hi Andanet, I use a very simple and basic setup. I run Veeam on a virtual Windows 2022 Server, which is installed on an ESXi 6.5 Host. The only Software on the Server is Veeam B&R 12. This backup server is my standard gateway also. There is no duplication or deduplication.

I have no storage infrastructure on Veeam. I just have backup repositories, one proxy and jobs. I run 2 subsequent backup jobs, one subsequent backup copy job and one subsequent RDX full backup job. All jobs are scheduled one after the other.

Userlevel 7
Badge +10

For example log file for the hanging job says:

  Resource not ready: gateway server
Processing finished with errors at 20.03.2023 21:17:37

 

This would match witch my perception, that the “backup service” got completly stuck.

 

Hi @Michail are you using gateway server on a deduplication appliance? 
how many concurrent tasks have you set on storage? 

Are you using synthetic full? 

Thanks 

 

Userlevel 4
Badge

No snapshots or replications around. I have a subsequent backup copy job, but this job doesnt start until the restore point of the failed job appeared...

Userlevel 7
Badge +6

Do you have any solution doing snapshots? SRM or any replication solution? may be it fails or stuck because there is another solution is taking a snapshot in the same time.

Userlevel 4
Badge

Hello Moustafa, good to see you again :) I use transport mode “Virtual Appliance” and I added all Veaam sources to exclusions. The bottleneck on the failed job was: Source 30% > Proxy 8% > Network 10% > Target 6%. The bottleneck on the successfull retry was: Load: Source 0% > Proxy 7% > Network 0% > Target 3%  

 

   

 

Userlevel 7
Badge +6

Hello @Michail 

what transport mode do the proxy use? did you configured Antivirus exclusion list? what is the bottleneck in the job progress?

Userlevel 4
Badge

For example log file for the hanging job says:

  Resource not ready: gateway server
Processing finished with errors at 20.03.2023 21:17:37

 

This would match witch my perception, that the “backup service” got completly stuck.

 

Userlevel 4
Badge

I went through some log files but couldnt find anything unusal. The logs basically show that I tried to stop hanging jobs and then restarted the backup server… because of the tons of log information, could some1 advice me, which log file(s) is/are most important?

Userlevel 4
Badge

Well I doubt its caused by a lack of CPU power, because some days all jobs run smooth and success without retry and within an abslolut acceptable time (incremental as well as full). The failures I am facing are absolutly random. But I will collect logs and post them. Tankes in advance :)

Userlevel 7
Badge +6

Chaining is no longer considered best practice as you lose out on the parallel task capabilities.  Also, note that proxy and repository roles are capable of parallel tasks, but it’s going to be dependent on how many CPU cores you have.  I max it to the number of cores available.

Finally, for the failures, grabbing the logs and telling us what the error is when it fails would be helpful as we can’t do much without more detail on the actual failure.

Comment