Linux Backup Repository woes


Userlevel 6
Badge +1
  • Comes here often
  • 74 comments

Hi,

I’m a bit confused about the difference between the old Veeam forum and the new community discussion board. I just posted in the old forum, but since posting log file content isn’t allowed there, I’ll give it a try here too.

 

I'm setting up our first Linux-based backup repository on an Apollo 4510 server (Veeam 10). The jobs are now failing with different kinds of errors. Before I open a case, I wanted to know whether these are known issues with an easy workaround. It all seems to be load-related, as it happens only when multiple backups are running. But there is not much CPU load and there are no errors in the usual Linux logs.

#1 At this time the backup jobs were writing active fulls at 1.5 GB/s over LAN to the server, and the CPU was 90% idle (52 cores). Nothing interesting in /var/log/messages.

[15.05.2021 15:19:27] <326> Error Failed to upload file D:\Veeam\Backup\VeeamAgent64 to /tmp/VeeamAgent0bc9a8bd-ebd8-44b8-a373-44510aefd89f
[15.05.2021 15:19:27] <326> Error Failed to find terminal prompt: timeout occurred (60 sec) (System.Exception)
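To line such errors up against job load, the log lines can be parsed into timestamp, thread, level, and message. This is only a minimal sketch, assuming the `[DD.MM.YYYY HH:MM:SS] <thread> Level message` layout of the two lines above; `parse_line` is a hypothetical helper, not part of any Veeam tooling.

```python
import re
from datetime import datetime

# Assumed layout, taken from the two log lines quoted above:
# [15.05.2021 15:19:27] <326> Error Failed to ...
LINE_RE = re.compile(
    r"^\[(\d{2}\.\d{2}\.\d{4} \d{2}:\d{2}:\d{2})\] <(\d+)> (\w+)\s+(.*)$"
)

def parse_line(line):
    """Return a dict for a matching log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    stamp, thread, level, message = match.groups()
    return {
        "time": datetime.strptime(stamp, "%d.%m.%Y %H:%M:%S"),
        "thread": int(thread),
        "level": level,
        "message": message,
    }
```

Filtering the parsed entries on `level == "Error"` and grouping them by minute would show whether the failures cluster around the times when many tasks run in parallel.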

#2 Sometimes the extents are just gone. I don't see any warning in Linux; from the Linux server's point of view the device was present the whole time.

backup: 15.05.2021 16:40:19 :: Error: DE-WOP-B01-E01-Test extent is offline.
copy: 15.05.2021 16:29:11 :: Error: Some extents storing required backup files are offline


#3 There also seems to be a password problem sometimes (sudo), but as this is the same server and the same job, just another task, it can't be a general permissions problem.

15.05.2021 15:37:34 :: Error: Permission denied (password).

#4 Connection attempts sometimes fail

12.05.2021 16:50:37 :: Processing SDET2509 Error: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond

#5 /tmp and the Veeam user's home directory are not cleaned up

4-7 GB are still in each of those directories. /tmp was only a 4 GB partition at the beginning, but I had to expand it to 14 GB to finish a job without errors. Shouldn't Veeam clean up its own mess?
 

-rwxrwxrwx. 1 root root 62964808 May 12 16:45 VeeamAgent655b0e07-2e34-4899-8087-a76ed7a69971
.....

 

-rwxrwxrwx. 1 xxxxxx xxxxx 62964808 May 12 16:51 79bfc3e6-a1bb-4f44-881c-1625b4f7509b
...
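Leftovers like these can at least be listed before deciding what to remove. The following is a rough sketch, assuming the `VeeamAgent*` naming seen in /tmp above and an arbitrary age threshold of 24 hours; `find_stale` is a hypothetical helper. It only reports files and deletes nothing.

```python
import time
from pathlib import Path

def find_stale(directory, pattern="VeeamAgent*", min_age_hours=24, now=None):
    """List (path, size_in_bytes) for matching files older than min_age_hours."""
    now = time.time() if now is None else now
    cutoff = now - min_age_hours * 3600
    stale = []
    for path in Path(directory).glob(pattern):
        if path.is_file():
            info = path.stat()
            # Compare the last modification time with the cutoff.
            if info.st_mtime < cutoff:
                stale.append((path, info.st_size))
    return sorted(stale)
```

Calling `find_stale("/tmp")` returns the old agent binaries with their sizes. The files in the home directory have no common prefix in the listing above, so a different pattern would be needed there.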



11 comments

Userlevel 6
Badge +1

I’ve found some KBs and forum threads with more information regarding Linux repositories. I also reduced the concurrent tasks to 26 for each of the 2 volumes, as this looks more realistic.

 

Userlevel 6
Badge +1

Well, I know that there is Veeam support, and I’ve created a lot of cases there. But sometimes I get better feedback and hints in the forum than from support, and usually quicker. Maybe I should post those questions on Reddit then, where it’s OK to post technical problems.

 

Anyhow, I’ll create a ticket (probably several, as these are different issues) on Monday. I was a bit naive to believe Linux repositories would be less trouble than SMB shares ;)

Userlevel 6
Badge +1

We have workloads in both DCs. It’s a stretched setup with mirrored IBM SVC volumes. We back up some parts in one DC and others in the second one. Then we copy the jobs from one side to the other, so that we always have either the backup or the copy of a VM available if a DC is down.

 

We have one VBR server in a mgmt cluster, and we are thinking about a cold standby VBR VM where we can import the config backup if needed.

 

 

 

Userlevel 7
Badge +12

The R&D forums are not intended for free technical support when you face issues. Veeam has a support department for that. Because of that, uploading log files to the R&D forum is not allowed.

You can ask for technical advice or talk directly to Veeam's product managers there. And of course, if you want to post a technical issue, you can do that. But you have to post your case number so that the product managers have a reference for the issue.

 

————-

Back to your problem.

I will soon get some HPE Apollo appliances. If I face the same issues, I can report back to you. I don’t see any problems at the moment with a small Linux Hardened Repo server (Ubuntu 20.04, 40 TB), but I am not sure about the tmp partition. It’s possible that it is on the same partition as the root disk, not a dedicated partition.

 

 

Userlevel 7
Badge +12

Linux works fine, but I have learned that it needs more care than a simple SMB share to work 100% flawlessly :)

 

The community pages here are a good place for these technical issues. You should be able to upload logs here. Someone will have an answer or the experience to comment on your issues :)

Userlevel 6
Badge +1

Is there documentation about Linux-specific settings, other than https://www.veeam.com/kb2216?

Userlevel 7
Badge +12

We have dedicated teams for managing the Linux servers in my company. They install and configure the Linux OS and hardware for me.
I only configure the part in the KB you mentioned and the part where the XFS filesystem is created for cfs with reflink support.
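For the XFS part, the format options usually quoted for reflink support are `-b size=4096 -m reflink=1,crc=1`. The sketch below only builds that command line and does not run it; the device path is a placeholder and `mkfs_xfs_command` is a hypothetical helper, so check the options against the current Veeam and `mkfs.xfs` documentation before formatting anything.

```python
def mkfs_xfs_command(device, block_size=4096):
    """Build (but do not execute) an mkfs.xfs command with reflink enabled."""
    if not device.startswith("/dev/"):
        raise ValueError("expected a block device path such as /dev/sdX")
    return [
        "mkfs.xfs",
        "-b", f"size={block_size}",   # 4 KiB block size
        "-m", "reflink=1,crc=1",      # reflink requires CRC-enabled metadata
        device,
    ]

# Placeholder device; printing the command keeps the example side-effect free.
print(" ".join(mkfs_xfs_command("/dev/sdX")))
```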

I'm glad I don't have to do everything myself under Linux.

 

Userlevel 6
Badge +1

It’s the same here. But there’s not much in this KB. There are other KBs for SSH settings etc.; it would be nice to have everything in one place.

Userlevel 6
Badge +1

If you have backup and copy jobs between 2 datacenters with, let's say, 3 servers at each DC, and each server has 2 RAID volumes, would you use dedicated servers for backup and copy jobs, or would you use one volume in each server for backup and the other for copy?

Userlevel 7
Badge +12

I see two scenarios:

 

3 Servers in DC1 —> Backup Jobs

3 Servers in DC2 —> Backup Copy Jobs

but only if all your workload is running in DC1. Consider a SOBR that includes the RAID volumes.

 

—————-

 

If you have workloads in both datacenters, consider a mix of backup jobs and backup copy jobs on each server.

Backup job from workload in DC1 on RAID volumes in DC1, copy job to RAID volumes in DC2

Backup job from workload in DC2 on RAID volumes in DC2, copy job to RAID volumes in DC1
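The mixed scenario boils down to a simple rule: back up locally, copy to the other side. A small sketch of that rule, using the DC1/DC2 labels from above; `plan_jobs` and the dictionary keys are hypothetical names chosen for illustration.

```python
def plan_jobs(datacenters):
    """For each DC, back up to local volumes and copy to the other DC."""
    if len(set(datacenters)) != 2:
        raise ValueError("this sketch assumes exactly two distinct datacenters")
    plan = []
    for local in datacenters:
        # The copy target is simply the datacenter that is not the local one.
        remote = next(dc for dc in datacenters if dc != local)
        plan.append(
            {"workload": local, "backup_target": local, "copy_target": remote}
        )
    return plan

for job in plan_jobs(["DC1", "DC2"]):
    print(job)
```

With this layout each datacenter holds either the backup or the copy of every VM, so one site can fail without losing all restore points.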

Userlevel 7
Badge +17

But you don’t have a dedicated VBR server in this scenario…

Ok, in such a small environment this is perhaps a bit oversized. :sunglasses:
