Question

Copyjob Best Practices: what are yours?


Userlevel 6
Badge +3

Hi, all

 

I would like to pick your brains and get some insight into your best practices for backup copy jobs.

 

Here’s the situation I am in:

I have backup servers with fairly complex backup schedules that range from hourly to daily and even weekly backups.

 

Lately I have been running into copy job failures that say “the restore point is locked by another job”. This happens when health checks, merges, or defrags run on the primary backup files. I created a topic about this last week, and @Chris.Childerhose gave me some good insight. I guess I am at take 2 at this point, and I need you all’s input.

 

The problem is, I have no idea how to schedule the copy jobs around these schedules. There just isn’t really a way, and consequently I don’t know how to avoid these failures. They used to not happen; it’s only recently that they have started cropping up. I am also working with support on this, but I would love you all’s input and some insight into how you all tackle primary and copy backups and their scheduling.


16 comments

Userlevel 7
Badge +5

Typically, depending on your organization, backup jobs tend to run overnight, and backup copy jobs can then run during the day, if internet speeds permit.

When you get into overlapping backups and then add copy jobs on top of that, it can become a challenge.

You can check the Backup Copy Job (General) section of the Veeam Backup & Replication Best Practice Guide and possibly investigate copy seeding, as that would leave the copy jobs with much smaller increments to send.

Userlevel 7
Badge +4

You are getting warnings, aren't you?

The copy job will continue to work when the primary job is finished.

Unfortunately, I have not found a way to prevent the warnings up to now. But perhaps we are going to learn something new in this thread 😎

Userlevel 7
Badge +5

Another thing that could help with the copy jobs is Immediate mode, so that the copy job starts right away when the primary job finishes.

Userlevel 7
Badge +4

Yes, but the warnings remain….

Userlevel 7
Badge +5

True, but only if the primary job kicks off again. I would not be concerned about warnings as long as my jobs complete. :smiley:

Userlevel 7
Badge +4

Yeah, this is what I said above: the copy job will finish successfully.

The warnings are not ideal, but they are not a real problem…

Userlevel 6
Badge +3

In my case they are not just warnings: the copy jobs are failing outright when it happens. They do start over on their own at some point. I am not sure whether that is once the next restore point becomes available or whether it is a retry.

Userlevel 6
Badge +3

Immediate mode is what I am currently using. I have it set to immediate with 24-hour scheduling, and that’s across the board so far.

Userlevel 7
Badge +4

Mhh, the copy jobs are not failing in my environments…

Then you should consider a support case, I think.

Userlevel 6
Badge +3

Already ahead of you :grinning:

 

Working with a good support agent. I wanted to rope you all in as well to get some more insight and maybe see if you all are doing something different that I am not. By chance, did you have to configure anything in the registry to tell Veeam to throw warnings instead of failures, in those instances?

Userlevel 7
Badge +5

Given the number of jobs and the scheduling, maybe it is time to spin up another Veeam server and move some of the load over? That might be another way to get around this.

Userlevel 6
Badge +3

That may be the case.

 

Now that you mention that, on two of the servers this is happening on, there aren’t a large number of VMs or jobs that run. One server has 1 job with 3 VMs; one of the others has 2 jobs, one with 3 VMs, the other with 2 VMs.

 

It is really, really bizarre and just started happening very recently. We’ve narrowed it down to the 3 task types that are causing the issue (merge, health check, defrag), but we haven’t yet found an answer as to why it is happening and why it has started suddenly. I will for certain let you all know what solution I find when it is found.

I really appreciate all of you all’s input and help! I keep wanting to say y’all...the Southerner in me :cowboy:

Userlevel 7
Badge +4

No, I haven't set any registry keys...

Userlevel 5
Badge

Hey @bp4JC, one thing I always do that I haven’t seen mentioned above is to adjust the bandwidth throttle and leave the jobs in continuous mode. If you aren’t already letting them run all day, every day, you can set the jobs to continuous mode and then go into your bandwidth throttle settings: let the jobs run at full speed overnight, and set a window covering your operating hours in which the bandwidth is choked down to a level you are comfortable with. This gives the backup copy a larger window, so it will hopefully complete.
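
The throttling idea can be sketched in a few lines of Python. This is only an illustration of the logic, not how Veeam is configured (that is done in the console), and every value in it is an assumption: the operating hours, the 20 Mbps cap, and the choice to leave weekends unthrottled are placeholders to adapt.

```python
from datetime import datetime, time

# Hypothetical values, chosen only for illustration.
BUSINESS_START = time(8, 0)   # start of operating hours
BUSINESS_END = time(18, 0)    # end of operating hours
THROTTLED_MBPS = 20           # cap applied while the office is busy


def bandwidth_limit(now: datetime):
    """Return the cap in Mbps that applies at `now`, or None for full speed."""
    is_weekday = now.weekday() < 5  # Monday=0 ... Friday=4 (assumption: weekends are unthrottled)
    in_hours = BUSINESS_START <= now.time() < BUSINESS_END
    return THROTTLED_MBPS if (is_weekday and in_hours) else None


# A weekday morning is throttled; the same night is not.
print(bandwidth_limit(datetime(2021, 10, 13, 10, 0)))  # 20
print(bandwidth_limit(datetime(2021, 10, 13, 23, 0)))  # None
```

The copy job itself stays in continuous mode the whole time; only the cap changes with the clock.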

Regarding seeing this when the integrity checks run, there are a couple of things to try. If your type of business can stomach it, set the integrity checks to start right after the end of your job run on Friday night/Saturday morning. Then either turn off your weekend backup copies or, maybe in your case, go in on the Friday morning of the day the integrity check is due, disable the runs on Saturday and Sunday, and change the job settings to let it run for 3 days without a timeout.
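
To sanity-check a plan like this on paper, a small Python sketch can test whether a weekly maintenance window collides with the window in which copies run. The window times and durations below are made-up examples, and `week_minute` and `overlaps` are hypothetical helpers, not part of any Veeam tooling.

```python
WEEK_MINUTES = 7 * 24 * 60


def week_minute(day: int, hour: int, minute: int = 0) -> int:
    """Minutes since Monday 00:00 (day 0 = Monday ... 6 = Sunday)."""
    return (day * 24 + hour) * 60 + minute


def overlaps(a, b) -> bool:
    """True if two weekly windows, each (start_minute, duration_minutes), intersect.

    Windows may wrap past Sunday midnight; durations are assumed to be
    positive and shorter than one week.
    """
    a_start, a_len = a
    b_start, b_len = b
    # Position of b's start relative to a's start, within the weekly cycle.
    offset = (b_start - a_start) % WEEK_MINUTES
    # Either b starts inside a, or b runs past the end of the cycle into a.
    return offset < a_len or offset + b_len > WEEK_MINUTES


# Example: copies run Monday 00:00 through Friday 24:00 (weekend copies off).
copy_window = (week_minute(0, 0), 5 * 24 * 60)
# A 6-hour health check starting Saturday 02:00 stays clear of the copies...
saturday_check = (week_minute(5, 2), 6 * 60)
# ...while the same check started Friday 22:00 would collide with them.
friday_check = (week_minute(4, 22), 6 * 60)

print(overlaps(copy_window, saturday_check))  # False
print(overlaps(copy_window, friday_check))    # True
```

How long a health check, merge, or defrag actually takes depends on the backup size and repository, so the durations would need to come from your own job history.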

Finally, I was reading the 11a release notes today, and one of the improvements there concerns these checks and their performance. It may be worth testing.

Userlevel 7
Badge +5

Good point here to check out 11a based on the improvements made.

Userlevel 6
Badge +3

Thanks so much for your input on this! I really appreciate it.

 

I am going to look into some settings today. I did not know that I could get granular with my integrity checks and actually schedule the day and time, so I am going to look into that.

 

I do currently have the copy jobs on continuous, and I have been wondering if that is the issue. I am starting to think this may be something we just have to contend with. I will definitely look into 11a. Does anyone know whether, if we use Cloud Connect, the Cloud Connect server has to be on 11a before the backup servers can be upgraded?
