Veeam is not using all available tape drives in library; why?
I have a tape library with 4 drives. Veeam seems to be only using 2 of them. For example, right now I see all 4 drives - 1 is writing to 1 tape job, a 2nd is writing to another tape job, the other 2 drives are sitting, showing “idle”.
Yet I have 5 more tape backup jobs sitting in queue, all showing status “0% completed”. Why isn’t Veeam writing some of those jobs to those 2 idle drives?
I do have a setting on each tape job to limit it to no more than 2 drives, because I don’t want any 1 job to monopolize all tape drives. But Veeam seems to think that means to use only 2 drives *total*, not per-job.
Have I missed some setting? Or set something wrong? I have a lot of data to write to tape every night, and more to come. In fact, I am waiting on 2 more tape drives, for a total of 6. But Veeam isn’t utilizing all the drives in the library.
I did ask Tech Support, while I was troubleshooting another tape issue, but I never got a response to this question.
Anyone? I feel it has to be something wrong that I have done. But I can’t see what …
If the tape jobs run after a main backup job, that backup job could still be running. I know that with tape it sometimes takes a while for the drives to show as being used. But if files are locked by a backup job, that will delay the tape jobs from starting.
They are not locked. We back up to a Data Domain device (basically, a NAS). Then we clone from there to the tape library (“backup to tape” jobs; clone is the term I’m used to, from using EMC Networker for years). All jobs that write to the DD are done (have been done, for hours). The only running jobs are jobs writing to tape, and all of them write from (finished) backup jobs.
Ok that makes sense. Not sure why they are not starting but might check the advanced settings to see if you have the process latest restore point only option selected. Makes things faster.
All tape backup jobs have the “Process latest full backup chain only” set.
If I click on the job, and look at the job progress, I see each job says “waiting for tape infrastructure availability”. Yet there are tape drives available.
The tape server is a Dell PowerEdge R740. The tape library is a Quantum Scalar i3 (LTO7 drives).
There are 4 drives, connected by SAS directly to the tape server.
I see no errors on the tape server event logs, the utilization there is like 10% CPU, less than that on memory. I see no errors or warnings on the B&R screens …
It’s just ignoring half my tape drives ….
Does Veeam see all tape drives? I know dumb question but with partitions on tape libraries this could cause issues. More than likely not the cause but maybe contacting support will be a better option now.
Yes, Veeam shows 4 drives under the library. The library entry shows 94 slots, 4 drives (which is correct).
I have empty tapes in the FREE pool. I even tried moving the FREE tapes directly to the pool of the tape jobs, rather than relying on Veeam to do that. No difference. (not that it needed it, I have multiple tapes in the pool with more than 3TB free on each tape, so there are tapes in the pool available).
Looks to me like everything is available, but apparently Veeam doesn’t agree …
How many Jobs do you have, and how many VM’s are in each job?
If you have 6 drives, 1 job, and 6 VM’s in that job, it will use 6 drives, provided you enable parallel processing on the Media Pool and let the pool use all 6 drives.
If Parallel processing isn’t enabled, it will only use 1 drive in sequential order.
2 jobs, without parallel processing would only use 2 drives even if they have multiple VM’s.
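To make the three cases above concrete, here is a rough Python sketch. It is only a toy model of the rule as described in this post, not Veeam's actual scheduler, and the function name `drives_in_use` is made up for illustration. It assumes the media pool is allowed to use every drive in the library.

```python
def drives_in_use(library_drives, vms_per_job, parallel):
    """Toy model: how many drives end up busy at once.

    library_drives: number of drives the media pool may use (assumed: all of them)
    vms_per_job:    list with the VM count of each running tape job
    parallel:       True if parallel processing is enabled on the media pool
    """
    if parallel:
        # Each VM in each job can stream to its own drive.
        wanted = sum(vms_per_job)
    else:
        # Each job processes its VMs one after another on a single drive.
        wanted = len(vms_per_job)
    # You can never use more drives than exist.
    return min(wanted, library_drives)


print(drives_in_use(6, [6], parallel=True))      # 1 job, 6 VMs, parallel on
print(drives_in_use(6, [6], parallel=False))     # same job, parallel off
print(drives_in_use(6, [3, 4], parallel=False))  # 2 jobs, parallel off
```

The three calls correspond to the three statements above: 6 drives, 1 drive, and 2 drives.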
Is there a reason to have so many separate jobs vs adding multiple VM’s to fewer jobs?
Also, on the Tape Job settings, you can limit a job to use fewer drives.
This is handy if you want to give the media pool access to use all 4 drives in your case, but limit each job to only use 2 or 3 drives at a time. The thing is, Tape jobs will queue up and run, so if you want to get the most data onto tape as fast as possible, you want the drives streaming as much as they can.
If there is a lock, often Veeam will have to wait for that server to finish its backup to use the latest restore point. If you back up weekly to tape, selecting “Process the most recent restore point instead of waiting” in the Tape Job settings allows you to grab the one from the day before, if that meets your business requirements. (I find it pretty handy if we need a weekly, but they aren’t super sticky on if it’s from a day prior)
Right now I have several at 0%, but that is because all my other available drives are streaming data to the tapes. The last “Action” is backup to media set started at X. So it does sound like your media pools may not be set to use all the drives.
Without seeing your Tape Job and Media pool settings it’s tough to say.
I see each job says “waiting for tape infrastructure availability”.
- This tells me that the Tape jobs do not see the other drives.
If the media pool used isn’t set to use more drives, there won’t be any available.
This could also be a driver or config issue if Veeam doesn’t see the drives, but if you can see them in Veeam that should be fine.
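Here is a small Python sketch of how a drive limit on the media pool can leave drives idle even though each job has its own limit. This is a simplified model built from the numbers in this thread (4 drives, 7 tape jobs, a per-job limit of 2), not Veeam's real allocation logic. The function name and the assumption that each job needs one stream are hypothetical.

```python
def allocate_drives(library_drives, pool_limit, job_limit, job_streams):
    """Toy model: hand out drives to queued tape jobs in order.

    library_drives: drives physically in the library
    pool_limit:     max drives the media pool may use at once
    job_limit:      max drives a single tape job may use
    job_streams:    how many drives each job could use if unrestricted
    Returns the number of drives granted to each job (0 = still waiting).
    """
    # The pool limit caps the total across ALL jobs using that pool.
    available = min(library_drives, pool_limit)
    allocation = []
    for streams in job_streams:
        grant = min(streams, job_limit, available)
        allocation.append(grant)
        available -= grant
    return allocation


jobs = [1] * 7  # assumption: 7 jobs, each wanting a single drive

pool_at_2 = allocate_drives(4, pool_limit=2, job_limit=2, job_streams=jobs)
pool_at_4 = allocate_drives(4, pool_limit=4, job_limit=2, job_streams=jobs)

print(pool_at_2, "idle drives:", 4 - sum(pool_at_2))
print(pool_at_4, "idle drives:", 4 - sum(pool_at_4))
```

With the pool limited to 2, two jobs run, five wait, and two drives stay idle. With the pool allowed 4 drives, all four drives are busy.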
Now, to add another possibility.
How are your proxies, repos, and Tape servers configured? How many CPU slots and tasks set for each?
If you are using your Proxy, Repo and Tape server all on the same box, each job is going to use multiple CPU slots. Unless you have given each role plenty of headroom, you could actually be waiting on the proxy or repo to free up a CPU slot or task before moving on to the next VM.
This could also happen if your servers are separate, but your proxies or repos are busy with no available slots.
FYI, don’t use my settings. Under Veeam Best practices you can read about how many concurrent tasks to allow per CPU thread. Or just make sure the question mark is green.
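If you want to sanity-check the slot idea, here is a minimal Python sketch. The component names and task numbers are invented for illustration and are not read from Veeam. It only shows the reasoning: a job has to wait as soon as any one component it needs has no free task slot.

```python
def blocked_components(components):
    """Return the names of components with no free task slot.

    components: dict mapping a name to (max_concurrent_tasks, tasks_in_use).
    All values here are hypothetical; read the real ones from your own setup.
    """
    return [
        name
        for name, (limit, in_use) in components.items()
        if in_use >= limit
    ]


# Hypothetical snapshot of a busy night.
snapshot = {
    "proxy": (6, 6),        # every slot taken
    "repository": (16, 4),  # plenty of room
    "tape server": (4, 2),  # two drives still free
}

print(blocked_components(snapshot))
```

In this made-up snapshot, the proxy is the only bottleneck, so a job could sit waiting even though the tape server still has free drives.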
How many jobs total, or how many running? When I posted, I had 7 running. Now, it’s down to 1 job, the others did finally finish.
As to the number of VMs, it depends. On the jobs today, each one probably had at most maybe 5 VMs and a file share (so 2 objects - 2 different backup jobs, 1 job with 5 VMs, 1 job with a file share backup). But some jobs are just 1 or 2 VMs.
Why? Logical organization. Plus, it’s faster spread over multiple jobs - if I have 1 job with 20 VMs, and it errors, the whole job stops. With 4 jobs of 5 VMs each, if 1 job errors, the others continue on ...
BUT … I did have that pool definition set to use only 2 drives (probably a leftover from when we first started, before we bought more drives). And yes, I have parallel processing enabled for that pool. I’ve now set it to 4.
However, with only 1 job running, at the moment, it probably won’t matter. But tonight, when I have well over a dozen jobs set to run (at various hours), it should make a difference, and finish faster! (I hope)
Thanks, I hadn’t noticed that option wasn’t set optimally (anymore). We’ll see how it goes tomorrow (and then the weekend, when I have weekly only jobs set to run, as well).
All of my proxies have green check marks, yes. However, 1 is set for 6 concurrent (8 CPUs - 2 cores, 4 sockets), the other is set for 16 (8 CPUs - 8 cores, 1 socket). I think I will adjust it so both are the same.
My media pool was not set to use all tape drives, I believe this is most, if not all, of my issue. We’ll see over the next few days.
Glad to hear you figured it out.
Right on. Sounds like that being set to 2 on the pool was your issue.
I logically did something similar with my file server jobs because they are huge… like 40-50TB servers. For portability, or moving one, it takes ages so having a “FILE SERVER BACKUP” job with 20 of those is a nightmare to deal with.
For tape, I put as much as I can into a single job to get the most use out of my tapes. Unless you start all your jobs at once and queue them up, you will end up where you are now with 1 job running and only using 1-2 tapes potentially at some points.
That being said, if it makes sense, separate them.
Those jobs with 1-2 VM’s are never going to max out the potential of your drives though unless you run them at the same time, which is essentially the same as having 1 job.
I usually create media pools based on retention policies… say 6 months, 1 year, 2 years, 5 years etc. Sometimes it’s for a specific area of business or purpose too that has different restrictions.
The jobs are more or less just a schedule, with some settings to add notifications, scripts etc. I do think having more VM’s per job will speed up your backups in this case. You will also benefit because if the Tape job finishes faster, it won’t hold up a backup job that kicks off shortly after.