Veeam B&R: Performance Impact of Pending Tasks

  • 21 August 2023


Tell me if you’ve seen this situation before. You’ve got some servers & applications that need protecting, so you create backup jobs as necessary. There are different backup jobs for your various application-integration requirements, backup frequencies, and retention requirements, and it’s starting to sprawl. Inevitably there’s some job scheduling overlap: backup jobs are waiting for other backup jobs to complete, and the running tasks count on your Veeam Backup & Replication server is ticking upwards. No biggie, right? It’ll clear down eventually! Well… let’s explore that, shall we?

 

Background/Test Scenario

 

I wanted to see just how much having multiple running jobs would impact VBR when all of those jobs, barring one, were in a pending state. I then wanted to compare this impact against scheduling the same backup jobs within chains.

I created a Windows VM, installed the OS, then powered it down, so I’d have a static test VM to back up. I configured my backup proxy and repository on a standalone server, with a task concurrency of one each. I then created a backup job with this single proxy and repository defined against the job, and no application-aware processing or other guest processing (as I said, the server is powered down). Since the proxy & repository are on a separate server, the VBR server is being used purely for backup job orchestration, nothing more. For completeness of detail, the VBR server has a Postgres instance installed on it too. The VBR server was also under-spec’d due to lab restrictions: 2 vCPU and 10GB of RAM, running on the super-duper fast (thanks Intel + VMware!) Intel Optane PCI-E storage, with the proxy & repository server on a completely separate Intel Optane PCI-E storage device.

 

Baseline Metrics: The Idle Server

 

I gathered some base metrics of the VBR server, idling without any jobs running, and saw the following:

Average CPU: 15% Consumption
Average CPU Threads: 1500-1600
Average CPU Processes: 160
Average RAM: 4.2GB
Average number of network ports open: 300
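
If you’d like to capture a similar baseline yourself, below is a minimal sketch of how these counters could be sampled with Python’s psutil library. The sample count and interval are arbitrary choices for illustration; any monitoring tool that reads the same counters would do.

```python
# Minimal baseline sampler: averages the same counters discussed above.
# Requires: pip install psutil
import psutil

def sample_metrics(samples: int = 10, interval: float = 1.0) -> None:
    cpu, ram, threads, procs, ports = [], [], [], [], []
    for _ in range(samples):
        # cpu_percent blocks for `interval` seconds and returns the average %
        cpu.append(psutil.cpu_percent(interval=interval))
        ram.append(psutil.virtual_memory().used)  # bytes currently in use
        # Total threads across all processes; ad_value=0 skips ones we can't read
        threads.append(sum(p.info["num_threads"]
                           for p in psutil.process_iter(attrs=["num_threads"],
                                                        ad_value=0)))
        procs.append(len(psutil.pids()))
        # Open sockets; may require elevated privileges on some platforms
        ports.append(len(psutil.net_connections()))
    avg = lambda xs: sum(xs) / len(xs)
    print(f"CPU:       {avg(cpu):.0f}%")
    print(f"RAM:       {avg(ram) / 2**30:.1f} GiB")
    print(f"Threads:   {avg(threads):.0f}")
    print(f"Processes: {avg(procs):.0f}")
    print(f"Ports:     {avg(ports):.0f}")

if __name__ == "__main__":
    sample_metrics()
```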

With a baseline established, we can now begin testing our scenario.

 

Test One: Simultaneous Backup Jobs

 

I cloned the backup job 19 times to have a total of 20 backup jobs, all with identical configuration, and manually started them all at the same time. As the jobs were setting up, the CPU immediately reached 100% utilisation and remained there whilst the jobs were being initiated. Once established, the CPU utilisation dropped, and I was presented with the following averages:

Average CPU: 25% Consumption
Average CPU Threads: 2600-2700
Average CPU Processes: 240
Average RAM: 6.4GB
Average number of network ports open: 600
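
To picture what’s happening here: with a single task slot on the proxy and repository, one job runs while the other nineteen sit pending, each already holding its enumerated session data. The sketch below is a rough Python illustration of that scheduling pattern (my own analogy, not Veeam’s actual implementation):

```python
# Illustration only: 20 jobs queued against a single task slot.
# All job metadata is built up front, so every payload sits in memory
# while 19 of the 20 jobs wait in a pending state for the one slot.
from concurrent.futures import ThreadPoolExecutor
import time

def enumerate_job(name: str) -> dict:
    # Stand-in for collating VM lists, proxy/repo mappings, guest settings...
    return {"job": name, "vms": [f"vm-{name}-{i}" for i in range(100)]}

def run_job(payload: dict) -> str:
    time.sleep(0.5)  # stand-in for the actual backup work
    return f"{payload['job']} done ({len(payload['vms'])} VMs)"

# Enumerate everything first: memory cost scales with job count immediately.
payloads = [enumerate_job(f"SimultaneousBackupJob_{n}") for n in range(1, 21)]

with ThreadPoolExecutor(max_workers=1) as pool:  # one concurrent task slot
    for result in pool.map(run_job, payloads):
        print(result)
```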

I then repeated the test with only 10 backup jobs enabled, to see how the resources scaled at a lower level.

Average CPU: 23% Consumption
Average CPU Threads: 2000-2100
Average CPU Processes: 210
Average RAM: 5.2GB
Average number of network ports open: 550

 

Test Two: Sequential Backup Jobs

 

I then deleted my backups within the backup repository, to ensure these would be new chains again with Active Full backups, keeping things identical. I then converted each job to be chained off its predecessor: SequentialBackupJob_2 followed SequentialBackupJob_1, and so on. I didn’t see 100% CPU utilisation during the start-up of any of the jobs; instead there was a brief spike towards 80-90% utilisation, with the system remaining responsive. Once established, I saw the figures below:

Average CPU: 21% Consumption
Average CPU Threads: 1550-1650
Average CPU Processes: 170
Average RAM: 4.5GB
Average number of network ports open: 330
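
Contrasting with the earlier sketch, chaining behaves more like the following (again my own illustration, not Veeam internals): each job’s data is only collated when its predecessor completes, so at most one payload is in flight at a time, at the cost of a gap between jobs.

```python
# Illustration only: the same 20 jobs chained one after another.
# Metadata is collated just-in-time, so only ~one payload is resident,
# but the proxy/repository idles while each new job sets itself up.
import time

def enumerate_job(name: str) -> dict:
    return {"job": name, "vms": [f"vm-{name}-{i}" for i in range(100)]}

def run_job(payload: dict) -> str:
    time.sleep(0.5)  # stand-in for the actual backup work
    return f"{payload['job']} done"

for n in range(1, 21):
    payload = enumerate_job(f"SequentialBackupJob_{n}")  # collated at start time
    print(run_job(payload))
```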

Just like with the first tests, I then conducted this test again with only 10 backup jobs enabled, to see how the resources scaled at a lower level.

Average CPU: 21% Consumption
Average CPU Threads: 1550-1650
Average CPU Processes: 170
Average RAM: 4.5GB
Average number of network ports open: 330

 

Collated Results

 

To make things easier to digest, I’ve collated these into separate tables:

Average CPU Utilisation

| Scenario | Average CPU | vs Baseline (Metric) | vs Baseline (%) |
| --- | --- | --- | --- |
| Idle | 15% | N/A | N/A |
| 10 Simultaneous Backup Jobs | 23% | +8% | +53% |
| 20 Simultaneous Backup Jobs | 25% | +10% | +66% |
| 10 Sequential Backup Jobs | 21% | +6% | +40% |
| 20 Sequential Backup Jobs | 21% | +6% | +40% |

Average CPU Threads

| Scenario | Average Threads | vs Baseline (Metric) | vs Baseline (%) |
| --- | --- | --- | --- |
| Idle | 1500-1600 | N/A | N/A |
| 10 Simultaneous Backup Jobs | 2000-2100 | +500 | +32% |
| 20 Simultaneous Backup Jobs | 2600-2700 | +1100 | +71% |
| 10 Sequential Backup Jobs | 1550-1650 | +50 | +3% |
| 20 Sequential Backup Jobs | 1550-1650 | +50 | +3% |

Average CPU Processes

| Scenario | Average Processes | vs Baseline (Metric) | vs Baseline (%) |
| --- | --- | --- | --- |
| Idle | 160 | N/A | N/A |
| 10 Simultaneous Backup Jobs | 210 | +50 | +31% |
| 20 Simultaneous Backup Jobs | 240 | +80 | +50% |
| 10 Sequential Backup Jobs | 170 | +10 | +6% |
| 20 Sequential Backup Jobs | 170 | +10 | +6% |

Average RAM Consumption

| Scenario | Average RAM | vs Baseline (Metric) | vs Baseline (%) |
| --- | --- | --- | --- |
| Idle | 4.2GB | N/A | N/A |
| 10 Simultaneous Backup Jobs | 5.2GB | +1000MB | +24% |
| 20 Simultaneous Backup Jobs | 6.4GB | +2200MB | +52% |
| 10 Sequential Backup Jobs | 4.5GB | +300MB | +7% |
| 20 Sequential Backup Jobs | 4.5GB | +300MB | +7% |

Average Network Ports Open

| Scenario | Average Ports | vs Baseline (Metric) | vs Baseline (%) |
| --- | --- | --- | --- |
| Idle | 300 | N/A | N/A |
| 10 Simultaneous Backup Jobs | 550 | +250 | +83% |
| 20 Simultaneous Backup Jobs | 600 | +300 | +100% |
| 10 Sequential Backup Jobs | 330 | +30 | +10% |
| 20 Sequential Backup Jobs | 330 | +30 | +10% |
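
For transparency, the “vs Baseline” columns above are plain deltas against the idle figures; the arithmetic is nothing more than this:

```python
# Reproduce a "vs Baseline" row: absolute delta and percentage delta.
def vs_baseline(value: float, baseline: float) -> tuple[float, float]:
    delta = value - baseline
    return delta, delta / baseline * 100

# Example row: 20 simultaneous backup jobs, RAM consumption (GB)
delta, pct = vs_baseline(6.4, 4.2)
print(f"+{delta:.1f}GB, +{pct:.0f}%")  # -> +2.2GB, +52%
```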

 

 

Conclusion

 

Now, this was just a couple of tests to evaluate some theories I had around backup job chaining vs simultaneous backup jobs, and the results trended as I expected. There were some unexpected values, such as the network port delta between 10 and 20 simultaneous backup jobs, but the simultaneous backup job values mostly scaled in proportion to each other.

Were I to explore this subject further, I’d add extra data points for 5, 25, 30, 40, and 50 simultaneous vs sequential backup jobs. I’d also go further and capture only the threads and processes related to the Veeam tasks, and the specific CPU, RAM, and network consumption of those processes, to better isolate any coincidental resource utilisation by other system processes, such as Windows Update checks or Windows Defender, to name two typical examples.

The findings are pretty clear, however: more jobs running simultaneously, even in a pending state, will demand more of your system. From a RAM perspective this makes perfect sense, as Veeam has had to enumerate all the data related to the backup processing, and that data sits in memory awaiting the ability to execute the tasks it describes.

The above also highlights a potential downside to backup job chaining. When we start all the jobs simultaneously, all backup data is enumerated in advance (which VMs, which proxies, guest-processing settings, etc.), so when a job’s time slot arrives, it can start running near-immediately. When using sequential backup jobs, this data isn’t collated until the backup job starts, increasing the idle time for your proxy & repository between backup jobs. This could have a negative effect on how long your backups run in total: consider a scenario where you have 4 proxies, 3 of them idle, but one still processing the final VM within your backup job; the next backup job won’t start until that single proxy finishes the final VM. On the flip side, backup job chaining can also make your backup jobs start up faster once triggered, as VBR doesn’t have to enumerate data for other backup jobs that it doesn’t yet have the capacity to process.
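
To put a toy number on that trade-off (an illustration with made-up payload sizes, not Veeam internals), eager enumeration peaks at roughly the whole job set’s worth of metadata, whereas just-in-time enumeration peaks at around a single job’s worth:

```python
# Compare peak memory of eager vs just-in-time enumeration of 20 jobs.
import tracemalloc

def enumerate_job(n: int) -> dict:
    return {"job": n, "vms": [f"vm-{n}-{i}" for i in range(1000)]}

# Eager: all payloads built in advance and held simultaneously.
tracemalloc.start()
payloads = [enumerate_job(n) for n in range(20)]
eager_peak = tracemalloc.get_traced_memory()[1]
tracemalloc.stop()
del payloads

# Just-in-time: each payload built when its turn arrives, then discarded.
tracemalloc.start()
for n in range(20):
    payload = enumerate_job(n)  # only ~one payload resident at a time
del payload
jit_peak = tracemalloc.get_traced_memory()[1]
tracemalloc.stop()

print(f"eager peak: {eager_peak // 1024} KiB, just-in-time peak: {jit_peak // 1024} KiB")
```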

In summary, the above should give you a good idea of what Veeam has to do when creating and maintaining multiple simultaneous backup jobs vs backup job chains, which will help you to create optimised backup schedules to get the most out of your environments.


7 comments


This is a great article, as many ask about the performance of backups and tasks. Really cool to see the comparison data and how it correlates to CPU/RAM and network.


Good info. It would be interesting to compare the time it takes to complete the whole window when running all at once vs sequentially.

 

Personally, having high CPU usage means my server is working, which is what I want. Pinned at 100 means it could use more resources, or I could spread the jobs out a bit.

 

After some maintenance the other day I had everything disabled; instead of clicking enable while everything was highlighted, I clicked “Start”. That sure sent my Veeam CPU to 100% in a hurry and locked up my console for a bit. To be fair, I have a large environment, but everything still ran as it should.

 

I chain a few of my file server jobs that are 30+ TB, as when I have some of these doing a merge it can create a lot of IO on my Veeam SAN. When multiple of these are running at once, the latency gets pretty high.


Nice write-up Michael. Like the easy-to-read table summarizing it all too!


As I read this, I thought to myself, “Michael sure seems to have a lot more free time than I do….”

 

Seriously... thanks for posting this. Neat to see some solid numbers on how these things work...


@Scott ouch 😆 been there myself! I actually had a weird scenario whereby when I converted each job to chained, they started running even though the job that should start them had finished about 10 minutes ago! So I kept getting CPU spikes!

 

It’s a good idea you suggest, and I’ve still got the jobs, so I could test it… Though as the VBR CPU is undersized, I worry that would skew the results, but I suppose I can try 🙂

 

@dloseke 😆 I can assure you, I really don’t… sometimes I need to test things for work and that’s where a lot of my inspiration comes from!


😁

I hear you on that one…..I seem to have less and less fun testing time and more time allocated to required testing.



It would actually be cool to see it in an undersized and a properly sized and an oversized environment to see the difference.

 

With being undersized and at 100%, perhaps the chained backups would end up doing better compared to running everything at once. I’d assume that with plenty of CPU, concurrency is going to thrive.

 

I always seem to be limited by line speed (fiber/network) or IOPS. My next task is to redesign my environment a bit, add several proxies, and start using a SOBR instead of a single SAN.
