Veeam Backup & Replication: v12 Compression Improvements


Userlevel 7
Badge +21

I was going through some Veeam documentation recently and noticed a major improvement to the compression ratios offered in Veeam Backup & Replication v12.

Historically (pre-v12), if you chose ‘High’ as the compression level for a backup job then, relative to the ‘Optimal’ setting, you’d expect to see a 10x increase in CPU consumption for a measly 10% increase in data compression. This has changed with v12: you can now expect to see only a doubling of CPU consumption for a massive 66% increase in data compression!

Not to be overlooked, however, is that Veeam also states you can expect 2x slower restores when utilising ‘High’ vs ‘Optimal’. Even so, this could prove a massive benefit for secondary backup repositories: these are typically the ones storing backups for the longest periods of time, so the extra compression improves density and reduces Total Cost of Ownership (TCO).

‘Extreme’ doesn’t miss out on improvements either! Historically, ‘Extreme’ needed double the CPU of ‘High’ to achieve an additional 3% compression over ‘High’. In v12 these figures are much more favourable: ‘Extreme’ will now achieve a further 33% compression over ‘High’, at a cost of 5x the CPU of ‘High’.

That last point might sound like ‘Extreme’ is more CPU hungry than before, but it’s not, and we can prove this mathematically below.

 

| Compression Method | CPU Consumption relative to Optimal | CPU Consumption Equation | Data Consumption relative to Optimal | Data Reduction Equation |
|---|---|---|---|---|
| v11 – High | 1000% | 100% x 10 = 1000% | 90% | 100% * (1 – 0.10) = 90% |
| v12 – High | 200% | 100% x 2 = 200% | 40% | 100% * (1 – 0.60) = 40% |
| v11 – Extreme | 2000% | (100% x 10) x 2 = 2000% | 87.3% | (100% * (1 – 0.10)) * (1 – 0.03) = 87.3% |
| v12 – Extreme | 1000% | (100% x 2) x 5 = 1000% | 26.8% | (100% * (1 – 0.60)) * (1 – 0.33) = 26.8% |
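
For anyone who wants to check the arithmetic in the table, here is a minimal Python sketch. It simply chains the CPU multipliers and the "additional compression" fractions exactly as the table's equations do, using the table's own inputs (including the 60% it takes for v12 ‘High’). The names `relative_cost`, `LEVELS` and `summarise` are made up for illustration and have nothing to do with any Veeam tooling.

```python
# Illustrative only: reproduces the table's equations, not a Veeam API.

def relative_cost(cpu_multipliers, extra_reductions):
    """Return (CPU %, backup size %) relative to 'Optimal' at 100%."""
    cpu = 100.0
    for multiplier in cpu_multipliers:
        cpu *= multiplier          # each level multiplies the CPU cost
    size = 100.0
    for reduction in extra_reductions:
        size *= (1.0 - reduction)  # each level shrinks what is left
    return cpu, size

# Inputs copied from the table: (CPU multipliers, additional compression).
LEVELS = {
    "v11 High": ([10], [0.10]),
    "v12 High": ([2], [0.60]),
    "v11 Extreme": ([10, 2], [0.10, 0.03]),
    "v12 Extreme": ([2, 5], [0.60, 0.33]),
}

def summarise():
    """Compute both percentages for every level, rounded to one decimal."""
    return {
        name: tuple(round(value, 1) for value in relative_cost(*inputs))
        for name, inputs in LEVELS.items()
    }

if __name__ == "__main__":
    for name, (cpu, size) in summarise().items():
        print(f"{name}: CPU {cpu:.0f}% of Optimal, size {size:.1f}% of Optimal")
```

The point to take from it is that ‘Extreme’ multiplies on top of ‘High’, so 5x a much cheaper v12 ‘High’ still comes out at half the total CPU of v11 ‘Extreme’.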

 

Testing:

It’s all well and good seeing these theoretical numbers, but you know what really counts? Real world testing! So I fired up my v11 and v12 instances in my lab, built a Windows Server VM, installed SQL Server 2019 and the AdventureWorks test database, shut it down, and backed it up on ‘Optimal’, ‘High’, and ‘Extreme’ on both VBR v11 and v12. Here are my findings:

Source VM Size: 20.4GB consumed

| Job Run | Processing Time | Processing Time Relative to v11 Optimal | Full Backup Size | Backup Size Relative to v11 Optimal |
|---|---|---|---|---|
| VBR v11 – Optimal | 1 minute 9 seconds | 100% | 13.2GB | 100% |
| VBR v11 – High | 5 minutes 44 seconds | 498% | 11.8GB | 89% |
| VBR v11 – Extreme | 8 minutes 19 seconds | 723% | 11.4GB | 86% |
| VBR v12 – Optimal | 51 seconds | 73% | 13.2GB | 100% |
| VBR v12 – High | 1 minute 38 seconds | 142% | 11.2GB | 84% |
| VBR v12 – Extreme | 4 minutes 22 seconds | 380% | 11.0GB | 83% |
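
If you want to run the same comparison against your own jobs, the "relative to v11 Optimal" columns are just each run divided by the baseline run. Below is a rough Python sketch using the times and sizes from the table above. The `RUNS` dictionary and the `relative_to_baseline` function are hypothetical names for illustration only. Because it works from the whole seconds and one-decimal sizes shown here, a recomputed percentage can land a point away from the table's figure.

```python
# Illustrative only: recomputes the relative columns from the table's values.

# (processing time in seconds, full backup size in GB), copied from the table.
RUNS = {
    "VBR v11 - Optimal": (1 * 60 + 9, 13.2),
    "VBR v11 - High": (5 * 60 + 44, 11.8),
    "VBR v11 - Extreme": (8 * 60 + 19, 11.4),
    "VBR v12 - Optimal": (51, 13.2),
    "VBR v12 - High": (1 * 60 + 38, 11.2),
    "VBR v12 - Extreme": (4 * 60 + 22, 11.0),
}

def relative_to_baseline(runs, baseline="VBR v11 - Optimal"):
    """Return {job: (time %, size %)} relative to the baseline job."""
    base_seconds, base_gb = runs[baseline]
    return {
        name: (round(100 * seconds / base_seconds), round(100 * gb / base_gb))
        for name, (seconds, gb) in runs.items()
    }

if __name__ == "__main__":
    for name, (time_pct, size_pct) in relative_to_baseline(RUNS).items():
        print(f"{name}: time {time_pct}%, size {size_pct}%")
```

Swap in your own job durations and backup file sizes, and pick whichever run you treat as the baseline.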

12 comments

Userlevel 7
Badge +19

Yep...read about the new Compression algorithm as well and was curious about the performance improvements. Thanks for putting it to the test and sharing the results Michael!

Userlevel 7
Badge +7

Thanks for this @MicoolPaul 

Always wondering what the savings were but never actually got round to testing them. 

Userlevel 7
Badge +6

Thank you @MicoolPaul 

Userlevel 7
Badge +21

Great article Michael.  Definitely loving the new compression within v12 and I liked seeing the comparison you did with tests.  Interesting and nice results.

Userlevel 6
Badge +3

Damn, this is an awesome find! I’ve never really put much thought into tweaking these in the past because the perf hit was so high. But with these sorts of numbers, it’s not that bad anymore and worth more consideration.

Userlevel 7
Badge +17

The decreased time needed for compression in V12 is great. 😎

I don’t use the High and Extreme compression levels very often, but every performance improvement is welcome.

Userlevel 7
Badge +8

That is great. The lower time and added gains are pretty impressive. It seems like Extreme would only be worthwhile in rare cases, given the performance hit.

 

My environment has so much non-compressible data I’d be an awful candidate for advertising this :)

Userlevel 7
Badge +2

Wow, this is awesome, thank you @MicoolPaul for sharing.

No matter how small the difference is, it is still a good improvement when you back up multiple VMs and keep the retention longer.

Well done and big thanks to the Veeam team for this algorithm improvement.

Userlevel 5

thanks and great information.

I’m working with a customer on use cases where we’d use a hardened repo (possibly with NetApp E-series) but need to understand deduplication and compression against a traditional PBBA like StoreOnce or Data Domain.

Any tests on this or use cases?

Userlevel 7
Badge +21

Hi @MavMikeVBR, sorry, I can’t say I have, but dedupe appliances are always superior due to their ability to dedupe across all backup files, both between different restore points and different backups. I’ve seen over 2PB of backups on a 250TB Data Domain, and only 200TB was consumed on the device. But your restores will always be slower. To that end, I advocate on-prem object storage or solid state block storage as the shorter-term primary backup chain, with dedupe used as a backup copy target when long on-site retention is needed but not on tape.

Userlevel 7
Badge +6

Thanks for sharing, @MicoolPaul ! 👏🏻

Userlevel 7
Badge +8

Fantastic write-up, and I enjoy seeing “real” numbers versus theoretical.

High doesn’t seem so bad compared to V11.  

It would be cool if there was a way that when there are no backups being run, a service could go and compress things further. Perhaps after a specified time range too so the older the backups get, they would take up less space with a slight restore penalty. 

 

Either way, that is very interesting and a HUGE improvement in V12!
