Veeam: inline deduplication

January 26, 2023

marcofabbri
On the path to Greatness

While I was studying for the VMCE, a detail on deduplication caught my attention.

I’m talking about inline deduplication.

What is it about?

Inline deduplication is a feature of Veeam Backup & Replication (Veeam B&R) that reduces the storage space needed for backup data by removing duplicate data. Its main benefit is that it can significantly decrease the storage space, time and resources required for backups, which is especially useful for organizations that generate large amounts of data on a regular basis.

Another advantage of inline deduplication is that it can make backups more efficient: because less data has to be transferred and stored, backups can complete faster.

An example?

An imaginary company named CatsFood has a file server that contains 10 terabytes of data, and the organization wants to back up this file server using, proudly, Veeam. Without inline deduplication (and leaving compression aside), the backup would require 10 terabytes of storage space. With inline deduplication enabled, however, Veeam identifies duplicate data and stores it only once when it writes the backup. Deduplication works on data blocks rather than on whole files, so if the file server contains multiple copies of the same file, or multiple versions of a file that differ only slightly, the identical blocks are detected and “removed”, reducing the amount of data that has to be stored in the backup. Let's say that inline deduplication was able to remove 2 terabytes of duplicate data. The backup would then require only 8 terabytes of storage space instead of 10.

This is a significant reduction in storage requirements and can help organizations save on storage costs. It's worth noting that the related job storage settings can be tuned to balance performance and storage space usage: you can enable or disable inline deduplication, and choose the storage optimization (block size) and the compression level.
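To make the CatsFood example a bit more concrete, here is a minimal Python sketch of block-level deduplication. It is only an illustration of the general idea (hash fixed-size blocks, store each distinct block once), not Veeam's actual implementation; the function name `dedup_blocks`, the 1 KB block size and the toy data are all assumptions made up for the example.

```python
import hashlib


def dedup_blocks(data: bytes, block_size: int) -> tuple[int, int]:
    """Return (total_blocks, unique_blocks) for fixed-size blocks of data."""
    seen = set()
    total = 0
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        # Identical blocks produce identical digests, so they are counted once.
        seen.add(hashlib.sha256(block).digest())
        total += 1
    return total, len(seen)


# Toy "file server": 10 blocks of data, 2 of which are copies of earlier blocks
# (a scaled-down stand-in for 10 TB of source data with 2 TB of duplicates).
BLOCK = 1024
unique_parts = [bytes([i]) * BLOCK for i in range(8)]
source = b"".join(unique_parts + unique_parts[:2])

total, unique = dedup_blocks(source, BLOCK)
print(f"{total} blocks read, {unique} stored ({unique / total:.0%} of source)")
# prints: 10 blocks read, 8 stored (80% of source)
```

The block size matters: calling `dedup_blocks(source, 4 * BLOCK)` on the same data finds no duplicates at all, because the two copied blocks no longer line up with a whole larger block. That is the general trade-off behind a block size setting: smaller blocks tend to find more duplicates, at the cost of more blocks to track.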

And what if the repository is deduplicating storage?

Best practice is to disable the inline deduplication setting when writing to deduplicating storage.

Find more here:

Deduplication Appliance Best Practices: https://www.veeam.com/kb1745

Help center: https://helpcenter.veeam.com/docs/backup/hyperv/compression_deduplication.html?ver=110

dips
On the path to Greatness
January 26, 2023

Great, thank you @marcofabbri. This will come in really handy for me when studying.

Dipen N. K. | IT Security Specialist | Veeam Legend 2022 - 2025 | Former Cyber Security Space VUG Leader

BertrandFR
Influencer
January 26, 2023

Hey @marcofabbri, great article. I would add that inline deduplication behaves differently depending on whether you use per-VM backup files or per-job backup files. Obviously you get better results with per-job files, but less speed!
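A small Python sketch of why the scope matters, assuming (as this comment implies) that duplicates are only matched within a single backup file. The two "VMs", their block contents and the helper `unique_blocks` are hypothetical and only serve the illustration.

```python
import hashlib


def unique_blocks(blocks: list[bytes]) -> int:
    """Count distinct blocks by content hash."""
    return len({hashlib.sha256(b).digest() for b in blocks})


# Two VMs built from the same OS image: 6 shared blocks plus 2 private blocks each.
os_blocks = [b"os-%d" % i for i in range(6)]
vm_a = os_blocks + [b"a-data-1", b"a-data-2"]
vm_b = os_blocks + [b"b-data-1", b"b-data-2"]

# Per-job: one backup file for both VMs, so the shared OS blocks are stored once.
per_job = unique_blocks(vm_a + vm_b)

# Per-VM: one backup file per VM, so each file keeps its own copy of the OS blocks.
per_vm = unique_blocks(vm_a) + unique_blocks(vm_b)

print(per_job, per_vm)  # prints: 10 16
```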

JMeixner
On the path to Greatness
January 26, 2023

Yes, inline deduplication is a great feature.

Please be aware that deduplicating storage is not recommended for primary repositories. Your restores will be very slow with this configuration.

Better use a “normal” block storage (or object storage with V12) and Veeam's inline deduplication. And use a deduplicating storage for backup copy repositories only.

Andanet
Veeam Legend
January 26, 2023

Thanks @marcofabbri. I have 2 little questions to clarify my ideas…
1. With compression, the backup size is reduced to 50% of the source… From the help center:

(Veeam Backup & Replication assumes that the following amount of space is required for backup files:

The size of the first full backup file is equal to 50% of source VM data)

2. For better performance, every backup must be written to a different repository depending on VM size, using local or LAN target depending on whether the repository type is SAN/DAS or NAS.

Correct?

I'm studying for the VMCA exam and I don't want to get confused! :P

Thanks

Chris.Childerhose
Veeam Legend, Veeam Vanguard
January 26, 2023

JMeixner wrote:
> Yes, inline deduplication is a great feature.
>
> Please be aware that deduplicating storage is not recommended for primary repositories. Your restores will be very slow with this configuration.
>
> Better use a “normal” block storage (or object storage with V12) and Veeam's inline deduplication. And use a deduplicating storage for backup copy repositories only.

Totally agree with this for sure.

Great post Marco.

marcofabbri
Author
On the path to Greatness
January 26, 2023

JMeixner wrote:
> Better use a “normal” block storage (or object storage with V12) and Veeam's inline deduplication. And use a deduplicating storage for backup copy repositories only.

Totally agree with you!

Backups are like pizza, I love them. | Linkedin: @marco-fabbri-it

marcofabbri
Author
On the path to Greatness
January 26, 2023

Andanet wrote:
> The size of the first full backup file is equal to 50% of source VM data
>
> for better performance every backup must be written in different repository depending on VM size using local or LAN target if repository type is SAN/DAS or NAS.
>
> Correct?

As the help center says:

The size of the first full backup file is equal to 50% of source VM data.
The size of further full backup files is equal to 100% of the previous full backup file size.

So yes for the first question.
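As a rough sketch of the help center rule quoted above, the arithmetic can be written out in a few lines of Python. Keep in mind these percentages are the planning assumptions Veeam uses for space estimates, not measured results; the function name `estimate_full_backup_sizes` and its parameters are made up for this illustration.

```python
def estimate_full_backup_sizes(source_tb: float, full_count: int,
                               first_full_ratio: float = 0.5) -> list[float]:
    """Estimate the size in TB of each full backup file.

    First full = first_full_ratio * source size (50% by default);
    every further full = 100% of the previous full.
    """
    if full_count < 1:
        return []
    sizes = [source_tb * first_full_ratio]
    for _ in range(full_count - 1):
        sizes.append(sizes[-1] * 1.0)
    return sizes


sizes = estimate_full_backup_sizes(10, 3)
print(sizes)  # prints: [5.0, 5.0, 5.0]
```

So for 10 TB of source VM data the estimate is 5 TB per full backup file, before any incrementals are counted.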

For the second one, did you mean this? Because there's a great post about that:

https://community.veeam.com/blogs-and-podcasts-57/object-storage-impact-of-job-storage-optimization-settings-in-v11-2601

Andanet
Veeam Legend
January 26, 2023

marcofabbri wrote:
> As the help center says:
>
> The size of the first full backup file is equal to 50% of source VM data.
> The size of further full backup files is equal to 100% of the previous full backup file size.
>
> So yes for the first question.

OK Marco, so this means your 10 TB of VM data becomes 5 TB of backup data.

> For the second one, did you mean this? Because there's a great post about that:
>
> https://community.veeam.com/blogs-and-podcasts-57/object-storage-impact-of-job-storage-optimization-settings-in-v11-2601

Correct. For the second item, this storage optimization lets us save a little space on the repository (and also gives better throughput and processing time) thanks to block size optimization.

In addition, inline deduplication, which your post focuses on, lets us save even more space on the repository.

dloseke
Veeam Vanguard
January 26, 2023

I was looking for the deduplicating storage device caveat. I would also like to note that if Veeam's deduplication is anything like most deduplication (and it probably is, but I haven't tested it), deduplication and compression don't work well for things like encrypted data, video and images. But again, if the same files exist multiple times on a server, it can make a big difference.
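A quick way to see the effect is to compress two buffers with Python's standard `zlib` module. This is a generic sketch, not a test of Veeam itself: random bytes from `os.urandom` stand in for encrypted or already-compressed data, and the repeated text line is a made-up sample of highly redundant data.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data)) / len(data)


# Highly redundant data: the same short line repeated many times.
repetitive = b"CatsFood quarterly report\n" * 40_000

# Random bytes as a stand-in for encrypted data of the same length.
random_like = os.urandom(len(repetitive))

print(f"repetitive text: {compression_ratio(repetitive):.3f}")
print(f"random bytes:    {compression_ratio(random_like):.3f}")
```

The repetitive buffer shrinks to a small fraction of its size, while the random buffer stays at roughly its original size. Fixed-size block deduplication behaves in a similar way on such data, since blocks of random-looking data essentially never repeat.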

Derek M. Loseke, Senior Systems Engineer | Veeam Vanguard 2025-2026 | Veeam Legend 2022-2024 | https://technotesanddadjokes.com | @dloseke

dloseke
Veeam Vanguard
January 26, 2023

It should (probably) be noted that per-VM is the default with V12, unlike previous versions of VBR. In all honesty, while per-job backups are more space efficient, they're not as flexible, for example when cleaning up old VM data after a decommission. I like to use per-VM when I can just because of the flexibility. That said, I'm not sure I'll go forward with using VeeaMover to convert my per-job restore points to per-VM, and I'm sure I have some clients without enough space to perform that conversion anyway.
