Bad Practice vs Good Practice in a real use case

6 April 2024

Nico Losschaert
Veeam Legend

This post is about a real use case at a customer.

Some time ago, this customer asked me to perform a backup audit on their infrastructure because they had doubts about the current setup 🤔

So of course I did; I do this often for existing and new customers…

If you want to know more about it, just keep reading 😁

Some info about the customer and their infrastructure:

they are one of the biggest companies in Belgium in their sector
they have a large VMware cluster: 1 cluster with HA
that VMware cluster is spread over 2 locations on the same site, about 500 meters apart with no buildings in between, so ideal for some DR scenarios
between the 2 locations they have a 40Gbit fiber connection
the current backup-setup was done by another IT company in Belgium
they have 2 backup-servers, 1 at each location
each location has a physical SAN as shared storage, with synchronous mirroring between the 2 SANs; the SANs use hybrid storage with auto-tiering across NVMe, SSD and SAS HDD
2x 10Gbit was being used for the storage connection and 2x 10Gbit for active synchronous mirroring between the SANs

So, what was my verdict about the current backup-setup?

In short: very bad!!!

In more detail:

they were using 2 physical backup-servers with Veeam installed on both of them: no problem so far
but jobs were created on both of them for the same HA cluster, even though VMs can be migrated from host 1 at location 1 to host 6 at location 2: so why???
they created a separate job per VM: what???
the primary backup jobs were written to the local repository of the backup-server: OK
the secondary backup copy job was written to the other backup-server: OK, but
they were using SMB: OMG, can it get worse???
firmware had not been updated in 3 years: for real???
drivers were not even installed on the backup-servers: never seen that before??? There were a lot of unknown devices in Device Manager: ???
the proxy-server in use was the physical backup-server, limited to 2 concurrent tasks although each backup-server has 16 cores: forgotten to adjust???
they were using 10Gbit LACP from the backup-server to the core-switches: OK, but
they were using a separate backup VLAN per location, with the 1Gbit firewall as default gateway, while the VMware cluster was running in the management VLAN: oh no!!!
the transport mode in use was NBD: I’m not the biggest fan of that. In my opinion it should be the last resort: it is the easy choice, mostly made by people who don’t know much about Veeam, and performance is not always good
the performance was bad indeed: an average of 30MB/s for a backup on that kind of infrastructure!!!
they were not using air-gapped/immutable/offline backups
not all VMs were being backed up: really???
...
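
To put the network finding in numbers, here is a minimal Python sketch of the theoretical ceilings of the links mentioned above. It assumes decimal units (1 Gbit/s = 125 MB/s), ignores protocol overhead, and assumes the backup traffic really was routed through the 1Gbit firewall; the function name and link labels are just illustrative.

```python
def link_ceiling_mb_s(gbit_per_s: float) -> float:
    """Theoretical ceiling in MB/s (decimal units), ignoring protocol overhead."""
    return gbit_per_s * 1000 / 8


# Average backup speed reported in the audit.
observed_mb_s = 30

# Link speeds taken from the post; the labels are illustrative only.
links = [
    ("firewall default gateway", 1),
    ("backup-server uplink", 10),
    ("fiber between the 2 locations", 40),
]

for name, gbit in links:
    ceiling = link_ceiling_mb_s(gbit)
    share = observed_mb_s / ceiling
    print(f"{name}: ceiling {ceiling:.0f} MB/s, observed {share:.1%} of that")
```

Even the slowest hop, the 1Gbit firewall, tops out around 125 MB/s, so 30MB/s was roughly a quarter of the weakest link and a tiny fraction of what the 10Gbit and 40Gbit links allow.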

I asked the customer:

What do you think about the current performance of the backups?

The customer’s answer:

We don’t think it’s that good; that is why we asked an expert, you, to look into this matter.

My answer:

My suggestion is to build a completely new backup-setup with no extra infrastructure investments, under the following criteria:

only 3 days of my consultancy
a guaranteed much easier and more logical setup
a much more performant setup: at least 5 to 10 times the current performance
following the current Veeam best practices, with the servers set up from scratch and fully up-to-date

The customer agreed and so recently I implemented the new backup-setup.

A short summary:

firmware up-to-date
drivers up-to-date
OS up-to-date
only 1 of the backup-servers now runs Veeam as the managing server, with only 3 primary backup-jobs
using VMware folders: 1 job per folder
using the Direct SAN transport mode: I used the 10Gbit interfaces with iSCSI to connect to the SANs, because the customer only uses thick-provisioned VMDKs
using copy-jobs to the secondary backup-server, added as a managed server with a ReFS 64K repository instead of SMB
Veeam installed as it should be, following the current best practices
MFA implemented
using a copy-job to my company’s immutable BaaS offering
...

→ I tested the performance of the backups: instead of an average of 30MB/s before, I was now getting 500MB/s to 800MB/s: what a speed!!!!

So a performance increase by a factor of roughly 20!!!
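
As a quick sanity check of that factor, here is a small Python sketch that uses only the figures from this post (30MB/s before, 500MB/s to 800MB/s after). The helper name is made up for illustration.

```python
def speedup(before: float, after: float) -> float:
    """How many times faster 'after' is compared to 'before'."""
    return after / before


before_mb_s = 30                   # average before the new setup
after_low, after_high = 500, 800   # range measured after the new setup

low = speedup(before_mb_s, after_low)
high = speedup(before_mb_s, after_high)
print(f"{low:.1f}x to {high:.1f}x faster")
```

That gives a range of about 16.7x to 26.7x, so "roughly 20" sits nicely in the middle and is well above the promised 5 to 10 times.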

The customer asked me to restore a certain VM of several hundred GB.

OK, I said, and started that restore.

They had had to restore that particular VM some time ago, and it took more than 6 hours!!!

That was when they decided this was not what they expected and asked me to perform the audit.

The result of that restore now: only 20 minutes!!!

The customer said to me :

You’re a true wizard!

When we change our current infrastructure, you’re the only one who will set up the new backup-infrastructure 🤗, you truly know what you are doing...

I often see setups that are not set up ideally.

Why is that?

In my opinion, it comes down to the most dangerous thing about Veeam.

It’s easy to set up, it’s very accessible to start with, and it will work!
But will it be set up as it should be?
Only when you know what you are doing and have a lot of experience with Veeam...


9 comments

wesmrt
Veeam MVP
6 April 2024

As a support engineer at Veeam, I see many, many customers like this.

Congrats on making your customer’s life easier, Nico :)

Chris.Childerhose
Veeam Legend, Veeam Vanguard
6 April 2024

Great story and ending Nico. Glad to see things on the right track after you checked things.

coolsport00
Veeam Legend
7 April 2024

Great job revamping their environment @Nico Losschaert 🙌🏻

kristofpoppe
Veeam Vanguard
7 April 2024

Thanks for sharing, Nico.

We often see this. What did you do with the previous and historical backup points? Did they have enough space to start new chains?

Scott
Veeam Legend
8 April 2024

What a change. Sounds a lot like my environment when I started.

That is a ton of issues. I try to reread the best practices often to keep them fresh in my mind for each upgrade.

Nico Losschaert
Author
Veeam Legend
12 April 2024

No probs @kristofpoppe. Luckily the customer had enough space to start new chains. With customers that don’t have enough, we have to be creative 😉. Normally the way to do it is to delete the oldest regular chain, then the next oldest, and so on until everything from before is erased. Not ideal, which is why running at the limit of your available storage is not a good idea.
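
The "delete the oldest chain, again and again" approach can be sketched roughly like this in Python. This is only planning logic on assumed data: the Chain class, its fields, the names and the sizes are all hypothetical, and nothing here talks to Veeam. The actual deletion would still be done in Veeam itself.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Chain:
    """Hypothetical description of one old backup chain on the repository."""
    name: str
    created: date
    size_gb: float


def chains_to_delete(chains, free_gb, needed_gb):
    """Pick old chains, oldest first, until free space covers needed_gb."""
    to_delete = []
    for chain in sorted(chains, key=lambda c: c.created):
        if free_gb >= needed_gb:
            break  # enough room for the new chain, keep the rest for now
        to_delete.append(chain)
        free_gb += chain.size_gb
    return to_delete


# Made-up example: 200 GB free, the next new chain needs 800 GB.
old_chains = [
    Chain("chain-2024-01", date(2024, 1, 1), 300),
    Chain("chain-2023-11", date(2023, 11, 1), 400),
    Chain("chain-2023-12", date(2023, 12, 1), 350),
]
plan = chains_to_delete(old_chains, free_gb=200, needed_gb=800)
print([c.name for c in plan])
```

In this made-up case the two oldest chains would go first and the newest old chain stays as a restore point a bit longer, which is the whole idea: only give up history when the space is really needed.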

marco_s
Veeam Legend
12 April 2024

The “problem” is that Veeam is very easy to install and configure for backups with a simple setup.

Many customers/IT companies don’t understand the importance of real knowledge of the product.

Andanet
Veeam Legend
12 April 2024

Reading all of this I would say... you like to win easy! 🤣

My cousin's friend did better in the beginning.

You had to turn the screw in the right direction.

Bravo @Nico Losschaert

bene
Comes here often
15 April 2024

The main challenge is… Veeam “simply works” (since the good old times...). You can run the installation and configuration by just clicking next, next, next, finish.

We have struggled with this challenge for many years. We brought out the best practice analyzer as a rough first step in the right direction.

Another thing is that some customers grow with their environment, or started with Veeam in a small division, then figured out that it works like a charm and moved more workloads into the Veeam environment over time… but never touched or expanded the base VBR installation.
