Largest VM you back up?


Userlevel 7
Badge +4

Here’s one to find out who has the largest backup in the forums.

 

Currently the largest server I back up is 45.1TB.

The largest one I had backing up was 115TB. It took a while but worked fine. I’ve been working on splitting it up into multiple file servers to allow more concurrent streams to our backup/tape jobs.

 



Userlevel 7
Badge +3

Yep... it’s always going to be Linux. I think my Pi-hole server is pretty darned small… but still a bit oversized, I’m pretty sure, at 16GB. But I have seen VMs down in the 2GB-or-less range.

Userlevel 7
Badge +4

I created a VM with no OS the other day, so with data reduction it was essentially zero! I just wanted to test mounting a VMFS datastore to see if the VMDK file was still there. 😂

 

As a Windows shop, I’ve been giving a bit more to the system disks lately, so I don’t really have “tiny” VMs. I like to increase logging a fair bit for audits, so the extra space is nice so I don’t have to shuffle stuff around.

Userlevel 7
Badge +7

Ok, let’s turn this around ….. smallest VM?

Well, in Rickatron Labbin I do many powered-off “empty disk” VMs, but I have backed up the SureBackup network appliance, which has no disks.

Userlevel 7
Badge +10

Ok, let’s turn this around ….. smallest VM?

Some tiny Linux VM with something between 3 and 6 GB… It was some very small Linux variant; I don’t remember which one though…

Userlevel 7
Badge +4

Ok, let’s turn this around ….. smallest VM?

Userlevel 7
Badge +7

Well, I just got told I need to add another 50TB to this server, and I have another 300TB incoming. It’s quite critical too.

 

I’m most likely going to have to create a few servers to spread the load, but maybe I’ll do a test in Veeam afterward to see how long it takes.

 

*Sigh* I’m going to need a few more SANs just for backup data, and a few in production. lol.   

Whoa, Scott! Pushing it!

Userlevel 7
Badge +4

Well, I just got told I need to add another 50TB to this server, and I have another 300TB incoming. It’s quite critical too.

 

I’m most likely going to have to create a few servers to spread the load, but maybe I’ll do a test in Veeam afterward to see how long it takes.

 

*Sigh* I’m going to need a few more SANs just for backup data, and a few in production. lol.   

Userlevel 7
Badge +4

Sometimes on support I get customers with VMs around 30-40TB.

It's always a pain to troubleshoot problems with VMs like that; I hate it. :(

Haha, “Will you stay on the phone with me while I restore this?” 🤣😋

 

 

Userlevel 5
Badge

Sometimes on support I get customers with VMs around 30-40TB.

It's always a pain to troubleshoot problems with VMs like that; I hate it. :(

Userlevel 7
Badge +10

My biggest VMs being backed up were a “giant” 2TB SQL Server and a 1.5TB Oracle Server.

They were not that big, but they were very critical, and both ran on mechanical disks, so the backups took a long time and the machines sometimes jammed. They were backed up at night, outside working hours, and every time we made a change, fingers were crossed not to mess up the full backup. 😂

 

I was made aware of an 18TB Oracle backup (with the RMAN plugin)

OK, my Oracle databases are not that big. The biggest is around 5-6 TB. But it is growing 😎

I think my biggest DB server is 14TB. I’m not doing application-aware processing though, as the SQL admins get scared when I say I should be taking over their backups…

 

Even after a demo...

I have convinced them; it took me five years… 😂😂😂

They have seen that it is much easier for them to do the DB backups via the plugins or the application-aware backup.

I am looking forward to the MS SQL plugin in V12. 😎

Userlevel 7
Badge +4

My biggest VMs being backed up were a “giant” 2TB SQL Server and a 1.5TB Oracle Server.

They were not that big, but they were very critical, and both ran on mechanical disks, so the backups took a long time and the machines sometimes jammed. They were backed up at night, outside working hours, and every time we made a change, fingers were crossed not to mess up the full backup. 😂

 

I was made aware of an 18TB Oracle backup (with the RMAN plugin)

OK, my Oracle databases are not that big. The biggest is around 5-6 TB. But it is growing 😎

I think my biggest DB server is 14TB. I’m not doing application-aware processing though, as the SQL admins get scared when I say I should be taking over their backups…

 

Even after a demo...

Userlevel 7
Badge +7

My largest is about a 120TB file server… which is getting ready to grow as the client is starting to put a lot of 4K video on it. It’s been rough getting it through error checking and backup defrags. Unfortunately, we don’t have enough space on the repo to set up periodic full backups due to the size. It’s a process… and we’re still trying to find a better way to handle it.

The 120 TB file server: is that a VM, a NAS, or a Linux/Windows system?

Userlevel 7
Badge +10

My biggest VMs being backed up were a “giant” 2TB SQL Server and a 1.5TB Oracle Server.

They were not that big, but they were very critical, and both ran on mechanical disks, so the backups took a long time and the machines sometimes jammed. They were backed up at night, outside working hours, and every time we made a change, fingers were crossed not to mess up the full backup. 😂

 

I was made aware of an 18TB Oracle backup (with the RMAN plugin)

OK, my Oracle databases are not that big. The biggest is around 5-6 TB. But it is growing 😎

Userlevel 7
Badge +7

My biggest VMs being backed up were a “giant” 2TB SQL Server and a 1.5TB Oracle Server.

They were not that big, but they were very critical, and both ran on mechanical disks, so the backups took a long time and the machines sometimes jammed. They were backed up at night, outside working hours, and every time we made a change, fingers were crossed not to mess up the full backup. 😂

 

I was made aware of an 18TB Oracle backup (with the RMAN plugin)

Userlevel 7
Badge +4

I’ll add that I’m not anti-VUL or anti-NAS backup 😀

I’ll probably get some VUL licenses for NAS backup going forward; I’m just going to be picky about where I use them. NAS backup is awesome and I plan to use it, but more for backing up a NAS or a share rather than a Windows Server VM.

Each backup should be treated its own way when it comes to requirements and pricing. If you have a NAS, it’s the perfect product.

Userlevel 7
Badge +4

What kind of workload is it for you that “needs” those monster VMs?

For our customers with VMs of ~30TB max, it’s mostly file servers that have gone nuts during decades of lacking governance… 😉

There we usually have the discussion about NAS backup being an alternative with much better parallelism throughout the whole job run. Especially when V12 brings us NAS2tape.

Monster VMs tend to be the ones rolling in via a single thread at the end of a VM backup job.

The challenge with NAS backup for Windows/Linux servers is that it gets waaaaaay more expensive: 1 VM = 1 VUL vs. 30 TB = 60 VUL…

I looked into NAS backup. I can’t afford it.

 

Let’s say I have 17 file servers at 50 TB each.

I’m currently licensed by socket with 2-CPU hosts; I could handle all those servers on 1-2 hosts, so 4 sockets max.

Even at a 6:1 conversion ratio, that is 24 VUL.

 

Now, if I switched to VUL licensing, full VM backups = 17 VUL, and I could probably handle another 50 VMs on these hosts, so I’m still ahead with sockets; but those are paid up front, so VULs could potentially work here at a slight increase.

 

17 × 50 = 850TB. I’m going to let you guess how much that comes out to in NAS backup / VUL licenses.

At 500GB per VUL, that is 1,700 VUL instead of 17.
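
To make the comparison concrete, here’s a quick back-of-the-envelope sketch in Python. The 500GB-per-VUL figure and the 6-VUL-per-socket conversion are just the numbers quoted above, treated as working assumptions for this thread rather than official pricing:

```python
# License-count comparison for the scenario above.
# Assumed inputs (taken from the post, not from an official price list):
#   17 file servers at 50 TB each, licensed either per VM (1 VUL each),
#   per 500 GB of source data (NAS/unstructured backup), or via a
#   4-socket estate converted at 6 VUL per socket.

SERVERS = 17
TB_PER_SERVER = 50
GB_PER_NAS_VUL = 500
SOCKETS = 4
VUL_PER_SOCKET = 6

total_tb = SERVERS * TB_PER_SERVER            # 850 TB of source data
per_vm_vul = SERVERS                          # 17 VUL
nas_vul = total_tb * 1000 // GB_PER_NAS_VUL   # 1700 VUL
socket_equiv_vul = SOCKETS * VUL_PER_SOCKET   # 24 VUL

print(f"Source data:          {total_tb} TB")
print(f"Per-VM licensing:     {per_vm_vul} VUL")
print(f"NAS-style licensing:  {nas_vul} VUL")
print(f"Socket conversion:    {socket_equiv_vul} VUL")
```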

 

Our file servers have crazy growth, mostly from video and the legal requirements for how much data we have to keep. Some of it is 60+ years; some, I have been told, is “forever”.

 

The DB, Application, Web, and other servers are all reasonable in my environment.

 

With our DR and backup requirements on top of that, it means keeping multiple copies of these on multiple SANs and tape. My vendors all love me and probably owe me a few more lunches, lol.

Userlevel 6
Badge +5

What kind of workload is it for you that “needs” those monster VMs?

For our customers with VMs of ~30TB max, it’s mostly file servers that have gone nuts during decades of lacking governance… 😉

There we usually have the discussion about NAS backup being an alternative with much better parallelism throughout the whole job run. Especially when V12 brings us NAS2tape.

Monster VMs tend to be the ones rolling in via a single thread at the end of a VM backup job.

The challenge with NAS backup for Windows/Linux servers is that it gets waaaaaay more expensive: 1 VM = 1 VUL vs. 30 TB = 60 VUL…

Userlevel 7
Badge +4

I was made aware of a 98 TB VM backed up by a customer in South Africa. And some Windows Servers in the ½ PB range as well.

½ PB is pretty good. I know that VMware, Windows, etc. all support these monsters, but the manageability, portability, and time for backups are crazy. I guess it’s like the previous generation of techs never allowing volumes over 2TB, lol. That doesn’t scale today, but either I’m getting older and out of touch, or that is a ton of data in one spot. 🤣

Userlevel 7
Badge +7

I was made aware of a 98 TB VM backed up by a customer in South Africa. And some Windows Servers in the ½ PB range as well.

Userlevel 7
Badge +4

My biggest VMs being backed up were a “giant” 2TB SQL Server and a 1.5TB Oracle Server.

They were not that big, but they were very critical, and both ran on mechanical disks, so the backups took a long time and the machines sometimes jammed. They were backed up at night, outside working hours, and every time we made a change, fingers were crossed not to mess up the full backup. 😂

 

I’m a huge fan of redundancy. I used to have some fiber/SAN infrastructure, inherited from the guy before me, that would cause outages with every upgrade, change, reboot, etc. Backups would cause systems to halt too.

 

I basically started fresh and rebuilt it all from the ground up. It works fine now but I still get nervous. lol

Userlevel 7
Badge +5

My biggest VMs being backed up were a “giant” 2TB SQL Server and a 1.5TB Oracle Server.

They were not that big, but they were very critical, and both ran on mechanical disks, so the backups took a long time and the machines sometimes jammed. They were backed up at night, outside working hours, and every time we made a change, fingers were crossed not to mess up the full backup. 😂

 

Userlevel 7
Badge +4

Awesome topic @Scott!

 

Some great conversation here around handling backup and restoration of these beasts. I’d chime in that unless it’s ransomware, I’d generally use quick rollback if I needed to restore a volume. If it was ransomware and the customer didn’t have space for an extra copy, I’d ask them to engage their infosec response teams on whether keeping only the OS disk and purging the rest is sufficient. Mileage may vary.

 

As for the largest VM, I had a really interesting one for sizing. The customer had somewhere between 100-150TB of data on a file server. What’s that big, you might ask? It was high-resolution, lossless images of the ocean floor and other such ocean-related imagery.

 

Because it was lossless, the compression algorithm of Veeam had a whale of a time (pun intended 🐳) compressing the data. IIRC the full backup was 20-30TB. The main kicker was that the imagery was updated periodically, so that entire 100-150TB of data was wiped from their production storage, and new data sets were supplied. But again, it was 99.9% uncompressed images. Other system data churn was about 1-2GB per day.

 

Fast Clone kept the rest of the data requirements in check to avoid unnecessary repository consumption, and the backup repo could handle the data easily overall. Retention was also a saving grace: although they relied on their backups for “oh, can we compare this to 2-3 weeks ago?”, IIRC they only retained about 6 weeks of imagery. As for restorations, once this wasn’t the live dataset, they only wanted to compare specific images, not entire volumes.

 

Glad I talked them into a physical DAS style repo instead of the 1Gbps SMB Synology design they were going for!

Very cool.

 

I didn’t want to ask “What type of data?” as I know there are many rules and regulations, but it’s very interesting finding out what people have to store and keep. I have a TON of photos and videos that make up most of my data. I can’t really get into what for, but it doesn’t compress very well. Lucky for me, I have a ton of databases that compress really well to make up for it.

 

It’d be pretty neat looking at those ocean floor images. 

 

Good point with Fast Clone if you need a quick restore. I find in most cases I’m doing file-level restores off these monster file servers as people delete things, or more often accidentally copy something and lose track of where it went. We’ve all had a mouse slip. To go one step further, ADAudit is a great tool for finding those. Every time someone requests a restore, I do a search to see who deleted the file, when, and where from. Half the time someone slipped and dropped the folder another level down; I move it back, saving duplicate data.

 

Another thing I try to do is keep low-churn data on the same servers; that way some of the really big archive servers don’t end up growing anymore. I find organizing by year in subfolders is a great way to talk to a department/division and say, “Can we archive everything pre-2015?” or “How many years of X do we need to keep for legal purposes?”

Userlevel 7
Badge +14

Awesome topic @Scott!

 

Some great conversation here around handling backup and restoration of these beasts. I’d chime in that unless it’s ransomware, I’d generally use quick rollback if I needed to restore a volume. If it was ransomware and the customer didn’t have space for an extra copy, I’d ask them to engage their infosec response teams on whether keeping only the OS disk and purging the rest is sufficient. Mileage may vary.

 

As for the largest VM, I had a really interesting one for sizing. The customer had somewhere between 100-150TB of data on a file server. What’s that big, you might ask? It was high-resolution, lossless images of the ocean floor and other such ocean-related imagery.

 

Because it was lossless, the compression algorithm of Veeam had a whale of a time (pun intended 🐳) compressing the data. IIRC the full backup was 20-30TB. The main kicker was that the imagery was updated periodically, so that entire 100-150TB of data was wiped from their production storage, and new data sets were supplied. But again, it was 99.9% uncompressed images. Other system data churn was about 1-2GB per day.

 

Fast Clone kept the rest of the data requirements in check to avoid unnecessary repository consumption, and the backup repo could handle the data easily overall. Retention was also a saving grace: although they relied on their backups for “oh, can we compare this to 2-3 weeks ago?”, IIRC they only retained about 6 weeks of imagery. As for restorations, once this wasn’t the live dataset, they only wanted to compare specific images, not entire volumes.

 

Glad I talked them into a physical DAS style repo instead of the 1Gbps SMB Synology design they were going for!
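
Out of curiosity, here’s a toy calculation using the midpoints of those figures to show why the repo stayed manageable. The retention and Fast Clone behaviour are grossly simplified (unchanged blocks are assumed to be shared for free), so treat it as an illustration, not sizing guidance:

```python
# Rough repository footprint for the imagery file server described above.
# Midpoint assumptions from the post; Fast Clone is modelled crudely as
# "synthetic fulls reuse unchanged blocks at no extra cost".

source_tb = 125          # ~100-150 TB of lossless imagery
full_backup_tb = 25      # ~20-30 TB after Veeam compression/dedupe
daily_churn_gb = 1.5     # ~1-2 GB/day of other system data
retention_days = 42      # ~6 weeks of restore points

reduction = full_backup_tb / source_tb
increments_tb = retention_days * daily_churn_gb / 1000

print(f"Backup size vs. source:    {reduction:.0%}")
print(f"Increments over retention: ~{increments_tb:.2f} TB")
print(f"Ballpark repo usage:       ~{full_backup_tb + increments_tb:.1f} TB "
      f"(plus another ~{full_backup_tb} TB whenever the whole dataset is swapped out)")
```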

Userlevel 7
Badge +4

I’m not sure if my team has ever had to do a restore on this VM…..if so, it would likely have been an individual file or two, but not the whole VM.  I actually don’t have a place to drop the VM to test restoring this one.

I’m in a similar situation. I have spun it up in Veeam and pulled files off of a 100TB+ file server, but don’t usually have over 100TB just kicking around to test something. I did test restoring a few 20TB servers from tape a few weeks ago to see how long it would take, and it wasn’t great, but it worked at least.

I have a client I installed a Nimble array for last week (an HF20 to replace a CS1000, if you care). I will be finishing the VM migration from the old array to the new one next week, and one thing they want to try before we tear out the CS1000 is a full restore of all VMs from the tape server I installed about 6 months ago. I don’t expect blazing-fast speeds or anything, but they wanted to make sure it works; I’m all for testing, and I’d love to get a baseline on how long it takes. I think we’re looking at about 40 VMs for a total of around 6TB or 8TB of data.

That shouldn’t be too bad. For my environment, I’m sure it’s around 400TB to test that, which is not reasonable. Although when it comes to ransomware, they will be more than happy to wait versus the alternative. Plus, I do have hot backups on production SANs; the tape is historical/a last resort.
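
For a rough sense of what a baseline like that might look like, here’s a tiny restore-time estimator. The throughput numbers are placeholder assumptions (nothing in this thread was measured); real tape restores depend on drive generation, compression, interleaving and the target storage:

```python
# Back-of-the-envelope estimate of how long a streaming restore takes.
# Throughputs below are assumed values for illustration only.

def restore_hours(data_tb: float, throughput_mb_s: float) -> float:
    """Hours to stream data_tb terabytes at a sustained throughput in MB/s."""
    return data_tb * 1_000_000 / throughput_mb_s / 3600

scenarios = [("~8 TB (40 small VMs)", 8), ("~400 TB (large file-server set)", 400)]
for label, tb in scenarios:
    for speed in (150, 300):  # assumed sustained MB/s
        print(f"{label} at {speed} MB/s: ~{restore_hours(tb, speed):.1f} h")
```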

Userlevel 7
Badge +3

I’m not sure if my team has ever had to do a restore on this VM…..if so, it would likely have been an individual file or two, but not the whole VM.  I actually don’t have a place to drop the VM to test restoring this one.

I’m in a similar situation. I have spun it up in Veeam and pulled files off of a 100TB+ file server, but don’t usually have over 100TB just kicking around to test something. I did test restoring a few 20TB servers from tape a few weeks ago to see how long it would take, and it wasn’t great, but it worked at least.

I have a client I installed a Nimble array for last week (an HF20 to replace a CS1000, if you care). I will be finishing the VM migration from the old array to the new one next week, and one thing they want to try before we tear out the CS1000 is a full restore of all VMs from the tape server I installed about 6 months ago. I don’t expect blazing-fast speeds or anything, but they wanted to make sure it works; I’m all for testing, and I’d love to get a baseline on how long it takes. I think we’re looking at about 40 VMs for a total of around 6TB or 8TB of data.
