Hi,
I think it crucially comes down to what bandwidth you can get for your budget. For example, your 30TB of Exchange and 40TB of other data, assuming a 2:1 reduction ratio, become 15TB and 20TB respectively, which is quite achievable. It’s worth remembering this is for the initial seed / full restore. It’s also worth remembering that when recovering from object storage, only data not already present within your performance tier datastores is downloaded, not everything for the sake of it.
So in your scenario:
15TB Exchange & 20TB Other Data = 35TB /36,700,160 initial seed.
First, let’s calculate the incremental, as this reflects your day-to-day utilisation and tells us whether it’s sustainable for you as a backup solution:
Let’s assume a 10% daily change to your data that then needs uploading: 3.5TB / 3,670,016MB
Now assume your backups are daily and you have an 8 hour backup window, and that the entire 8 hour window is saturated by the backup jobs themselves (in reality it’s likely to be far less, but we’re considering the worst case). That leaves 16 hours in the day to offload the data.
If we want to complete within that window, we need a minimum sustained throughput of 63.71MB/s (3,670,016MB divided by 57,600 seconds). That’s a connection speed of just over 500Mbps (approx 509.68Mbps).
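If you want to sanity check that throughput figure yourself, here’s a minimal sketch using the same assumptions (10% daily change, 16 hour offload window):

```python
# Minimum sustained throughput to offload the 3.5TB daily incremental
# in the 16 hours left outside the 8 hour backup window
daily_change_mb = 3.5 * 1024 * 1024    # 3,670,016MB
offload_window_s = 16 * 3600           # 57,600 seconds

megabytes_per_s = daily_change_mb / offload_window_s   # ~63.7MB/s
megabits_per_s = megabytes_per_s * 8                   # ~509.7Mbps

print(f"Need ~{megabytes_per_s:.2f}MB/s (~{megabits_per_s:.0f}Mbps) sustained")
```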
We’d then need to consider day-to-day traffic layered on top of that connection if it’s shared, which falls outside the scope of what I’m detailing here. But on a 1Gbps circuit (realistic throughput is between 850-950Mbps), you’d be looking at roughly 8-9 hours to upload, certainly achievable.
This still leaves the elephant in the room: what about that initial upload? If we consider a 1Gbps connection uploading 24x7, with an effective throughput of 900Mbps / 112.5MB/s, uploading 36,700,160MB will take approximately 90.6 hours / 3.8 days.
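Same maths for the seed, assuming that 900Mbps effective figure holds around the clock:

```python
# Time to push the 35TB initial seed over a 1Gbps circuit
# (assumes ~900Mbps effective = 112.5MB/s, uploading 24x7)
seed_mb = 36_700_160          # 35TB in MB
effective_mb_per_s = 900 / 8  # 112.5MB/s

seconds = seed_mb / effective_mb_per_s
print(f"~{seconds / 3600:.1f} hours (~{seconds / 86400:.1f} days)")
```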
Most of these public cloud vendors offer options to ship data on a physical drive and ingest it that way when it’s faster, so if only a connection slower than 1Gbps were available, that can work around your upload bottleneck. But there’s still the recovery side of this: what RTO are you trying to achieve when recovering your data?
I don’t know the size of your organisation, but if you have 30TB of Exchange data that’s all active use / short-term retention, you’re a massive organisation that hopefully can afford 1/10Gbps connectivity for rapid upload and retrieval of data. If that data includes archival, and you’re licensed for it, I’d suggest looking at utilising Archive Databases. It may then become a scenario your org is happy with from a DR perspective: a much smaller (and far faster to recover) Exchange database of, say, 3TB focused on live mailboxes, with the other 27TB of data on one or more Exchange servers that take a few days to recover.
Without knowing your topology I can’t recommend specifics for your environment, but on the face of it you’ve got a lot of data, either by necessity or because it needs a clean-up, and a WAN connection with low throughput.
Finally, on the point of enterprises: I’ve certainly seen plenty with 1/10Gbps+ connections uploading far larger quantities of data, with RTOs to suit. I’ve also seen sites connected with dark fibre at far higher speeds, delivering off-site backups that way.
With your data set size, you could get away with a Dell R740XD2 stacked full of 18TB disks (IIRC it supports 18TB now, might be 16TB). It can house up to 26 3.5” drives, so assuming a single RAID6 with no hot spare, that’s approximately 468TB raw / 432TB usable capacity, and it definitely costs FAR FAR less than the 100k figures you’re thinking of for Nimble/Pure. You’d configure it as a Linux hardened repository for immutability, give it 10/25/40Gbps network connectivity, and you’d certainly have fantastic read performance. You could also slice the drives into multiple RAID6 sets and create a RAID60 for even better write performance.
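As a rough capacity check (bay count and drive size are my assumptions from the R740XD2 spec, and this ignores filesystem overhead):

```python
# Rough capacity of a 26-bay chassis full of 18TB drives in a single RAID6 (no hot spare)
BAYS = 26
DRIVE_TB = 18
RAID6_PARITY = 2   # RAID6 loses two drives' worth of capacity to parity

raw_tb = BAYS * DRIVE_TB                      # 468TB raw
usable_tb = (BAYS - RAID6_PARITY) * DRIVE_TB  # 432TB usable, before filesystem overhead

print(f"Raw: {raw_tb}TB, usable after RAID6 parity: {usable_tb}TB")
```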
In summary, what you want isn’t a unicorn, it’s what a lot of people are achieving right now, in a variety of ways. Take a step back, look at what’s out there, and ask any follow-up questions you may have 😊