Hypothetically - How would you go about restoring data with infinite retention 30/50/100 years from now?



This little thread got me thinking: if you’re tasked with backing up data with 50-year, 100-year, or infinite retention, there is more to it than just running a backup and storing it somewhere. What good is a backup that is 30 years old if you have no infrastructure to do anything with it? On a somewhat related topic, I once had a client that needed to keep their Dell/Quest RapidRecovery backups for 7 or 10 years. They were looking to move to something like a Dell IDPA appliance (I proposed Veeam with a DataDomain). But the question was, what do you do to keep the servers and MDxxxx DAS appliances with the data available for the next 7-10 years? Or do you somehow convert or restore all of the data and move it to your new backup solution whenever there is some sort of major change? And how do you avoid bit rot along the way?

Hypothetically speaking, or practically speaking for you folks that are tasked with such extraordinary retention policies, what is your plan for restoring 30+ year old data? Do you keep separate cold infrastructure in place, or at least in storage? If it’s in storage, do you have a written plan for how to revive that environment? Tapes and compatible tape drives? ESXi hosts running what would quickly become a very obsolete version? Network hardware that is compatible with those hosts? Oh, and all the passwords and encryption keys needed to access that data once things are running?

How do you plan for such long-term retention? I’ve never had to do this, but I’m curious how those of you who have did it, or how you plan to do it.


3 comments

Userlevel 7
Badge +17

I think the system needed to run the applications that read your data after such a long time will be the biggest problem.

Keeping the backup data readable and copying it from one storage generation to the next is achievable. For these long retention times tape is a good medium: keep two copies, copy your data to new media every second generation, and you should be rather safe.
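To make the "two copies" idea a bit more concrete, here is a minimal sketch of a fixity check, assuming the backup files are reachable as ordinary files on disk (for example after being staged from tape). The function names `sha256_of`, `build_manifest`, and `verify_copy` are made up for illustration and are not part of any backup product. The idea is to record a checksum for every file once, then re-check each copy against that record before and after every migration, so silent corruption is noticed while the other copy is still good.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large backup files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_copy(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or whose content changed."""
    problems = []
    for rel_path, expected in manifest.items():
        candidate = root / rel_path
        if not candidate.is_file() or sha256_of(candidate) != expected:
            problems.append(rel_path)
    return problems
```

An empty list from `verify_copy` means the copy still matches the manifest. Anything it returns would have to be repaired from the second copy, and the manifest itself should be stored with both copies.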

But having hardware or a physical system available with the correct application to work with it… I am not sure.

The best case is when you have to archive documents in PDF/A. These should still be readable in the future...

Userlevel 7
Badge +12

For such long backup retentions I would also go with tape. It's the perfect medium for long-term retention. I would do the same as @JMeixner suggested: keep multiple copies and migrate them to newer generations. Also keep copies of the backup software from that time, so that you will still be able to restore in case you need to.
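As a rough illustration of the "migrate to newer generations" rule, the sketch below flags cartridges that have fallen a given number of media generations behind. The `Cartridge` class, the labels, and the default lag of two generations (taken from the "every second generation" suggestion above) are assumptions for the example, not a vendor rule, and real drive compatibility should be checked against the manufacturer's documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cartridge:
    label: str       # hypothetical barcode or shelf label
    generation: int  # media generation the cartridge was written with


def due_for_migration(inventory, current_generation, max_lag=2):
    """Return cartridges that are at least `max_lag` generations behind."""
    return [
        c for c in inventory
        if current_generation - c.generation >= max_lag
    ]


if __name__ == "__main__":
    shelf = [Cartridge("A001", 5), Cartridge("A002", 6), Cartridge("A003", 7)]
    for cartridge in due_for_migration(shelf, current_generation=7):
        print(cartridge.label)
```

Running a check like this whenever a new drive generation is purchased turns the migration into a scheduled task rather than something discovered at restore time.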

That's the backup part, but have you thought about an archiving solution for the data? Such systems/software are built for long-term retention and secure storage of the data (immutability). The vendor itself will make sure that the data stays readable with each software upgrade. Of course you would still have to back up all of the data, but it's very unlikely that you'll have to restore a single file or part, as long as the archive is not tampered with.

Userlevel 5
Badge +4

I worked as a storage and backup admin at a North Carolina state agency. We had backups on DAT (often unreadable), Optical Jukebox disk (delaminated often), reel-to-reel, and Super DLT. Some of it was backed up with Arcserve (Cheyenne), NetBackup, Backup Exec, cpio, CommVault, and to JMeixner’s point, copies of documents. Email was backed up with Netscape (can you believe they had a backup solution?). You get the idea. Boxes of various media were stored in the state archives without a clear plan, without documentation of the software used to back it up (and needed to restore it), and without the equipment to run that software and do the restore. I call it the “Bell Jar”: put your old backup server and tape/whatever equipment under a bell jar and hope it works in five or ten years. Long-term management of long-term archives is a significant problem, and I don’t think there is a workable solution for it other than transferring to new media every few years and preserving the software used to back it up.
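One way to avoid the undocumented box of media is to store a small, plain-text restore record with every box. The sketch below is only an illustration: the `RestoreRecord` fields and the sample values are hypothetical, not a standard or any vendor's format. It writes the record as JSON, which should stay readable without special software, and reports which fields were left blank.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class RestoreRecord:
    media_label: str               # label on the box or cartridge
    media_type: str                # e.g. "Super DLT"
    backup_software: str           # product that wrote the media
    software_version: str
    installer_location: str        # where a copy of the installer is kept
    drive_model: str               # hardware known to read the media
    encryption_key_reference: str  # where the key is held, never the key itself
    written: str                   # ISO date the media was written
    notes: str = ""


def missing_fields(record: RestoreRecord) -> list[str]:
    """List required fields left blank (everything except free-form notes)."""
    return [
        name for name, value in asdict(record).items()
        if name != "notes" and not str(value).strip()
    ]


def to_json(record: RestoreRecord) -> str:
    """Serialize the record as human-readable JSON to store with the media."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)
```

A record that comes back from `missing_fields` with gaps is a hint that the box is heading for the bell jar.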
