Have you ever worried about the lifetime of your stored data? You should.
- A Google study has shown hard drive reliability to be even lower than previously thought.
- Professionally pressed DVDs and CDs have a shorter lifespan than recordable ones due to the materials involved. (Sorry, no link. Let me know if you have one.)
- Hard drive density is now so high that error correction is constantly active. (Again, let me know if you remember where I read this.)
Now that I’ve scared you a bit, you’re probably wondering which format I’m leading up to: recordable optical discs.
If you’re curious, I’ll provide links, but I’ll summarize for the busy: When backing up, use DVD+R discs (avoid DVD-R) and make sure they were manufactured in Japan or Taiwan (and especially avoid Indian discs). If your job depends on it, make sure you get “archival-grade” discs.
Here are the details:
- How to Choose CD/DVD Archival Media
- Answers to your questions about CD/DVD archival capacity and testing
And, as usual, store your stuff in a cool, dry place and use cases made from acid-free plastic. That alone will probably make your old DVD-Rs and CD-Rs stay readable for twice as long. 🙂
Archival and Optical Media by Stephan Sokolow is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
I’ve seen similar information before. Actually, the last time I read it, I think I remember someone saying that many of the commonly manufactured CD-Rs, billed as lasting 50+ years, were actually lasting about 10 years. No, I don’t have a cite. It was years ago.
There are other aspects of data storage that I find frightening. First of all, I remember reading an article about a minor error, common in the Intel north bridge at the time, that could cause subtle data corruption: just a bit or two every few hours, but enough to eventually make the data unusable and unrecoverable. The article went on to hypothesize that, if a country really did want to use electronic terrorism to take down another country, such a mechanism, whether physical or software-based, would be frighteningly effective. The changes are small, so no one notices the difference, but before you know it, none of the data can be trusted. Could you imagine what would happen to the banking systems if it was found out that one bit in a billion had been flipped every day?
Lastly, even if the media survives, will the format? We started by carving into stone and clay, artifacts which have survived for thousands of years. We moved to paper and papyrus which will still last centuries with some basic care. Disks and discs will last decades. Webpages… last for days and are often then lost forever. And have you ever tried to open a 10-year-old Microsoft Word document let alone a document written in software from a company that has gone out of business? Even if our media survive, future generations may be able to do little more than say, “Yup, looks like they were using some kind of binary storage here.”
That’s why I constantly upgrade my media (I’m just finishing the transition from CD-R to DVD+R) and have a little cron script which sends me a daily report on any of the following “disallowed formats”:
– Formats where the viewer/converter/extractor is non-free. (eg. RAR and ACE archives)
– Binary Blob formats (eg. MS Office Docs)
– etc.
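For anyone who wants to try something similar, here is a minimal sketch of such a check in Python. It is not my actual script: the extension list, the reasons attached to each one, and the names `DISALLOWED`, `find_disallowed`, and `format_report` are all assumptions made up for illustration.

```python
import os

# Hypothetical mapping of flagged extensions to the reason they are flagged.
# Adjust this to match your own list of "disallowed formats".
DISALLOWED = {
    ".rar": "non-free extractor",
    ".ace": "non-free extractor",
    ".doc": "binary blob format",
    ".xls": "binary blob format",
    ".ppt": "binary blob format",
}


def find_disallowed(root):
    """Walk `root` and return a sorted list of (path, reason) for flagged files."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Compare extensions case-insensitively so FOO.RAR is caught too.
            ext = os.path.splitext(name)[1].lower()
            if ext in DISALLOWED:
                hits.append((os.path.join(dirpath, name), DISALLOWED[ext]))
    return sorted(hits)


def format_report(hits):
    """Turn the list from find_disallowed() into a plain-text report."""
    if not hits:
        return "No disallowed formats found."
    return "\n".join(f"{reason}: {path}" for path, reason in hits)
```

Printing `format_report(find_disallowed(some_directory))` from a daily cron job should be enough to get a report, since cron is normally set up to mail a job's output to its owner. Matching on extension alone is the simplest approach and will miss renamed files.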
I’m also working on an extension that will make it auto-correct such problems (converting RAR and ACE to 7-Zip, converting MS Office to WAR (HTML plus images in a tar archive) or OpenDocument, etc.).
If you weren’t aware, OpenDocument (which OpenOffice uses) is a handful of XML files (and any embedded images) inside a zip archive with a non-zip extension.
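You can check this for yourself with nothing but Python's standard `zipfile` module. The sketch below is only an illustration; the function name `describe_opendocument` is made up, and it assumes the package carries the usual `mimetype` entry alongside files such as `content.xml`.

```python
import zipfile


def describe_opendocument(source):
    """Return (mimetype, member names) for an OpenDocument package.

    `source` may be a file path or a binary file-like object.
    The mimetype is None if the archive has no `mimetype` entry.
    """
    with zipfile.ZipFile(source) as package:
        names = package.namelist()
        mimetype = None
        if "mimetype" in names:
            # The mimetype entry is a short plain-text string.
            mimetype = package.read("mimetype").decode("ascii").strip()
        return mimetype, names
```

Pointing it at an `.odt` file should list the XML files and any embedded images, which is exactly why the format is so much easier to rescue than a binary blob.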
Oh, and I’m planning on writing a proxy server which archives every file (excluding certain formats like archives, music, and videos) that the browser requests. The problem of website mortality has always bothered me… especially since some webmasters are rude enough to abuse robots.txt to block the Wayback Machine’s archival crawler.