Data Backups for Ordinary Media Users

At the beginning of each new year, I back up essential video, audio, photographic and text files onto SSD media and take them to a safe deposit box at my bank. I do this off-premises backup in addition to online cloud backup and a redundant backup on a Drobo hard disk array at my office.

As a small entrepreneur, I keep my backup strategy modest, but it is essential. Only those who have lost valuable data in the past normally go to this level of trouble, and I am one of those people. It’s my business, and I take care of it.

Each year, I have friends and associates — yes, even in the media business — who lose their data. They learn the backup lesson the hard way. Saying “I told you so” doesn’t seem to help. It’s only the experience of loss that does the trick.

For novice users, it has never been easier or cheaper to back up data. But one does have to think about it and acquire some basic knowledge. A video or audio editor who simply leaves material on a working drive is asking for trouble.
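To show how little is involved, here is a minimal sketch in Python of a mirror-and-verify copy to an external drive. It is an illustration only; the source and destination paths are hypothetical placeholders, not a recommendation of any particular layout.

```python
import hashlib
import shutil
from pathlib import Path

def sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def mirror(src: Path, dst: Path) -> None:
    """Copy every file under src into dst, then verify each copy by checksum."""
    for file in src.rglob("*"):
        if not file.is_file():
            continue
        target = dst / file.relative_to(src)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(file, target)           # copies data and timestamps
        if sha256(file) != sha256(target):   # never trust an unverified copy
            raise IOError(f"verification failed: {target}")

# Hypothetical paths: a working project folder and a mounted backup drive.
mirror(Path("/Volumes/Work/projects"), Path("/Volumes/Backup/projects"))
```

The checksum comparison is the point of the exercise: a copy you have not verified is not yet a backup.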

There are many high-quality professional solutions available from major vendors, including nearline storage and archival storage. This level of storage is designed for professional facilities and can get expensive. But it is mandatory for anyone who works with or stores media for clients.

For smaller users and individuals like me, there are equally reliable solutions priced far more attractively. One I like and use is Drobo, a system that is easy to set up and use.

Drobo

I have found Drobo to be extremely reliable and have gone through two generations of equipment. It just works. There are many models at a range of costs, but all have the same basic features. The key ones: Drobo is self-healing, self-managing and self-optimizing.

Drobo will let you know when you’re running low on capacity and need to install bigger drives. You simply insert the drives. Drobo will configure everything itself. No engineering or IT services required.

Every Drobo uses a technology called BeyondRAID. Built on the foundation of traditional RAID, BeyondRAID provides all the data protection of traditional RAID, but without any of the complexities or limitations. It delivers a simple, automated approach, which I really like.

In a basic four-drive Drobo — which costs only a few hundred dollars with drives — the system warns the user of drive failures and can expand capacity on the fly by adding new drives, or replacing smaller drives with larger ones.

It is a simple plug-and-play device with no downtime. Some Drobo models can combine hard drive and SSD media for storage, and an accelerator bay can add SSDs to existing hard drives.

CrashPlan backup onscreen application

For basic cloud storage, services like CrashPlan, Carbonite, Backblaze, Mozy, iDrive, SpiderOak, Nero BackitUp, OpenDrive and others work silently online to back up your home or business computer. They start at about $60 per year and range up to about $130 a year.

These services can be set to back up your computer at certain hours of the day. Users should consider how much data they want to back up before deciding which service to use. The cloud can be used to back up just video or audio files, or the entire computer. Remember, however, that in the event of data loss these services can take several days to restore everything.
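Before picking a plan, it helps to know how much data you would actually be uploading. A short Python sketch along these lines will total a set of folders; the folder paths are hypothetical examples, not part of any service’s software.

```python
import os

def folder_size_bytes(root: str) -> int:
    """Walk a directory tree and total the size of every regular file."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):          # skip broken symlinks
                total += os.path.getsize(path)
    return total

# Hypothetical folders a media user might consider sending to the cloud.
for folder in ("/Users/me/Movies", "/Users/me/Music", "/Users/me/Documents"):
    print(f"{folder}: {folder_size_bytes(folder) / 1e9:.1f} GB")
```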

Cloud storage like this is only one part of a multi-part backup plan. It is useful, but there should be other methods of backup as well. It is foolhardy for anyone making a living with computer files not to back up. Disaster lurks around the corner and could strike at any moment.

What I’ve talked about so far is short-term backup, the kind all novices should do for their own protection. But a far more complex issue is long-term archiving of media assets. This is not a job for amateurs, but it is something all computer users should be aware of and be prepared to do with extremely valuable content.

James Snyder, archiving expert at the Library of Congress

James Snyder, who serves as senior systems administrator for the Motion Picture, Broadcasting and Recorded Sound (MBRS) division of the Library of Congress, has also served as a manager of SMPTE’s Washington DC Section. He sees long-term backup not just in terms of years but in decades, especially where the Library of Congress’s vast amount of high-value content is concerned.

For the library, Snyder views long-term archiving as a minimum of 150 years. While normal requirements aren’t that long for most routine jobs, it’s good to know how an institution stores its data for the ages.

One of the first things the library did to prepare for long-term sustainability was the development of an “evergreen” file format specification for moving image content.

For such material, the library chose JPEG 2000 mathematically lossless essence coding in an MXF wrapper. Unlike DPX and IMF, which are made up of multiple files for a single asset, the evergreen files are single files for a single asset. Using single files makes long-term sustainability much easier, since an archive needs to track only one media file per asset.

The library, Snyder said, keeps the original IMF or DPX file for film scans, and can go back and forth between the original file types and their evergreen files to maintain compatibility with industry production workflows.

For a moving image asset, that is a JPEG 2000 lossless MXF OP1a file, and for an audio file, it is a Broadcast Wave RF64 file. (Though the library may move to MXF-wrapped audio-only files as differently coded and more complex types of audio, such as DSD [Direct Stream Digital], are created.)

The library makes certain that everything in the original file, or set of files in the case of an IMF, goes into the single evergreen file. That means all that is necessary is to maintain the ability to process and decode the JPEG 2000 file’s essence, the image essence plus metadata and audio, and to make sure the file survives through the decades under its migration plan.
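For anyone curious how such a check might look outside a national library, here is one hedged illustration in Python. It assumes the freely available ffprobe tool (part of FFmpeg) is installed, and the file name is hypothetical; the script simply reports which codecs an MXF container holds, so you can confirm the video essence really is JPEG 2000.

```python
import json
import subprocess

def essence_codecs(path: str) -> list:
    """Ask ffprobe (part of FFmpeg) which codecs a media container holds."""
    result = subprocess.run(
        ["ffprobe", "-v", "quiet", "-print_format", "json",
         "-show_streams", path],
        capture_output=True, text=True, check=True,
    )
    streams = json.loads(result.stdout)["streams"]
    return [s.get("codec_name", "?") for s in streams]

# Hypothetical archival file: expect JPEG 2000 video plus PCM audio.
print(essence_codecs("asset_0001.mxf"))   # e.g. ['jpeg2000', 'pcm_s24le']
```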

Snyder said that some materials in the archive are coded in older formats, but were not always strictly compliant with those formats.

“Sometimes, with older MPEG files, for instance, it can get harder to decode them, because the early encoders were not truly compliant with the MPEG spec,” he said. “That is why an evergreen file definition is so important — the originals might become unreadable over time. The evergreen file is the insurance policy that the moving image content won’t get lost as the decades go by and technologies and codecs change.”

The media archiving industry, Snyder said, identified the need for a file specification designed exclusively for the requirements of media archiving. A project, initiated through the Advanced Media Workflow Association (AMWA) at the request of the government, eventually produced an archive-specific MXF Application Specification called AS-07: MXF Archiving & Preservation.

One thing that applies to everyone, whether a large or small operation, is the need to continuously plan to migrate to new media every few years. Currently, the library migrates every three to seven years from one data tape format to another. However, it should be remembered that every data storage format migration increases the chance of loss.
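One widely used way to control that migration risk, sketched here in Python as an illustration rather than the library’s actual tooling, is to write a checksum manifest before the data leaves the old medium and verify it on the new one. The paths and file names are hypothetical.

```python
import hashlib
import json
from pathlib import Path

def sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def write_manifest(root: Path, manifest: Path) -> None:
    """Record a checksum for every file before the data leaves the old medium."""
    entries = {str(p.relative_to(root)): sha256(p)
               for p in sorted(root.rglob("*")) if p.is_file()}
    manifest.write_text(json.dumps(entries, indent=2))

def verify_manifest(root: Path, manifest: Path) -> list:
    """After migration, return any files whose checksums no longer match."""
    entries = json.loads(manifest.read_text())
    return [rel for rel, digest in entries.items()
            if sha256(root / rel) != digest]

# Hypothetical workflow: manifest written on the old medium, checked on the new.
write_manifest(Path("/mnt/tape_old"), Path("manifest.json"))
failures = verify_manifest(Path("/mnt/tape_new"), Path("manifest.json"))
print("migration clean" if not failures else f"re-copy: {failures}")
```

Only after the verification pass comes back clean should the old copy be considered retired.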

Figuring out what kind of approach would be best for any particular media operation depends on a host of factors, Snyder said. One of the reasons there is no certain, industry standard, one-size-fits-all answer is that the industry’s conversion to digital, though well underway, is still in its infancy.

Metadata standards, and the ability to share data across multiple media production, distribution and archiving platforms, are still evolving. No single approach has been in place long enough to know for sure that a given method is superior to the other options over many decades.

Formats, media players, encoders, decoders, and migration techniques come and go, and media companies have to carefully make bets on which approaches suit them best. However, institutions like the Library of Congress and others have learned a great deal about some basic factors archivists would be wise to take into account.

Data managers as well as ordinary people have to be very careful in evaluating new storage media technologies, because many have come and gone and are now obsolete, with hardly any way to read them, let alone lift data off them for migration purposes.

That is why, Snyder said, the real name of the game is data storage, management and migration.

“Data management is a truly brave, new world,” he said.
