Closing In On Methods For Long Term Archiving

As the amount of data in the world keeps growing exponentially, a Holy Grail in research is finding a way to reliably preserve that data for the ages. Researchers are now closing in on methods to make data permanent. The problem is that there is no way to be absolutely sure those methods will work far into the future.

By 2023, Microsoft predicts that over 100 zettabytes of data — including movies, television programming and audio — will be stored in the cloud. That staggering amount of data requires a fundamental re-thinking of how large-scale storage systems operate.

In 2016, Microsoft began a partnership with the University of Southampton Optoelectronics Research Centre in the UK to tackle the archiving issue. The effort, called Project Silica, is designed to store cold data, meaning data that is infrequently accessed and doesn’t need to sit on a server for instant use.

Through the project, Microsoft is testing glass as a long term storage medium. Recently, it did an experiment with Warner Brothers to store a copy of the 1978 film, Superman, on a glass disc that is 7.5 cm x 7.5 cm x 2 mm.

Microsoft Project Silica senior optical scientist Patrick Anderson loads the system to write data to glass. Photo by Jonathan Banks

The glass contains 75.6 GB of data plus error redundancy codes. It is said to be the first test of the new archiving technology for long term storage of films and television programs.

Theoretically, the glass storage could last thousands of years. If it works, a studio like Warner Brothers, which houses some 20 million film assets in temperature-controlled warehouses, would have an extra level of protection.

Glass has long been used to preserve audio programming, going back to the radio drama days. In World War II, metal record platters were banned due to metal shortages and glass was substituted for recording. Though glass lasts a long time, it is also delicate. Everyone who has worked with glass discs has opened boxes to find the platters shattered.

However, Microsoft’s methods are different. Project Silica uses lasers similar to those used for LASIK eye surgeries to burn small geometrical shapes, also known as voxels, into the glass. Multiple bits are encoded in each voxel, and the data is written in multiple layers. For the Warner Brothers experiment, 74 layers were used for the Superman film.
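To put the layer count in perspective, the figures quoted in this article can be turned into a rough per-layer average. The following is only a back-of-the-envelope sketch: it assumes the 75.6 GB is spread evenly across all 74 layers, that the full 7.5 cm x 7.5 cm face is writable, and that 1 GB equals 1000 MB. None of those assumptions comes from Microsoft, and the function names are made up for illustration.

```python
# Figures quoted in the article for the Superman test disc.
TOTAL_GB = 75.6      # data stored, not counting error redundancy codes
LAYERS = 74          # voxel layers written into the glass
SIDE_CM = 7.5        # the disc face is 7.5 cm x 7.5 cm

# Assumptions (not from the article): 1 GB = 1000 MB, data spread evenly
# over every layer, and the whole face of the disc is writable.
MB_PER_GB = 1000


def per_layer_gb(total_gb: float, layers: int) -> float:
    """Average amount of data held by a single layer, in GB."""
    return total_gb / layers


def areal_density_mb_per_cm2(total_gb: float, layers: int, side_cm: float) -> float:
    """Average data per square centimeter of one layer, in MB."""
    face_area_cm2 = side_cm * side_cm
    return per_layer_gb(total_gb, layers) * MB_PER_GB / face_area_cm2


if __name__ == "__main__":
    print(f"{per_layer_gb(TOTAL_GB, LAYERS):.2f} GB per layer")
    print(f"{areal_density_mb_per_cm2(TOTAL_GB, LAYERS, SIDE_CM):.1f} MB per cm^2 per layer")
```

Under those assumptions, each layer averages roughly 1 GB, or about 18 MB per square centimeter. The real layout is likely less uniform, so treat the numbers as an order-of-magnitude guide only.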

Once the data for the program is embedded into the glass, the content is accessed by shining a light through the disc and capturing the data with microscope-like readers. The Warner Brothers film was checked bit by bit, and it was flawless.

Microsoft senior optical scientist James Clegg reads data with a specialized microscope. Photo by Jonathan Banks

So what about the easy breakage of glass? Microsoft said it ran extensive tests to make sure that Project Silica storage media is not easily damaged. The glass was baked in hot ovens, submerged in boiling water, microwaved and scratched with steel wool. But all glass still breaks. Apple’s iPhone screens are supposed to be the toughest glass in the world, and the screens still easily break when dropped. Only time will tell if the Project Silica glass is tough enough.

Also, there is a question of whether or not the readers for such discs will still be manufactured a thousand years into the future. Technology changes and companies go out of business. It is anybody’s guess how this will play out.

Microsoft’s own cloud, called Azure, already has a major interest in safekeeping vast amounts of both hot and cold data. Azure still uses tape, which has to be checked frequently and re-copied to maintain data integrity. Glass could one day be a more secure solution to safekeep data for the company and its customers.
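The periodic checking that tape requires is commonly done with a fixity check: a checksum is recorded when an asset is archived, then recomputed later and compared. The sketch below illustrates that general idea with SHA-256 from Python's standard library. It is not a description of how Azure verifies its tape, and the names `sha256_of_stream`, `find_damaged`, `manifest` and `open_asset` are hypothetical.

```python
import hashlib

CHUNK_SIZE = 1024 * 1024  # read 1 MiB at a time so large assets fit in memory


def sha256_of_stream(stream, chunk_size=CHUNK_SIZE):
    """Return the SHA-256 hex digest of a binary stream, read in chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def find_damaged(manifest, open_asset):
    """Return the names of assets whose current digest differs from the manifest.

    manifest:   dict mapping asset name -> digest recorded at archive time
    open_asset: callable that takes an asset name and returns a binary stream
    """
    damaged = []
    for name, expected in manifest.items():
        with open_asset(name) as stream:
            if sha256_of_stream(stream) != expected:
                damaged.append(name)
    return damaged
```

For files on disk, `open_asset` could be as simple as `lambda name: open(name, "rb")`. Any asset the check reports would then be restored from a second copy, which is the re-copying step the tape workflow depends on.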

Much work remains to be done on Project Silica. Read and write operations need to be unified into a single device, and the amount of data stored on one piece of glass needs to increase. But the company is betting that the future of long term archiving is in glass.

Microsoft also has a parallel project using DNA molecules for archival storage. The beauty of DNA is that it can store an exabyte per cubic millimeter and last more than 500 years. But how will it be read far into the future?

Others are also researching long term archiving. Group 47 was formed in 2008 to secure the patents, designs and manufacturing processes for DOTS, a technology developed by the Eastman Kodak Company.

DOTS (Digital Optical Technology System) is a 100-year archival technology that is non-magnetic, chemically inert and immune to electromagnetic fields, including electromagnetic pulse (EMP). The storage media can be kept in normal office environments or in extremes ranging from 15 to 150 degrees F.

DOTS data is stored on a phase-change medium composed of a metallic alloy sputtered on an archival polyester base. To tackle reader availability in the future, DOTS is a true visual “eye-readable” method of storing digital files. With sufficient magnification, anyone can actually see the digital information.

A “Rosetta Leader” specification calls for microfiche-scale, human-readable text at the beginning of each tape that explains how the data is encoded and how to construct a reader. Because the information is visible, any high-magnification camera can read it.

Long term archival systems are incredibly complex because computer operating systems, hardware, software and technology as a whole are constantly changing. What works today may not work tomorrow, much less 1,000 years from now.

And perhaps most problematic of all, how does anyone living in today’s world know how long anything will last? It’s a major problem with no easy solutions. 
