Storing & Optimizing M&E Content in a Big Data-Fast Data World
New NVMe and object storage platforms can support the needs of big data and fast data media applications.
The storage requirements for digital media never stop growing. Image size and resolution increase. Metadata is added to every frame. Finally, the data must live forever. All these factors and others combine to place serious demands on a production or broadcast facility’s storage infrastructure. Fortunately, there is a solution.
Film studios and post-production houses are constantly challenged when technology evolves. Video resolutions have evolved to 4K, and 8K is rapidly emerging. Frame rates are also increasing, transitioning from 30fps (2K resolutions) to 60fps (4K resolutions). As camera capture technology also evolves, additional storage capacity and bandwidth will be required to support these higher resolutions and frame rates. If a studio changes a camera resolution from 2K (and 30fps) to 4K (and 60fps), approximately eight times as much data will need to be stored and streamed, since each frame holds four times the pixels and twice as many frames arrive per second. Looking at it another way, 4K digital cameras can produce about 2.63TB of data per hour, and emerging 8K video can generate 100TB per hour. Ingesting petabyte-scale content requires a new approach to storage tiering that protects valuable footage and enables global collaboration and data analysis.
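The eight-times figure can be checked with a short calculation. The following is a minimal sketch that assumes uncompressed frames, DCI frame sizes (2048x1080 and 4096x2160) and 30 bits per pixel; none of these values come from the article, and the ratio does not depend on the bit depth chosen.

```python
def raw_data_rate_bytes_per_sec(width, height, fps, bits_per_pixel):
    """Uncompressed video data rate in bytes per second."""
    return width * height * bits_per_pixel * fps / 8


# Assumed frame sizes and bit depth (illustrative, not from the article).
rate_2k = raw_data_rate_bytes_per_sec(2048, 1080, 30, 30)
rate_4k = raw_data_rate_bytes_per_sec(4096, 2160, 60, 30)

# Four times the pixels per frame at twice the frame rate.
ratio = rate_4k / rate_2k
print(ratio)  # 8.0
```

Real camera formats apply compression and carry audio and metadata, so absolute hourly volumes will differ from this raw estimate.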
As digital film content continually grows in numbers and size, data storage capabilities must expand at similar rates and take into account the ever-changing content activity in the production workflow, as well as the need to protect data so that it lasts forever. The way studios approach their storage strategies is becoming increasingly important, and once implemented, these strategies can be optimized by big data and fast data to deliver added value, intelligence and unexpected outcomes. It is no longer about just saving film or digital assets, but more about understanding what has been saved, how to access it, and how to extract value from it. To reach that data storage nirvana in film production, a tiered strategy is required that best categorizes the content and the storage medium on which the content is stored.
The Traditional M&E Storage Workflow
In the classic tiered enterprise storage pyramid model, data that is accessed continuously is the most important and considered hot, or tier 0 data. It requires very fast, high-performance media to store and retrieve it. Enterprise flash SSDs and all-flash arrays (AFAs) are commonly used as tier 0 storage. When tier 0 content is accessed less frequently, it moves to warm status, or tier 1 data, and is typically housed on more cost-efficient, slower-performing media such as enterprise hard drives. When the content is only accessed once in a while, it becomes cold, or tier 2 data, and is typically stored on even slower and lower-priced media such as commercial hard drives, until it becomes really cold, or tier 3 data, and is moved to very inexpensive media, such as tape, for archival.
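The pyramid described above amounts to a policy that maps access frequency to a tier. Here is a minimal sketch of such a policy; the `Asset` class, the 30-day window and the numeric thresholds are hypothetical choices made for illustration, not values from any particular storage product.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    reads_last_30_days: int  # hypothetical access counter


# Media commonly associated with each tier in the pyramid model.
TIER_MEDIA = {
    0: "enterprise flash SSD / all-flash array",
    1: "enterprise hard drive",
    2: "commercial hard drive",
    3: "tape",
}


def assign_tier(asset):
    """Map access frequency to a tier; the thresholds are assumptions."""
    if asset.reads_last_30_days >= 100:
        return 0  # hot: accessed continuously
    if asset.reads_last_30_days >= 10:
        return 1  # warm: accessed less frequently
    if asset.reads_last_30_days >= 1:
        return 2  # cold: accessed once in a while
    return 3      # really cold: archive


footage = Asset("raw_camera_reel", reads_last_30_days=450)
archive = Asset("finished_feature", reads_last_30_days=0)
print(TIER_MEDIA[assign_tier(footage)])
print(TIER_MEDIA[assign_tier(archive)])
```

In practice a facility would tune the thresholds to its own workflow and would likely weigh factors beyond read counts, such as project status.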
When raw footage is captured by a camera, that content is as hot as it gets (Figure 1). It is moved from embedded storage (SD and microSD cards) within the camera to enterprise SSDs or HDDs as tier 0 storage. From the storage device, the footage goes to a digital imaging technician (DIT) cart that immediately makes two to three copies of the footage for post-production use associated with editing, transcoding, controlling image quality, performing on-set color correction, adding audio and special effects like 3D, collaborating in real-time, and troubleshooting.
Once the post-production work is complete, the production footage is moved to tier 1 storage and considered warm, as further revisions can still be made to the content. To protect these post-production revisions, daily backups to tape or capacity disk are performed, and these backups are considered tier 2 storage. As the content gets colder and no longer needs to be modified, only read from time to time, it eventually moves to tier 3 storage. Tape has been widely used for tier 3 storage.
In the traditional M&E storage process, there are many challenges. The main problem is the speed that is required to ingest petabyte-scale content to the DIT cart or workstation for post-production. Flash-based SSDs or HDDs used in this tier 0 configuration are RAID-based, do not scale well and are not optimal for incredibly large workflows. If the storage media cannot properly manage the influx of content, dropped frames can occur that impact the technical quality of the film and cause on-screen distractions and a flawed viewing experience.
Additionally, if a hardware or data integrity issue occurs, it could take weeks to rebuild the content, negatively affecting workflow productivity and production schedules. The risk of content loss skyrockets if a subsequent issue occurs. When tape is used to archive data, content may become unreadable or difficult to access. As such, ingesting petabyte-scale content requires a new approach to storage tiering.
The New M&E Storage Tiering Model
There are two key adjustments that can be made to improve on the traditional tiered M&E storage model (Figure 2). First, deploying high-performance SSDs based on the NVMe (Non-Volatile Memory Express) standard has become a popular choice for ingesting film content for post-production use. As NVMe was designed specifically for flash media, it delivers significant improvements in latency and throughput when compared to hard drives or legacy SSDs based on SCSI commands. It features a streamlined memory interface, command set and queue design that bypasses the storage device stack to deliver significantly faster performance than traditional interfaces, and enables petabyte-scale film assets to be stored on fewer devices and in smaller physical hardware footprints.
A second adjustment to the tiered storage structure can be made when the post-production work completes and data is moved accordingly. Instead of placing traditional file-based content in a NAS or SAN environment, migrating it to a production workstation for editing, and then archiving it, the new paradigm is to simply place all of this content in an object storage system, which bypasses the need for other storage media and becomes the pivot point for everything.
Object storage is an architecture that stores unstructured data as objects, whether a document, film, video, audio, image, photo, etc., and includes metadata that provides descriptive information about the object and the data itself. Since object data and metadata can be placed in a flat address space, the need for a hierarchical file structure is eliminated, simplifying data access. Since metadata is defined by users, data analytics and other discovery techniques can be enabled, and the film assets can also be aggregated to deliver very efficient capacity scaling.
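To make the flat address space and user-defined metadata concrete, here is a minimal in-memory sketch. The `ObjectStore` class and its `put`, `get` and `find` methods are hypothetical names invented for this illustration and do not correspond to any vendor's API; deriving the object ID from a hash of the content is likewise an assumption.

```python
import hashlib


class ObjectStore:
    """Toy object store: a flat ID-to-object map, no directory hierarchy."""

    def __init__(self):
        self._objects = {}  # object ID -> (data, metadata)

    def put(self, data, metadata):
        # Assumption: the ID is derived from the content itself.
        object_id = hashlib.sha256(data).hexdigest()
        self._objects[object_id] = (data, dict(metadata))
        return object_id

    def get(self, object_id):
        return self._objects[object_id]

    def find(self, **criteria):
        """Return IDs of objects whose metadata matches every criterion."""
        return [
            object_id
            for object_id, (_, meta) in self._objects.items()
            if all(meta.get(key) == value for key, value in criteria.items())
        ]


store = ObjectStore()
clip_id = store.put(b"frame-bytes-a", {"scene": 12, "resolution": "4K"})
other_id = store.put(b"frame-bytes-b", {"scene": 7, "resolution": "4K"})

print(store.find(scene=12) == [clip_id])  # metadata lookup, no file path
print(len(store.find(resolution="4K")))
```

Because every lookup goes through either the object ID or the metadata, nothing in the sketch depends on where an object sits in a folder tree, which is the property the paragraph above describes.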
Global Collaboration Requires Fast Data
Collaborating with team members and partners is also considered a workflow in the film production process requiring seamless and immediate access to all content at a global scale. File-based and block-based storage systems of the past have created silos of data as storage capacity grew, making it difficult to ensure a consistent global view to all data, and fast access to it. Leveraging object storage (on-premises or in the cloud) can help to eliminate these challenges and enable fast data to deliver the speed and performance to meet the global collaboration objective.
Data Analysis using Big Data
Many studios and post-production houses have captured a ton of film assets over time but have not taken advantage of the buried treasures that may lie beneath. There lies an abundance of data that can be mined to hopefully find some golden nuggets of importance or analyzed as part of a big data application to extract further value, intelligence, predictions, associations or some desired outcome. As such, the content must be stored without losing any of it.
Object storage is where the film industry is headed, as these systems replicate data across three locations (similar to the triple mirroring model) but only require storing about one-third of the object data in each location. These systems can also detect and self-heal sector-level bit errors in the background using erasure coding and data scrubbing technologies to achieve up to 19 nines of data durability. The result is data that has high integrity to deliver very precise analytic outcomes.
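The detect-and-self-heal idea can be illustrated with a deliberately simple scheme: two data shards plus one XOR parity shard, with a SHA-256 checksum per shard for scrubbing. This is a sketch under stated assumptions, not how any production object store is implemented; real systems use wider erasure codes that tolerate more failures, and the shard sizes and overhead here are properties of this toy code only. The function names are hypothetical.

```python
import hashlib


def split_with_parity(data):
    """Split data into two halves plus one XOR parity shard (a 2+1 code)."""
    if len(data) % 2:
        data += b"\x00"  # pad to even length; a real system records the true length
    half = len(data) // 2
    first, second = data[:half], data[half:]
    parity = bytes(x ^ y for x, y in zip(first, second))
    return [first, second, parity]


def checksum(shard):
    return hashlib.sha256(shard).hexdigest()


def scrub_and_heal(shards, checksums):
    """Find a shard whose checksum no longer matches and rebuild it."""
    bad = [i for i, shard in enumerate(shards) if checksum(shard) != checksums[i]]
    if len(bad) > 1:
        raise ValueError("a single-parity code cannot repair more than one shard")
    if bad:
        index = bad[0]
        others = [shards[j] for j in range(3) if j != index]
        # Any one shard equals the XOR of the other two.
        shards[index] = bytes(x ^ y for x, y in zip(*others))
    return shards


original = b"raw camera footage"  # 18 bytes, so no padding is needed
shards = split_with_parity(original)
sums = [checksum(shard) for shard in shards]

shards[1] = b"\x00" * len(shards[1])  # simulate corruption at one location
healed = scrub_and_heal(shards, sums)
restored = healed[0] + healed[1]
print(restored == original)  # True
```

Each of the three shards could live at a different location, so losing or corrupting any single one still leaves enough information to rebuild the object.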
Final Thoughts
Studios and post-production houses are increasingly collecting high-resolution content at every stage of film production. The workflow is growing exponentially and getting more complex and compute-intensive, requiring large-scale storage and processing, as well as efficient infrastructure solutions that maintain comprehensive digital libraries and support a global organization. The answer is in NVMe-based storage media to ingest film content for post-production use, and object storage to do everything else, from migrating the film asset to a production workstation for editing, to archiving it, to performing global collaboration and data analysis. Ingesting and processing this much film content requires a different approach to data storage and a new tiering structure.
Erik Weaver is the Director of Product Marketing for media and entertainment solutions within the Data Center Systems group of Western Digital Corporation.