Scalable Dynamic Software For Broadcasters: Part 6 - Containers And Storage

Monolithic software established a model that provides persistence through database storage while maintaining data integrity, but that same model restricts scalability. Microservices offer a way to maintain data integrity while also facilitating scalability.

Data Consistency

In an ideal world, we would have one application accessing one database that would meet all the needs of the broadcaster. But it doesn’t take long to realize that this is an unrealistic expectation. Even the simplest broadcast facility with one playout workflow would require a massive database holding the media asset library, playout lists, and advertiser metadata.

Other business units would soon demand access to the central database. Sales departments, for example, would want to know which Ads were played and when, resulting in two applications attempting to access the same data. Although the single database approach may work in theory, the reality is quite different, as the structure of the database limits how each application can be developed. If the playout service were upgraded and a change to a database record were required, then every application accessing that record would also need to be changed.

Changing one application that depends on the database is difficult; with two or more it becomes a real challenge, as multiple vendors need to collaborate and agree on roll-out schedules. Consequently, the playout and sales applications will often each create their own database. Two databases holding similar, if not identical, information is a recipe for disaster, as we suddenly have two versions of the truth. When they disagree we have data inconsistency, and this can result in serious errors within the broadcast facility.

One solution is to hold a single database and have the principal application provide access to the data through an API for all other applications. In essence this would work. However, it is difficult to scale, as the principal application would need to be installed on faster and faster servers to meet the changing demands of the service, and there is a limit to how fast a server can operate.

Distributed Processing

To overcome the limits of the monolithic solution just described, distributed processing was developed, allowing data to be processed in parallel. Although this scales better, the applications are still accessing the same database. The resulting contention for database access leads to latency and, potentially, lost or corrupted data.

By adopting faster servers with common APIs, all we’ve done is kick the proverbial bottleneck down the road, from the application servers to the database.

Microservices deliver the distributed processing to provide scalability but still have some challenges to overcome with accessing central databases. For example, if a group of microservices were operating in California, but the database resided in Paris, there would be clear latency issues due to the physical distance between them. Although it’s somewhat romantic to think of the public cloud as a massive blob of limitless scalable resource, it really does physically exist, and one eye must always be kept on this inconvenient fact.
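To put a rough number on that distance, the sketch below estimates the minimum round-trip time imposed by physics alone. The coordinates for San Francisco and Paris, the mean Earth radius, and the signal speed in optical fibre (taken here as about two-thirds of the speed of light) are all assumptions chosen for illustration; real network paths are longer and add switching and queuing delays on top.

```python
import math

# Assumed figures, for illustration only.
EARTH_RADIUS_KM = 6371.0           # mean Earth radius
FIBRE_SPEED_KM_PER_S = 200_000.0   # roughly two-thirds of c in a vacuum


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def min_round_trip_ms(distance_km):
    """Lower bound on round-trip time over a straight fibre run."""
    return 2 * distance_km / FIBRE_SPEED_KM_PER_S * 1000


# Approximate city-centre coordinates (latitude, longitude).
SAN_FRANCISCO = (37.77, -122.42)
PARIS = (48.86, 2.35)

distance_km = great_circle_km(*SAN_FRANCISCO, *PARIS)
rtt_ms = min_round_trip_ms(distance_km)
print(f"{distance_km:.0f} km, at least {rtt_ms:.0f} ms per round trip")
```

Every database query pays this floor of several tens of milliseconds at least once, so a workflow step that issues many sequential queries accumulates the delay quickly.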

Another solution, and probably the most reliable and efficient, is for each microservice to have its own database. However, more fundamentally we should be asking why a microservice needs access to a database at all. In isolation, it’s feasible to operate a microservice without one. The information needed to perform a task may not need to be persistent because the microservice is stateless, in which case a database is an unnecessary overhead. However, in real-world applications the microservice will be working as part of a complex workflow where multiple microservices providing different functions need to collaborate.
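As a small illustration of a stateless operation, the function below converts a timecode into a frame count. Everything it needs arrives in the request and nothing is remembered afterwards, so no database is involved. The function name and the choice of non-drop-frame timecode at an integer frame rate are assumptions made for this sketch.

```python
def timecode_to_frames(timecode: str, fps: int) -> int:
    """Convert non-drop-frame HH:MM:SS:FF timecode to a frame count.

    The result depends only on the arguments, so any instance of the
    service can answer any request without shared state.
    """
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    if frames >= fps:
        raise ValueError(f"frame field {frames} is not valid at {fps} fps")
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


print(timecode_to_frames("01:00:00:00", 25))  # one hour at 25 fps
```

Because the function holds no state, scaling it is simply a matter of running more copies.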

Workflows With Persistence

It is this collaboration that determines the need for data persistence and, therefore, the use of database applications. The database exists independently of the microservices but holds important information which needs to be maintained. For example, during ingest, a media file will need to be moved to a storage device and then information such as the video frame rate, color space, frame size, and audio sample rate will need to be recorded. The transcoding microservice would then convert the file into the house mezzanine format, creating another media file along with its own metadata, which will differ from that of the original.

In the case of a broadcaster, the original file will probably be kept along with its associated metadata, so that there is a reference should anything go wrong with the mezzanine format, or if another format needs to be derived. Transcoding from the original will always provide the highest quality. Each process in the workflow adds more metadata. For example, if the media file is a movie and needs to be edited to create Ad break junctions, then a timecode list will be required to allow the playout system to insert the Ads, thus adding another level of metadata.

Any process that makes use of a microservice has the potential to create more metadata for the media file and this must be stored for as long as that media file exists.
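The ingest, transcode, and Ad break steps described above can be sketched as follows. This is a minimal illustration, not a real media asset management schema: the `MediaRecord` fields, the function names, the file paths, and the mezzanine parameters are all hypothetical, and a plain dictionary stands in for the database.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class MediaRecord:
    """Technical metadata for one media file (field names are illustrative)."""
    path: str
    frame_rate: float
    color_space: str
    frame_size: Tuple[int, int]
    audio_sample_rate: int
    derived_from: Optional[str] = None           # path of the source file, if any
    ad_breaks: List[str] = field(default_factory=list)


Catalog = Dict[str, MediaRecord]  # stand-in for a persistent database


def ingest(catalog: Catalog, record: MediaRecord) -> None:
    """Store the metadata captured when the file arrives."""
    catalog[record.path] = record


def transcode_to_mezzanine(catalog: Catalog, source: str, target: str) -> MediaRecord:
    """Register a new file with its own metadata; the original is kept."""
    original = catalog[source]
    mezzanine = MediaRecord(
        path=target,
        frame_rate=original.frame_rate,
        color_space="BT.709",          # assumed house format
        frame_size=(1920, 1080),       # assumed house format
        audio_sample_rate=48000,       # assumed house format
        derived_from=source,
    )
    catalog[target] = mezzanine
    return mezzanine


def mark_ad_breaks(catalog: Catalog, path: str, timecodes: List[str]) -> None:
    """A later workflow step adds another layer of metadata."""
    catalog[path].ad_breaks.extend(timecodes)


catalog: Catalog = {}
ingest(catalog, MediaRecord("/ingest/movie.mxf", 25.0, "BT.2020", (3840, 2160), 96000))
mezz = transcode_to_mezzanine(catalog, "/ingest/movie.mxf", "/house/movie.mxf")
mark_ad_breaks(catalog, mezz.path, ["00:21:30:00", "00:47:10:12"])
```

After the three steps the catalog holds two records: the untouched original and the mezzanine file, which carries a reference back to its source plus the Ad break list.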

Figure 1 – Access to the microservices’ local database through the API makes systems much more reliable as upgrades can be performed to individual microservices without impacting the rest of the system.

Polyglot Persistence

There’s another consideration: not all database technologies are the same. The database might be hierarchical, NoSQL, or relational, to name but a few. Each suits different applications. A relational database may be needed for the orchestration layer to keep track of how workflows and their associated media assets are progressing, while a NoSQL database may be chosen elsewhere to maintain simplicity and improve execution speeds.

Because microservices can have their own associated database, the developers can choose the technology that best fits the application. Furthermore, they can be distributed so they physically reside near the servers hosting the associated microservice thus keeping latency low while maintaining the highest data integrity possible.

Polyglot programming is a term that expresses the idea that not all applications benefit from the same language. The workflow aspect of a web server application may be written in Node.js to deliver a product to market quickly, but a high-speed video or audio processing function may be written in C or C++ to build the fastest application possible. The same principle applies to databases: by using a polyglot approach, developers can choose the database technology that best suits their application.
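A minimal sketch of polyglot persistence might look like the following, where two services each own a different kind of store. The class names, table layout, and document fields are hypothetical. SQLite (from the Python standard library) stands in for a relational database, and a dictionary of JSON documents stands in for a NoSQL store.

```python
import json
import sqlite3


class WorkflowTracker:
    """Orchestration service: relational store for job progress."""

    def __init__(self):
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE jobs (asset_id TEXT, step TEXT, status TEXT, "
            "PRIMARY KEY (asset_id, step))"
        )

    def set_status(self, asset_id, step, status):
        self._db.execute(
            "INSERT OR REPLACE INTO jobs VALUES (?, ?, ?)",
            (asset_id, step, status),
        )
        self._db.commit()

    def pending_steps(self, asset_id):
        rows = self._db.execute(
            "SELECT step FROM jobs WHERE asset_id = ? AND status != 'done' "
            "ORDER BY step",
            (asset_id,),
        )
        return [row[0] for row in rows]


class MetadataStore:
    """Processing service: schemaless documents keyed by asset ID."""

    def __init__(self):
        self._docs = {}

    def put(self, asset_id, document):
        self._docs[asset_id] = json.dumps(document)

    def get(self, asset_id):
        return json.loads(self._docs[asset_id])


tracker = WorkflowTracker()
tracker.set_status("A1", "ingest", "done")
tracker.set_status("A1", "transcode", "running")
tracker.set_status("A1", "qc", "queued")

metadata = MetadataStore()
metadata.put("A1", {"frame_rate": 25, "color_space": "BT.709"})
```

Neither service knows or cares how the other stores its data, which is what allows each team to pick the technology that fits.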

Accessing Data

Microservices benefit greatly from APIs as they provide a unified method of communicating with the service while maintaining backwards compatibility. The same is true for database access.

Applications that need access to the microservice’s data will achieve the greatest reliability by using the API. This way, the underlying database structure can be changed and updated as required without any impact on the rest of the system. New APIs can be generated to provide extra data as it becomes available through the natural development cycle, but the existing APIs will be maintained without change. This way, any dependent services or functions will continue to work reliably even if a major change to the database or microservice has taken place.
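The idea can be sketched as a service whose internal storage changes while its original API keeps returning exactly the same response. The `AssetService` class, its method names, and the two storage layouts are assumptions made up for this illustration.

```python
class AssetService:
    """Owns its data; callers only ever see the API responses."""

    def __init__(self):
        # Layout 1: duration held in whole seconds.
        self._layout = 1
        self._rows = {"A1": {"title": "Feature film", "duration_s": 5400}}

    def migrate_to_frames(self, fps=25):
        """Internal change: store duration as a frame count plus frame rate."""
        for row in self._rows.values():
            row["duration_frames"] = row.pop("duration_s") * fps
            row["fps"] = fps
        self._layout = 2

    def _duration_seconds(self, row):
        if self._layout == 1:
            return row["duration_s"]
        return row["duration_frames"] // row["fps"]

    def get_asset_v1(self, asset_id):
        """Original API: the response shape never changes."""
        row = self._rows[asset_id]
        return {
            "id": asset_id,
            "title": row["title"],
            "duration_s": self._duration_seconds(row),
        }

    def get_asset_v2(self, asset_id):
        """Newer API: adds frame-accurate data once it is available."""
        response = self.get_asset_v1(asset_id)
        row = self._rows[asset_id]
        if self._layout >= 2:
            response["duration_frames"] = row["duration_frames"]
            response["fps"] = row["fps"]
        return response


service = AssetService()
before = service.get_asset_v1("A1")
service.migrate_to_frames()
after = service.get_asset_v1("A1")
print(before == after)
```

A client written against the first API sees no difference after the migration, while newer clients can call the second API for the extra fields.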

Using this method of API data access allows any process within the ecosystem to reliably and efficiently read data from each microservice. Orchestration and monitoring systems can take full advantage of this so that they can mine data from all over the infrastructure, no matter where it resides in the world.
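A monitoring system built on this principle could be as simple as the sketch below, which asks each service for its status through a callable and records failures instead of stopping. The service names and status fields are invented for the example, and plain functions stand in for what would be network API calls in a real deployment.

```python
from typing import Callable, Dict


def collect_status(endpoints: Dict[str, Callable[[], dict]]) -> Dict[str, dict]:
    """Query every service API and build one combined report."""
    report = {}
    for name, fetch in endpoints.items():
        try:
            report[name] = {"ok": True, **fetch()}
        except Exception as exc:  # one failing service must not hide the others
            report[name] = {"ok": False, "error": str(exc)}
    return report


# Stand-ins for real API calls (hypothetical services and fields).
def transcoder_status():
    return {"region": "us-west", "jobs_running": 3}


def playout_status():
    return {"region": "eu-west", "channels_on_air": 2}


def archive_status():
    raise TimeoutError("no response")


report = collect_status({
    "transcoder": transcoder_status,
    "playout": playout_status,
    "archive": archive_status,
})
for name, status in report.items():
    print(name, status)
```

The monitor never touches a database directly, so each service remains free to change how it stores its data.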

Microservices hold a wealth of application-specific information and their associated databases can make this available to any function or service to log and manage the infrastructure effectively.
