The Meaning Of Metadata

Metadata is increasingly used to automate media management, from creation and acquisition to ever more granular delivery channels and everything in between. There's nothing much new about metadata (it predated digital media by decades), but it is poised to become pivotal in broadcast technology's current phase of rapid evolution.

From the earliest days of film and professional audio, metadata has been part of creative people's workflows, even if it wasn't talked about as such. The moment you put a strip of film in a "bin," write a label on a reel of tape, or even put it on a particular shelf, you're creating metadata. Many of our current working methods originated in the analog world, and we needed the administrative techniques to be efficient even back then.

In those pre-digital, pre-database days, we didn't call it metadata at all; we just called it "organization." As computers inexorably became part of our working lives, we clung to pre-digital concepts like files, filing cabinets, folders, and even desktops. We did this with good reason; these ideas were effective.

Eventually, we would find good reasons to change them, but only when these old-world concepts started to constrain digital organization and data manipulation.

At the risk of meandering through a philosophical wormhole, it's worth looking more closely at what we mean by "metadata" and what it is exactly. This is not to suggest that our current understanding of metadata is flawed in some way but to see how far we can expand the scope of metadata and perhaps use it in new and fruitful ways.

"Meta" means, essentially, "beyond", as in metaphysics and the (very topical) metaverse. "Metaphysics" describes things beyond the boundaries of physics (religion, aesthetics, etc.), and the "Metaverse" is what the internet is likely to evolve into. You find similar uses in words like "Metamorphosis" and "Metabolic", which ultimately refer to changes beyond the initial state of something. (So the idea of "beyond" refers more to a temporal and physical relationship than to one simply of abstraction.)

When we talk about digital media—and all the activities surrounding its production and distribution—we tend to use the notion that metadata is "data about data." You can't really argue with that, except that it doesn't tell us very much. What we do know is that metadata can be a transformative element in modern media workflows, allowing for smarter, more adaptable, and more robust workflows.

Here's a simple example to illustrate that.

In media asset management, the idea is essentially to take assets (video, still photography, documents, audio, etc.), store them somewhere, and eventually retrieve them. As MAM systems grow, they often incorporate facilities like transcoding and other processes such as adding subtitles, potentially in several languages.

A chunk of media with no associated metadata can only be treated as a closed box. With no idea what's in it, all the system can do is deliver it. The more metadata you have, the more you can do with the payload inside the box. In this context, metadata lets you see what's in the box without opening it. One distinction might be, "Is this a proxy file, or is it a full-resolution file?" If the box is labelled "proxy", then the system will know to treat it as such (for example, not sending it down a path that would lead to it being transmitted to end users).
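As a rough illustration of the closed-box idea, the sketch below decides what a system may do with an asset based only on its label. The `kind` key, its values, and the action names are all hypothetical, chosen for this example rather than taken from any real MAM product.

```python
def allowed_actions(metadata):
    """Return the actions a system may take, judged only from the label.

    The "kind" key and the action names are illustrative assumptions.
    """
    kind = metadata.get("kind")
    if kind == "proxy":
        # A proxy must never reach a path that ends in transmission.
        return ["preview", "offline_edit"]
    if kind == "full_resolution":
        return ["preview", "transcode", "transmit"]
    # No usable label: a closed box, so all we can do is deliver it.
    return ["deliver"]


print(allowed_actions({"kind": "proxy"}))
print(allowed_actions({}))
```

The point of the design is that the decision is made without opening the payload: the label alone keeps the proxy away from the transmission path.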

The more metadata, the smarter the system gets. Ultimately, you could design a system where the content contains sufficient information to "find its own way" to its intended destination, complete with processing like transcoding during its journey.

Metadata is typically associated with moving media, even if that media is in storage, because, somehow, it got there, and somehow, it needs to get from there to whoever needs it. Metadata is information, so how do you move information? Does that information exist in one place, or is it, like an idea, independent of location and, essentially, everywhere at once?

How do you move the number 7 from London to New York? You can't. That's a category mistake. You can't weigh it or measure its dimensions. It simply exists everywhere at the same time. It's a concept. You can't put the idea of "Red" in a box and ship it.

Think about a history book about an ancient battle. The book doesn't "contain" the battle because it took place in the past; it happened somewhere other than where the book is. The book isn't the battle; it is about the battle—there are no real soldiers fighting between the pages.

But if the book is the only physical record of the battle, you need to care for it when you move it.

So, metadata exists in a kind of duality. On the one hand, it is a set of concepts that don't exist in any particular place; on the other, it can be as rare and fragile as the Crown Jewels, depending on its designated value and the precariousness of its storage medium.

In practical terms, you can often break down and simplify metadata by structuring the process that you're describing. In this sense, structure is also metadata, but it's built into the entity being described. You don't need to write the word "tree" on every tree because you can see it's a tree.

You might never have considered the distinction between "Denotation" and "Connotation". These are relevant here. The word "tree" denotes the thing (i.e. a component of a forest) that we identify when deciding where to stick our label that says "Tree". The connotation is everything the word brings with it: "Branches, leaves, bark, roots", etc. If you understand the connotation of a label, you only need to denote a package: you don't have to fully describe it because that information will be in a database or look-up table somewhere.
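The look-up table idea can be sketched in a few lines. Here the package carries only a short label (the denotation), and the fuller description (the connotation) lives in a table. The table contents and the `describe` helper are invented for illustration.

```python
# Hypothetical look-up table: the label denotes, the table holds the detail.
PROFILES = {
    "proxy": {"resolution": "low", "transmit_to_end_users": False},
    "master": {"resolution": "full", "transmit_to_end_users": True},
}


def describe(label):
    """Expand a short label into its full description.

    Raises KeyError for a label nobody has defined, because a label
    without an agreed meaning tells the system nothing.
    """
    if label not in PROFILES:
        raise KeyError(f"no profile registered for label {label!r}")
    return PROFILES[label]


print(describe("proxy"))
```

Only the label needs to travel with the package; everything else can be resolved wherever the table is available.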

Think of a freight ship. A ship dedicated to carrying cars, frozen fish, electronics, or even fresh produce is easy to label. You need one label because wherever you look on the ship, you'll find the same thing. You may need to associate specific handling instructions with that label (frozen fish can't be allowed to thaw out when it's unloaded; fresh food needs prompt onward shipment, etc.), but you only need that detail once, for the whole ship.

Container ships are different. It's easy to handle containers, but they could contain anything, so the labels have to go to another level, where each container needs its own specific instructions.

However, some containers might not have uniform cargo. They might be loaded with individual packages (teddy bears and lawnmowers, for example), and these parcels would need individual labels.

You can immediately see how helpful metadata is in all walks of life. In broadcasting, the equivalent of the shipping container is a standardized container format (like MP4), which conveys compressed digital media and is largely agnostic about the specific formats inside it.

Metadata opens up file and content transfer from being one-dimensional (let's say with a Media Asset Management system at each end of a wire) to being multidimensional. This means that the more you know about the media (through its metadata), the more intelligently you can manage and orchestrate it. You can set up conditional workflows that apply quality control and send different content to different destinations; you can regionalize, personalize and customize, as well as add new levels of security.
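A conditional workflow of this kind might look something like the sketch below, which holds back anything that has not passed quality control and otherwise fans content out by region. The field names (`qc_passed`, `regions`) and the destination names are assumptions made up for this example.

```python
# Hypothetical mapping from a region code to a delivery destination.
REGION_DESTINATIONS = {
    "uk": "playout-london",
    "us": "playout-new-york",
}


def route(metadata):
    """Choose destinations for one asset using only its metadata."""
    # Condition 1: nothing moves onward until quality control has passed.
    if not metadata.get("qc_passed", False):
        return ["qc-review"]
    # Condition 2: deliver only to regions we know how to reach.
    return [
        REGION_DESTINATIONS[region]
        for region in metadata.get("regions", [])
        if region in REGION_DESTINATIONS
    ]


print(route({"qc_passed": True, "regions": ["uk", "us"]}))
print(route({"qc_passed": False, "regions": ["uk"]}))
```

Adding a new rule (for rights, language, or security level) means adding another condition on the metadata, not changing the media itself.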

You can orchestrate media movement because each piece of media essence carries associated metadata that informs the system where it should be going. There's no limit to this. If your metadata is detailed enough, you can achieve an extraordinary level of automation.

How do you achieve this practically?

Here are two common approaches.

You can have a giant, overseeing database of all metadata.

Or have an object-oriented system where all media essence carries its own metadata in its container or a sidecar file (a file containing metadata that is always associated with the media it is describing).

Each approach has its advantages, and which one works better depends on the complexity of the metadata. Structured databases can become unwieldy when presented with data schemas that are too granular. Systems where the "payload" can find its own way through the maze can be more robust, but it is hard and perhaps foolhardy to generalize.
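To make the sidecar approach concrete, here is a minimal sketch that stores metadata as JSON next to the media file. The naming convention (`clip.mp4` paired with `clip.mp4.json`) and the function names are assumptions for illustration; real systems define their own sidecar formats and naming rules.

```python
import json
from pathlib import Path


def sidecar_path(media_path):
    """Hypothetical convention: clip.mp4 is paired with clip.mp4.json."""
    media_path = Path(media_path)
    return media_path.with_name(media_path.name + ".json")


def write_sidecar(media_path, metadata):
    """Write the metadata next to the media file and return the sidecar path."""
    path = sidecar_path(media_path)
    path.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return path


def read_sidecar(media_path):
    """Read the sidecar; a missing sidecar means we know nothing (empty dict)."""
    path = sidecar_path(media_path)
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))
```

Because the sidecar sits beside the media, the two have to be moved together. That is the price of not depending on a central database: the metadata can be lost in transit just as the media can.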

Most metadata is currently generated at ingest and can be as fine-grained as necessary, even to the level where it records what the sound supervisor had for breakfast. Today's AI developments mean that metadata is likely to be generated in every segment of a workflow, which will make an almost unimaginable difference to its usefulness.

Eventually, digital media essence and metadata will merge, creating a kind of singularity in which frame rates and resolution become irrelevant. My prediction is that this will happen sometime between the next century and next week.
