Viewpoint: Artificial Intelligence for Supercharging Media Asset Management

Artificial intelligence (AI) for media analysis is beginning to change how media organizations meet their most daunting media asset management (MAM) challenges. When paired with leading-edge MAM tools, an emerging breed of AI platforms offers tremendous potential for transforming media workflows and making it easier than ever for operations to access, manage, and archive large volumes of content.

New AI solutions promise automated logging and tagging of content, with capabilities such as speech-to-text, language translation, and the ability to tag content based on people, places, things, and even sentiment.

These capabilities couldn’t come at a better time. In today’s typical media operation, new content is being generated at a breathtaking pace, and production teams are simply unable to keep up with content management tasks unless they have specialized tools. In addition, time is running out for them to digitize historical content that exists in legacy analog formats (there is still a lot of tape about) before the original media degrades. It’s essential that these assets be logged and tagged so they can be found easily, but too often such tasks fall by the wayside as too expensive or time-consuming.

Connecting the Dots Between AI and MAM

As AI technologies continue to mature, strong MAM capabilities will become even more essential – the best tools can index, search, and easily correct a huge volume of time-based metadata whilst managing complex online, offline and cloud storage landscapes. However, there’s plenty of room for improvement with the current generation of MAM tools, especially from a metadata perspective. Until now, content teams have been left with few options beyond pulling technical metadata from media files or streams, extracting the meaning from file and folder names, or manual logging.

The Kansas City Chiefs football organization uses Square Box Systems CatDV for its fan-based communications and marketing to create a range of content for both TV and web broadcasts.

An AI-powered MAM solution offers a way forward. A strong approach is to extend the MAM’s existing logging, tagging, and search functions through integrations with best-of-breed AI platforms and cognitive engines, such as those from Google, Microsoft, Amazon, and IBM, as well as a host of smaller, niche providers. These AI vendors and aggregators let the MAM apply their analysis tools to speech recognition and video/image analysis, with the flexibility to deploy either in the cloud or in hybrid on-premises/cloud environments.

Here are a few examples of AI-driven MAM capabilities:

  • Speech-to-text, to automatically create transcripts and time-based metadata
  • Language translation
  • Place analysis, including identification of buildings and locations where GPS was not available
  • Object and scene detection (e.g. daytime shots or shots of specific animals)
  • Sentiment analysis, for finding and retrieving all content that expresses a certain emotion or sentiment (e.g. “find me celebrations” at a sports event)
  • Logo detection, to identify when certain brands appear in shots
  • Text recognition, to extract on-screen text, such as captions and signage, from video
  • People recognition, for identifying people, including executives and celebrities
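To make "time-based metadata" concrete, here is a minimal sketch of how results from several of these engines might sit side by side on a clip's timeline and be searched together. The `Segment` type, the `find` function, and the sample values are hypothetical illustrations, not the data model of CatDV or of any AI vendor.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds from the start of the clip
    end: float
    kind: str     # which analysis produced it, e.g. "speech", "object", "logo"
    value: str    # the recognized text or label


def find(segments, query, kind=None):
    """Return (start, end) ranges whose value contains the query text."""
    q = query.lower()
    return [
        (s.start, s.end)
        for s in segments
        if q in s.value.lower() and (kind is None or s.kind == kind)
    ]


# Illustrative data only: one clip with results from three kinds of analysis.
clip = [
    Segment(12.0, 15.5, "speech", "and that is a touchdown"),
    Segment(12.5, 14.0, "object", "crowd"),
    Segment(40.0, 43.0, "logo", "sponsor logo"),
]

touchdown_ranges = find(clip, "touchdown")
logo_ranges = find(clip, "sponsor", kind="logo")
```

Because every result carries its own time range, a search hit can take the user straight to the relevant moment in the clip rather than to the asset as a whole.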

AI: Still Coming of Age

Media companies are looking forward to the day when they can rely on AI capabilities to help them not only gain traction on digitization projects, but also get better control over digital content in their existing libraries. But AI technologies still have some maturing to do before they can deliver on these benefits and truly become mainstream.

Accuracy is of particular concern. AI analysis is improving every day, especially with speech-to-text solutions, but there’s still plenty of room for fine-tuning. Some engines might not be able to distinguish between U.K. and American English, and they will probably trip over abbreviations and jargon. Therefore, the industry is currently focused on training AI engines to recognize these language variations and correct mistakes. Also, the sophistication of AI tools varies considerably when it comes to image or video analysis.

Taking the AI-MAM Plunge

So how do you pick the AI tool that’s best for your requirements, and understand the cost for each style of analysis? Some vendors might have different price tiers based on the format of the content; 4K assets might cost more to analyze. At the same time, a lack of transparency in cost structures across the AI industry can make it difficult to work out the total expense of applying AI to a media library. Some AI aggregators are helping customers sidestep some of the complexities of setup by making it easier to choose the right AI engine for a specific task, albeit at greater cost.

Once you’ve chosen your vendor, vendors or aggregator, your next challenge is to get your content uploaded to the AI engine, which is often in the cloud. That’s more complicated than it might seem – some of the steps include creating a video proxy, separating the audio files, creating an image sequence, and making sure the content is in the right format for the AI engine to understand. Once the content is uploaded, the tasks are centered around managing its lifecycle.
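As a rough illustration of the preparation steps above, the sketch below builds ffmpeg command lines for a video proxy, a separated audio track, and an image sequence. It assumes ffmpeg is installed; the target formats (a 360-line H.264 proxy, 16 kHz mono WAV audio, one JPEG per second) and the function name `build_prep_commands` are assumptions chosen for the example. Each AI engine publishes its own input requirements, and those should take precedence.

```python
from pathlib import Path


def build_prep_commands(source: str, out_dir: str) -> dict[str, list[str]]:
    """Return ffmpeg argument lists for three common preparation steps."""
    out = Path(out_dir)
    stem = Path(source).stem
    return {
        # Low-resolution proxy: scale to 360 lines, H.264 video, AAC audio.
        "proxy": [
            "ffmpeg", "-i", source,
            "-vf", "scale=-2:360",
            "-c:v", "libx264", "-crf", "28",
            "-c:a", "aac",
            str(out / f"{stem}_proxy.mp4"),
        ],
        # Audio only (-vn drops video), mono at 16 kHz for speech engines.
        "audio": [
            "ffmpeg", "-i", source,
            "-vn", "-ac", "1", "-ar", "16000",
            str(out / f"{stem}.wav"),
        ],
        # Image sequence: one frame per second as numbered JPEGs.
        "frames": [
            "ffmpeg", "-i", source,
            "-vf", "fps=1",
            str(out / f"{stem}_%06d.jpg"),
        ],
    }


commands = build_prep_commands("match_highlights.mov", "prepared")
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the output directory exists. Keeping the commands as data makes it straightforward for a MAM workflow to log, queue, or retry each step separately.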

Another challenge – actually the good news and the bad news – is the rapid pace at which AI technology is advancing. As the tools improve, any AI analysis performed on your content today might need to be repeated in the future to take advantage of new capabilities. This could result in multiple refreshed data sets, with additional layers of complexity if the content has been corrected or updated by a human after the original analysis. And you should not discount security concerns, especially with cloud providers.
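One way to picture the re-analysis problem is a refresh that replaces older AI results but never overwrites human work. The sketch below is an assumption about how such a policy could look, not a description of any product: the `Tag` type, the `refresh` function, and the rule that a human tag suppresses any overlapping new AI tag are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    start: float      # seconds from the start of the clip
    end: float
    label: str
    source: str       # "ai" or "human"
    engine: str = ""  # hypothetical engine/version identifier


def overlaps(a: Tag, b: Tag) -> bool:
    """True if the two tags share any span of time."""
    return a.start < b.end and b.start < a.end


def refresh(existing: list[Tag], new_ai: list[Tag]) -> list[Tag]:
    """Swap in new AI tags while preserving every human-authored tag.

    Assumed policy: old AI tags are discarded, and a new AI tag is
    dropped if it overlaps a span that a human has already logged.
    """
    human = [t for t in existing if t.source == "human"]
    kept_ai = [t for t in new_ai if not any(overlaps(t, h) for h in human)]
    return sorted(human + kept_ai, key=lambda t: t.start)


# Illustrative data only.
existing = [
    Tag(0.0, 5.0, "crowd", "ai", "v1"),
    Tag(5.0, 9.0, "CEO interview", "human"),
]
new_ai = [
    Tag(0.0, 5.0, "stadium crowd", "ai", "v2"),
    Tag(6.0, 8.0, "unknown speaker", "ai", "v2"),
]

result = refresh(existing, new_ai)
```

Recording the source and engine version on every tag is what makes this kind of refresh possible; without it, a second analysis pass cannot tell machine output from human corrections.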

Looking Ahead

The future is bright for AI-powered MAM, and the capabilities listed above are only the beginning. MAM vendors can do their part by offering strong metadata organization with careful and configurable user interface design. This helps keep the system from overloading users with too much information. On the workflow and automation front, truly enterprise-worthy MAM systems will be able to feed the AI engines with the right data and automate the analysis, while keeping down costs.

Executed properly, the MAM can also play a powerful role in training and improving AI engines. For example, manually tagged content in the MAM could be used to identify the executives in a corporation. The MAM could use this manual tagging to train AI engines to do a better job of logging and tagging new content.

In summary, the media and entertainment industry is being transformed by AI. In the right hands, AI becomes the key that unlocks the next generation of MAM technologies.

But only the most powerful, flexible, easy-to-integrate, secure, and scalable MAM platforms will enable media operations to take maximum advantage of AI for unlocking the potential of digital assets and making them searchable, reusable, and monetizable.

Dave Clack is CEO of Square Box Systems.