Can Content Create Itself?

A revolution in storytelling for TV, cinema, VR, and related forms of entertainment has just begun, enabled by artificial intelligence (AI). This family of computer-science and engineering techniques, including machine learning, deep learning, language understanding, computer vision, and big data, is poised to dramatically shake up both the production and the form of future entertainment content.

“There has been a real need within the M&E industry for a workable AI solution that is ready to be deployed at scale, delivers a measurable return and can withstand the stresses of a relentless, demanding modern workflow,” says Jason Coari, Director, Scale-out Storage Solutions at Quantum.

Media and entertainment users are now leveraging cognitive services and applications to extract new value from their video and audio content, including massive stores of archived content.

Quantum’s solution, developed in partnership with Veritone, searches for the most appropriate best-in-class AI engine for the task required, and then automates the service to help unlock any unrealised value in an organisation’s media content.

The most obvious, perhaps universal, benefit of AI is to automate more of the processes required to sift, assemble and distribute content at scale.

Producers are being challenged to create better, high-quality, compelling content, which is being shot at increasing ratios, frame rates and resolutions. According to Coari, this is making it more difficult to monetise all the valuable assets an organisation holds, primarily because there is not enough metadata associated with this media to help content creators find what they need for their work.

“Implementing AI into the workflow speeds up the process and enables more extensive metadata tagging upon ingest to make the video and audio files easier to find,” he says.

Amazon insists AI means assisted intelligence. Usman Shakeel, AWS worldwide technical leader in Media & Entertainment, explains, “Different aspects of the workflow exist across different organisations, systems and physical locations. How do you make sure it’s not lost? ML-aided tools can curate disaggregated sources of metadata.”

The natural next step to automated production is automated publication of personalised media experiences.

Tedial claims we are already here. Its sport event tool Smartlive uses ‘AI enhanced’ logging to automatically create highlight clips and pitch them to social media and distribution.

“This extends the abilities of current live production operators to manage hundreds of automatically created stories,” says Jay Batista, general manager at Tedial (US).

“Applications are being developed, especially in reality television production, where auto-sensing cameras follow motion, and AI tools such as facial recognition augment the media logging function for faster edit decisions as well as automatic social media deliveries.”

At the moment, AI is not accurate enough to operate without human intervention. In the case of iconik’s AI framework, the user can set rules so that, for example, any metadata tag with a confidence level below, say, 50% is discarded, anything above 75% is automatically approved, and anything else is sent for human approval.
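The rule described above can be sketched in a few lines of Python. This is only an illustration of the three-way threshold logic, not iconik's actual API: the `Tag` class, the `route_tag` function and the sample tags are all hypothetical, and the 50% and 75% figures are simply the example values quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    """Hypothetical AI-generated metadata tag (not an iconik type)."""
    label: str
    confidence: float  # 0.0 to 1.0, as reported by the AI engine


def route_tag(tag, discard_below=0.50, approve_above=0.75):
    """Return 'discard', 'approve' or 'review' for a tag.

    Thresholds are the example values from the article; a real
    deployment would set its own.
    """
    if tag.confidence < discard_below:
        return "discard"   # too unreliable to keep
    if tag.confidence > approve_above:
        return "approve"   # confident enough to accept automatically
    return "review"        # everything in between goes to a human


# Made-up tags for illustration only.
tags = [Tag("goal", 0.92), Tag("crowd", 0.61), Tag("referee", 0.30)]
decisions = {t.label: route_tag(t) for t in tags}
print(decisions)
```

Tags that sit exactly on a threshold fall into the human-review band here, which is the cautious reading of "anything else".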

“AI-based technology will make mistakes, but the best thing about it is that it will learn from them and become more accurate over time,” says Parham Azimi, CEO, Cantemo.

IBM suggests that AI is intended to be a resource rather than a replacement, but its Watson AI has already selected clips for assembly of a trailer for the film Morgan.

At the World Cup, IBM worked with Fox Sports to debut an online tool for playing and sharing highlights from this and previous World Cups going back to 1958. Watson uses acoustic, visual, and text-based machine learning to produce metadata for game videos, which is then used by editors when creating clips and viewers when searching for highlights.

These are steps on the road to fully automating the production of scripts, storyboards, video streams, and sound tracks.

At first, ‘the machine’ will assist producers, directors, and artists in producing content but, as in many industries, it will progressively assume a more comprehensive role.

“The risk and challenge lies not in our ability to move certain types of programming to an automated process, but rather the loss of editorial judgement that can change based on external factors,” suggests David Schleifer, COO, Primestream. “Systems that produce content in this manner will adhere to specific rules and as a result will produce consistent content that will never challenge us to get out of our comfort zone.”

He says, “The challenge will be to figure out how a system like this can continue to push the envelope. After all, media as a form of communication is focused on surprising, challenging and helping us grow.”

The risk with news is that our already polarised culture will be exacerbated by automatically generated clips/news, delivered to people based on social media preference and profile.

“While the process could be tuned to give more balanced content, the money will be in giving people what they want which will lead to reinforcing an opinion rather than informing,” says Schleifer.

Paul Shen, CEO, TVU Networks thinks that with the growing capability of technology to collect every bit of data to analyse consumer behaviour, it could one day become plausible to create a formula for how content should be produced based on the target audience.

“It could be possible for the technology to learn from previously created content (i.e. how successful it was, what content generates strong emotions, etc.) and segment these analyses for different demographics such as teenagers, young couples or seniors,” he says. “However, this would require machines and AI to have a deeper understanding of human society and what humans find entertaining.”

In the meantime, TVU is promoting its MediaMind Platform as a means to transform raw media assets “from passive ingredients into active agents”.

“With Media 4.0 or Enabled Media, someday programmes will be created automatically to cater for different consumer tastes,” Shen says.

Cantemo's iconik AI infused MAM.

Sentiment Analysis

Right now, AI-based media management solutions can already determine whether a scene or piece of footage is sad, happy or funny. This is, of course, beneficial when automating the creation of assets such as film trailers, which can then be produced in a fraction of the time. But in the future, automated sentiment analysis - the ability to search based on a video or audio clip’s emotional connection - could have much bigger implications.

“For content discovery, sentiment analysis could allow a content creator to search for all assets where President Trump appears to be happy, or sad,” suggests Azimi. “If we think about how long it would take for a news broadcaster to manually analyse and tag every piece of footage containing the president with a description of his emotion, it simply wouldn’t be viable.”
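As a rough illustration of the kind of search Azimi describes, the sketch below filters a small asset list by person and detected emotion. It assumes the emotion tags have already been produced by an AI engine; the `Asset` class, the `find_assets` function, the clip IDs and the tag values are all hypothetical and do not represent any vendor's data model.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """Hypothetical media asset with AI-assigned emotion tags."""
    asset_id: str
    # Maps a person's name to the emotion label detected for them.
    emotions: dict = field(default_factory=dict)


def find_assets(assets, person, emotion):
    """Return the IDs of assets in which `person` was tagged with `emotion`."""
    return [a.asset_id for a in assets if a.emotions.get(person) == emotion]


# Made-up library for illustration only.
library = [
    Asset("clip-001", {"President Trump": "happy"}),
    Asset("clip-002", {"President Trump": "sad"}),
    Asset("clip-003", {"Reporter": "happy"}),
]

happy_clips = find_assets(library, "President Trump", "happy")
print(happy_clips)
```

The search itself is trivial once the tags exist; the expensive step, as the quote makes clear, is generating those tags for every piece of footage.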

The Internet of Things (IoT) presents another opportunity: connected devices could serve content suggestions based on a viewer’s emotional mood.

“If a user has been playing sad songs on one connected device, another could suggest a tear-jerker movie,” says Azimi. “At the moment we’re not quite at this stage, but the technology is in continuous development so almost anything is possible.”

On a larger scale, AI frameworks could also present different types of video assets depending on the mood of the user base, i.e. by determining if the broader audience is happy or sad. AI allows for non-linear adaptive storytelling that might yet provide exceptional user experiences, ideally taking into account the user’s state of mind such as emotions and gaze.

Studies have shown that it is possible for AI systems to set a standard ‘mood’ index for a certain region based on tweets, forums and other social media posts. According to Azimi, this gives broadcasters, marketers and anyone else looking to engage with the public a better understanding of their target audience and the best way and time to communicate with them.
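A regional mood index of this kind could, in its simplest form, be an average of sentiment scores grouped by region. The sketch below assumes each post has already been scored between -1 (negative) and 1 (positive) by some sentiment model; the `mood_index` function, the region names and the scores are hypothetical, and the studies mentioned above may well use different methods.

```python
from collections import defaultdict
from statistics import mean


def mood_index(posts):
    """Average pre-computed sentiment scores per region.

    `posts` is an iterable of (region, score) pairs, where each score
    is assumed to lie between -1.0 (negative) and 1.0 (positive).
    """
    by_region = defaultdict(list)
    for region, score in posts:
        by_region[region].append(score)
    return {region: mean(scores) for region, scores in by_region.items()}


# Made-up scores for illustration only.
posts = [("north", 0.5), ("north", -0.5), ("south", 1.0), ("south", 0.5)]
index = mood_index(posts)
print(index)
```

A production system would also need to weight for sample size and recency, which this sketch deliberately leaves out.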

Piksel has investigated automated sentiment analysis, and its use in content recommendation based on customer habits, through a number of proofs of concept.

“While these concepts are novel, we feel strongly that with the advancement in AI and machine learning fields, the cost vs benefit curve for use of such tools will intersect sooner rather than later,” says Kristan Bullett, joint MD, Piksel. “The production benefits would rely on mining of metrics that will become available over the next few years as customer actions are matched to the content sentiments. The earliest use case could be to identify which content sentiments cause customer abandonment and edit such scenarios in post-production. The ability of metadata models to store and expose this extensive metadata would be the key to such solutions.”

Yves Bergquist, CEO at AI firm Novamente, told NAB that he is creating a “knowledge engine” with the University of Southern California to analyse audience sentiment across TV scripts and performance data.

“The aim is to link together scene-level attributes of narrative with character attributes and learn how they resonate or not with audiences,” he explained. “There’s an enormous amount of complexity in the stories we should be telling. You need to be able to better understand the risk.”

In the long term, he forecast, “You’ll see lots of mass-produced content that’s extremely automated; higher level content will be less automated.”
