MPEG-H Broadcasts Bring Viewers Unprecedented Control
Fraunhofer IIS has been demonstrating its real-time MPEG-H Audio Encoder System at industry trade shows for the past year.
With consumers viewing (and listening to) content on more devices and in more places than ever before, broadcasters are being challenged to meet demands for new and better audio experiences in the most cost-effective way. This means upping the ante on multichannel audio beyond the existing 5.1 surround sound systems found in homes across the world. Consequently, broadcasters are assessing the capabilities of existing infrastructures and determining how new developments in audio and video technology will affect their ability to deliver enhanced services to a broad array of end-user devices—from high-end home theaters to tablets and smartphones.
There’s been a lot of talk of the next-generation ATSC 3.0 television broadcast specification, which looks to be standardized in two years and will include much higher resolution video signals (UHDTV, or 4K) and multichannel “immersive” audio. However, ATSC 3.0 is not backward compatible with current HDTV receivers, so implementation concerns persist.
As on the video side, several companies are vying to have their scheme for delivering new types of audio experiences included in the ATSC 3.0 spec. Fraunhofer IIS has developed an audio processing scheme called MPEG-H, which it demonstrated at the IBC Show in September and at the more recent SMPTE Fall conference as part of its bid for inclusion in the upcoming ATSC 3.0 broadcast standard.
MPEG-H is a new, soon-to-be-adopted (perhaps in spring 2015) standard that offers “object-based audio” features for TVs and mobile devices. A special SMPTE committee is expected to adopt MPEG-H in stages, with the rollout continuing over the ensuing months (or years).
The goal is to give viewers more audio capabilities that they can use in the home or on their mobile devices, in the hopes of retaining eyeballs. This includes allowing the consumer to choose a language, select the home team announcer rather than the away team announcer, listen to a specific race car driver communicating with his pit crew, or raise the audio level of the dialogue alone (or of the ambient sound), along with other control parameters that cannot be offered with the current ATSC standard, which uses AC-3 audio compression.
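To make the idea concrete, here is a minimal Python sketch of how such viewer controls could be bounded by broadcaster-authored limits. The class, field names, and numbers are illustrative assumptions for this article, not MPEG-H bitstream syntax.

```python
# A minimal sketch (not MPEG-H syntax) of broadcaster-authored limits on viewer
# adjustments. All names and values here are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class AudioObject:
    label: str              # e.g. "dialogue", "home commentary", "pit crew radio"
    default_gain_db: float  # level the broadcast starts with
    min_gain_db: float      # lowest level the broadcaster allows
    max_gain_db: float      # highest level the broadcaster allows
    user_switchable: bool   # may the viewer mute or swap this object?

def apply_user_gain(obj: AudioObject, requested_db: float) -> float:
    """Clamp a viewer's requested level to the range authored by the broadcaster."""
    return max(obj.min_gain_db, min(obj.max_gain_db, requested_db))

dialogue = AudioObject("dialogue", 0.0, -3.0, 9.0, user_switchable=False)
print(apply_user_gain(dialogue, 15.0))  # 9.0: the boost is capped at the authored limit
```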
To this end, the Germany-based Fraunhofer IIS, Technicolor and Qualcomm have joined forces to form the MPEG-H Audio Alliance, which is promoting its version of “the next generation for interactive and immersive sound.” The alliance said it has developed a roadmap for MPEG-H Audio deployment that allows broadcasters to add new functionality at the rate and pace of their choosing, while preserving existing investments in technology and processes. The standard is backward compatible with the systems and practices currently used for AC-3 or HE-AAC surround sound broadcasting.
“Basically, the broadcaster will have control over what interfaces are presented to the viewer at home, while giving the viewer more audio features than they have ever had available to them before,” Robert Bleidt, General Manager of the Audio and Multimedia Division at Fraunhofer USA Digital Media Technologies, told the website Display Daily during the 2014 SMPTE Fall conference in Hollywood. “This can range from nothing to a couple of preset buttons, to giving the viewers full control. This is a significant challenge for broadcasters.”
Bleidt said MPEG-H is about personalising the audio experience by allowing viewers to set parameters that make them most comfortable or more engaged with a broadcast. They can turn the dialogue up or down (if the broadcaster and content creator have authored the content to permit that). In addition, different TV genres carry different levels of creative intent: a live sports broadcast might offer more interactive capabilities than a postproduced feature film.
New interactive menu features will make MPEG-H Audio-based offerings a key service for customers interested in more control over their listening experience, whether at home or on mobile devices. MPEG-H Audio content will also automatically optimize audio playback across different speaker configurations or headsets, allowing consumers to enjoy the best sound quality possible – no matter where they are or what device they use.
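As an illustration of that playback adaptation, the sketch below applies a fixed 5.1-to-stereo downmix matrix. The real MPEG-H format converter is more sophisticated; the coefficients and channel order here are assumptions chosen only to show the principle of rendering one broadcast mix to whatever speakers are present.

```python
# Illustrative only: a fixed 5.1-to-stereo downmix standing in for the format-
# converting renderer that adapts one broadcast mix to the playback device.
import numpy as np

# Assumed channel order: L, R, C, LFE, Ls, Rs (a common textbook downmix,
# not the MPEG-H renderer's actual behavior).
DOWNMIX_5_1_TO_STEREO = np.array([
    # L    R    C      LFE  Ls     Rs
    [1.0, 0.0, 0.707, 0.0, 0.707, 0.0],    # left output
    [0.0, 1.0, 0.707, 0.0, 0.0,   0.707],  # right output
])

def render_to_stereo(frames_5_1: np.ndarray) -> np.ndarray:
    """frames_5_1 has shape (num_samples, 6); returns stereo of shape (num_samples, 2)."""
    return frames_5_1 @ DOWNMIX_5_1_TO_STEREO.T
```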
MPEG-H Audio lays the foundation for broadcasters to deliver a more personalized, interactive, and immersive audio experience for end-users by offering:
- Interactive “sound mixing” through object coding, which allows viewers to customise the levels of different sound elements—for example, boosting selected commentary or creating a “home team” mix for sports broadcasts (see the sketch after this list).
- Rich 3D sound with the ability to capitalize on additional front- and rear-height speaker channels. This enhances today’s surround sound broadcasts and creates a truly realistic audio experience.
- Higher Order Ambisonics (HOA), to provide a fully immersive sound experience that is ideal for live broadcasts and performances, such as sporting events.
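The object-based “sound mixing” in the first bullet can be pictured as a channel bed plus separately carried objects that are summed at the receiver with the viewer’s chosen gains. The toy Python function below assumes the objects have already been rendered to the bed’s speaker layout; all names and signal shapes are illustrative.

```python
# Toy receiver-side mix of a channel bed plus interactive objects.
import numpy as np

def mix_program(bed, objects, user_gains_db):
    """bed: (samples, channels) array; objects: label -> array rendered to the same
    layout; user_gains_db: label -> gain in dB chosen by the viewer (default 0)."""
    out = bed.copy()
    for label, signal in objects.items():
        gain = 10 ** (user_gains_db.get(label, 0.0) / 20.0)  # dB -> linear
        out += gain * signal
    return out

# A "home team" style preset: favor home commentary, effectively mute the away feed.
bed = np.zeros((48000, 2))                        # one second of silent stereo bed
objs = {"home_commentary": np.ones((48000, 2)),
        "away_commentary": np.ones((48000, 2))}
mix = mix_program(bed, objs, {"home_commentary": 3.0, "away_commentary": -60.0})
```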
Qualcomm Technologies said it is already incorporating MPEG-H Audio support into its roadmap for future mobile chipsets. This is an important step toward widespread distribution of new audio functionality across a range of consumer devices.
Fraunhofer IIS (Institute for Integrated Circuits) is a veteran developer of MPEG audio standards, with its technology currently used in more than seven billion devices worldwide. The company will lend its advanced MPEG-H codec to the alliance as well as other engineering support. Technicolor is a co-developer of the MP3 standard and also provides production services for content creators and distributors around the world. It will implement MPEG-H decoding technology in set-top boxes. Finally, Qualcomm will make MPEG-H receiver chips for mobile devices.
As a group, the MPEG-H Audio Alliance hopes that by promoting a practical, end-to-end TV audio solution from the broadcaster to the home, it can persuade the ATSC to look more favorably on its proposal. The equipment appears to be ready.
At this year’s IBC Show in Amsterdam, Fraunhofer IIS showed a real-time MPEG-H hardware prototype with the ability to encode audio for live broadcasts from stereo up to 3D sound in the 7.1+4H format, with additional tracks for interactive objects including commentary in several languages, ambient sound or sound effects. The company’s system comprises a real-time encoder for contribution from outside broadcasts to the studio, where a professional decoder recovers the uncompressed audio for further editing and mixing; a real-time encoder for emission to consumers—over the Internet for new-media use or for trials of upcoming over-the-air broadcast systems such as ATSC 3.0; and a professional decoder used to monitor the emission encoder’s output.
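A rough schematic of that two-stage chain, with placeholder Python functions standing in for the real encoders and decoders (the function names and payloads are assumptions for illustration only, and no actual coding happens here):

```python
# Placeholder pipeline mirroring the chain described above: contribution encode at
# the venue, professional decode at the studio, emission encode to consumers, and
# a monitoring decode of the emission output.

def contribution_encode(live_audio):   # high-bitrate link from the outside broadcast
    return {"stage": "contribution bitstream", "payload": live_audio}

def professional_decode(bitstream):    # studio side: recover audio for editing/mixing
    return bitstream["payload"]

def emission_encode(studio_audio):     # low-bitrate encode for delivery to viewers
    return {"stage": "emission bitstream", "payload": studio_audio}

def monitor_decode(bitstream):         # confidence monitoring of the emission feed
    return bitstream["payload"]

program = "7.1+4H bed plus interactive objects (labels only in this sketch)"
at_studio = professional_decode(contribution_encode(program))
to_viewers = emission_encode(at_studio)
monitored = monitor_decode(to_viewers)
```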
“Our work with the new MPEG-H TV audio system so far has been done by capturing audio from a live event, and then encoding it with software on a computer. At IBC we were showing the next step for live broadcast use—the world’s first real-time encoder for interactive and immersive TV audio. With this prototype hardware, we will be able to demonstrate how we can integrate MPEG-H into a broadcaster’s plant for live trials and tests,” said Bleidt. “The system will encode elements of the audio as interactive objects so viewers at home may adjust the sound to their preference. This new hardware will give broadcasters the ability to encode true 3D sound, enhancing today’s surround sound broadcasts to create a truly realistic audio experience.”
Dolby Labs, with its Atmos 3D audio system, and several other patent-holding companies are also vying for inclusion in the final ATSC 3.0 spec.
For broadcasters, MPEG-H’s advanced compression scheme offers the ability to send more audio data using less bandwidth. Today’s 5.1 broadcasts (which use AC-3 and require 448 kbps) could be delivered with the same quality at 160 kbps, and the broadcaster could elect to add another 30 kbps for interactive audio elements (another announcer, sound effects, pit crew radios, etc.). The new standard will also be able to carry the immersive sound formats (up to 22 speakers) now found in movie theaters into the home, where audiophile consumers can reproduce them through a single wireless 3D immersive 7.1+4-channel sound bar.
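A quick back-of-the-envelope check of those figures, using only the rates quoted above:

```python
# Bitrates in kbit/s, as quoted in the article.
ac3_5_1 = 448            # today's AC-3 5.1 broadcast
mpegh_5_1 = 160          # MPEG-H at comparable quality
interactive_extra = 30   # optional budget for extra objects (second announcer, etc.)

savings = ac3_5_1 - mpegh_5_1
print(f"saved: {savings} kbit/s ({savings / ac3_5_1:.0%})")     # saved: 288 kbit/s (64%)
print(f"with objects: {mpegh_5_1 + interactive_extra} kbit/s")  # with objects: 190 kbit/s
```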
At the end of the day, the MPEG-H Audio standard will allow TV broadcasters to offer live broadcasts with object-based 3D audio across all devices, providing viewers the ability to tailor the audio to suit their personal listening preferences.