Scalable Dynamic Software For Broadcasters: Part 7 - Connecting Container And Microservice Apps

Messaging is an incredibly important concept in microservices and forms the backbone of their communication. But keeping systems coherent and resilient requires an understanding of how microservices communicate and why.

There are two fundamental types of messaging in a microservice ecosystem. There are those messages that are public facing, and those that are private. The public facing messages are needed by users to execute a process, such as an ingest operation. And the private messages are those used by the microservices to communicate with each other without presenting them to the outside world.

As users will generally access the operational side of the microservice through a web browser or over the internet, the public messages must be internet compliant. Generally, this means HTTPS over TCP/IP. The TCP layer provides a logical connection with reliable, ordered delivery, and HTTP provides the transfer protocol that allows web browsers and servers to exchange data.

Messaging Resilience

Messaging becomes particularly interesting when we consider what happens when things go wrong. A message may be lost in transit or corrupted before it is delivered, a node may fail, or an instance might crash or be overwhelmed with requests, rendering it unable to process jobs in the message queue.

There may be tens, hundreds, or even thousands of instances of microservices running in a particular ecosystem. All these need to exchange messages to maintain the workflow and provide the relevant updates.

Failures fall into two camps: transient and complete. A transient failure may only exist for a short period of time as the system effectively repairs itself. It could be caused by a network switch failing, resulting in the IP packets taking another route, or by a node becoming temporarily overloaded but recovering quickly. Complete failures exist for much longer periods of time and can be caused by situations such as a server failing or the power system not having sufficient failsafe systems.

Recovery

If a recovery strategy is not adopted, the messages will either cause congestion in the network and servers or render the system unstable. Consequently, the transient and complete failure types require different message recovery strategies, and these fall into two different types: retry and circuit breaker.

The “retry” method deals best with suspected transient failures as the sender will keep transmitting messages to the target microservice assuming the failure will recover quickly. If it does, then the previous messages will be discarded or serve no practical use and the system will continue to function. However, this is a big assumption as continually sending the same message to a microservice assumes it is idempotent, that is, sending multiple messages will not have a detrimental effect on the microservice.
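As a minimal sketch of the retry method, the Python below resends a message a bounded number of times. The names `send_with_retry` and `TransientError` are hypothetical, and the exponential backoff between attempts is an assumption added for illustration rather than something the retry method requires. It is only safe if the receiver is idempotent.

```python
import time


class TransientError(Exception):
    """Hypothetical error standing in for a timeout or a lost message."""


def send_with_retry(send, message, max_attempts=4, base_delay=0.1, sleep=time.sleep):
    """Call send(message), retrying on TransientError with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send(message)
        except TransientError:
            if attempt == max_attempts:
                raise  # give up: the failure no longer looks transient
            # Wait 0.1 s, 0.2 s, 0.4 s ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))


# Usage: a fake sender that fails twice and then succeeds.
calls = []


def flaky_send(message):
    calls.append(message)
    if len(calls) < 3:
        raise TransientError("no acknowledgement")
    return "ack"


delays = []  # record the waits instead of actually sleeping
result = send_with_retry(flaky_send, {"job": "ingest"}, sleep=delays.append)
```

Bounding the number of attempts matters: once the limit is reached the error is raised to the caller, which is the point at which a longer-lived failure should be handled differently.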

Figure 1 – If microservice ‘B’ is not idempotent, resending a failed message will result in the microservice executing the instruction twice. This will have potentially disastrous side effects for the system making it unstable.

Maintaining idempotent software is incredibly important, otherwise the system can become unstable. For example, if microservice ‘A’ sends a message to microservice ‘B’ requesting an ‘id’ to check whether an operation has been executed, the request is considered idempotent because it does not change the state of microservice ‘B’ in any way. If multiple messages were sent, there would be no unintended side effects for microservice ‘B’. However, if microservice ‘A’ sent a message instructing microservice ‘B’ to read the next file in a directory, the operation would certainly not be idempotent, as every message received advances microservice ‘B’ to another file. If the same message was sent ten times by microservice ‘A’, and some of the messages were lost, then the state of microservice ‘B’ would be unknown; it would be indeterminate.
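One common way to make an operation such as "read the next file" safe to retry is to attach a unique identifier to each message and have the receiver remember which identifiers it has already processed. The sketch below assumes this approach; the `FileReader` class, the `message_id` parameter and the file names are all hypothetical.

```python
class FileReader:
    """Hypothetical microservice 'B' that hands out the next file on request."""

    def __init__(self, files):
        self.files = list(files)
        self.position = 0
        self.results = {}  # message_id -> file already returned for that id

    def read_next(self):
        """Not idempotent: every call advances the position."""
        name = self.files[self.position]
        self.position += 1
        return name

    def handle(self, message_id):
        """Idempotent wrapper: a repeated message_id replays the stored result."""
        if message_id not in self.results:
            self.results[message_id] = self.read_next()
        return self.results[message_id]


service = FileReader(["a.mxf", "b.mxf", "c.mxf"])
first = service.handle("msg-1")
duplicate = service.handle("msg-1")  # a retry of the same message
second = service.handle("msg-2")     # a genuinely new request
```

With the identifier in place, the retried message returns the same file as the original and the position only advances for new requests, so the state of the receiver stays known however many duplicates arrive.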

If too many failed messages are transmitted by the sender, a bottleneck can easily occur leading to congestion in the network and the destination microservices as the messages overflow in the queue buffer. The circuit breaker will detect the failure of the messages after reaching a predetermined threshold and stop the sender transmitting. The error will then be raised to the orchestrator software where other action can be taken such as re-routing or raising of an alarm.
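The circuit breaker behaviour can be sketched as follows. This is an illustration under stated assumptions, not a production implementation: `CircuitBreaker` and `CircuitOpenError` are hypothetical names, and the `reset_timeout` after which a single trial message is allowed through is an addition that the text above does not describe.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker refuses to send (hypothetical name)."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout  # assumption: seconds before a trial send
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, send, message):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; message not sent")
            # Timeout elapsed: allow one trial send; a failure re-opens at once.
            self.opened_at = None
            self.failures = self.failure_threshold - 1
        try:
            result = send(message)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # threshold reached: open the circuit
            raise
        self.failures = 0  # a success closes the circuit fully
        return result


# Usage: a target that is completely down.
def failing_send(message):
    raise ConnectionError("target unreachable")


breaker = CircuitBreaker(failure_threshold=3)
outcomes = []
for _ in range(5):
    try:
        breaker.call(failing_send, "job")
    except CircuitOpenError:
        outcomes.append("blocked")  # this is where an orchestrator would be told
    except ConnectionError:
        outcomes.append("failed")
```

After three failed sends the breaker opens, and the remaining messages are rejected locally without ever reaching the network or the destination queue.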

Synchronous And Asynchronous Messaging

When discussing synchronous and asynchronous messaging it’s important to note that these refer to the protocol not the underlying I/O. An operating system can create either synchronous or asynchronous I/O access, also referred to as blocking and non-blocking. With a blocking I/O access the operating system will have to wait for an event to complete before the software can continue, but with non-blocking, the software execution is not affected. Synchronous and asynchronous messaging have a similar idea, but they work at the protocol level, not the I/O level. But it is entirely possible for a synchronous message to operate over an asynchronous I/O.

Synchronous and asynchronous messaging systems each have their pros and cons, but fundamentally they differ in whether the sender waits for a response from the receiver (synchronous) or not (asynchronous). That is, with asynchronous messaging the sending thread is not blocked, which potentially results in lower latency. For example, if microservice ‘A’ calls microservice ‘B’, and this in turn calls microservice ‘C’, then with a synchronous system microservice ‘A’ could find itself blocked until microservice ‘C’ completes its action and reports back to microservice ‘B’, which then reports back to microservice ‘A’.

An asynchronous messaging system can also be thought of as a fire-and-forget communication system.
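The difference can be illustrated with a small sketch using Python's `asyncio`. Both styles below run over non-blocking I/O, which shows that a synchronous message exchange does not require blocking I/O. The function names and the simulated processing delay are assumptions made for illustration.

```python
import asyncio


async def service_b(request):
    """Hypothetical microservice 'B' with a short simulated processing time."""
    await asyncio.sleep(0.01)
    return f"done:{request}"


async def synchronous_style():
    """'A' waits for the reply from 'B' before it continues."""
    events = ["sent"]
    events.append(await service_b("job-1"))
    events.append("continued")
    return events


async def asynchronous_style():
    """'A' places the message on a queue and continues immediately."""
    events = []
    queue = asyncio.Queue()

    async def consumer():
        request = await queue.get()
        events.append(await service_b(request))

    task = asyncio.create_task(consumer())
    queue.put_nowait("job-1")   # fire-and-forget: no reply is awaited here
    events.append("continued")  # 'A' carries on before 'B' has finished
    await task                  # only so the sketch can show the final order
    return events


sync_events = asyncio.run(synchronous_style())
async_events = asyncio.run(asynchronous_style())
```

In the synchronous style the sender's "continued" step comes after the reply from 'B'; in the asynchronous style it comes before 'B' has finished processing.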

Figure 2 – In the top diagram Microservice ‘A’ sends a synchronous message to microservice ‘B’ but it must wait until ‘B’ has responded before it can continue. In the bottom diagram, microservice ‘A’ sends messages to the message broker using a fire-and-forget methodology, so microservice ‘A’ does not have to wait for any responses. Furthermore, the messaging broker allows multiple microservices to subscribe to, and receive the messages from microservice ‘A’.

As asynchronous events do not need to wait for a response from the receiver, they lend themselves well to broadcasting or streaming messages to multiple receivers, often referred to as the pub/sub model (publisher – subscriber). Conceptually, this is similar to IP multicast, where receivers subscribe to a stream and the network delivers the same data to every destination device that has asked for it. Within microservice architectures, a separate messaging bus is often employed to deliver this type of event-driven communication. For example, five microservices may subscribe to a microservice sender through the messaging bus. As the sender does not expect a response from any of the receivers, it will send (publish) the message and any service subscribing to it will receive the message.

Message Brokers

To facilitate message delivery microservice architectures often employ a messaging broker. This is effectively an intelligent queue management system that buffers the messages in memory and makes sure they are delivered to the microservices that have subscribed to the publishing microservice.
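A toy in-memory broker gives a feel for the idea. The sketch below is only an illustration under stated assumptions: the `Broker` class, the topic name and the subscriber names are hypothetical, and real brokers add persistence, acknowledgements and network transport that are omitted here.

```python
from collections import defaultdict, deque


class Broker:
    """Hypothetical in-memory broker: one buffered queue per subscriber."""

    def __init__(self):
        self.queues = defaultdict(dict)  # topic -> {subscriber name: deque}

    def subscribe(self, topic, subscriber):
        self.queues[topic][subscriber] = deque()

    def publish(self, topic, message):
        """Fire-and-forget: buffer the message for every subscriber and return."""
        for queue in self.queues[topic].values():
            queue.append(message)

    def receive(self, topic, subscriber):
        """Return the next buffered message for a subscriber, or None if empty."""
        queue = self.queues[topic][subscriber]
        return queue.popleft() if queue else None


# Usage: five services subscribe to events from one publisher.
subscribers = ["qc", "transcode", "archive", "metadata", "notify"]
broker = Broker()
for name in subscribers:
    broker.subscribe("ingest.complete", name)

broker.publish("ingest.complete", {"asset": "clip-001"})
received = {name: broker.receive("ingest.complete", name) for name in subscribers}
nothing_left = broker.receive("ingest.complete", "qc")
```

The publisher never needs to know who the subscribers are or how many exist; it only names a topic, and the broker holds the subscription list and the buffers.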

The messaging broker also handles all the management of the subscriptions and facilitates their delivery. A microservice architecture may employ thousands of publishers and subscribers making the whole asynchronous messaging system incredibly complicated.

Without a messaging broker, the microservices themselves would have to provide the necessary management and protocols to facilitate asynchronous delivery. This is a complex and arduous task and so is often left to the expert service providers that deliver this type of service thus leaving the broadcast vendor to focus on building their application.

Messaging within microservice architectures not only provides a method of communication for different microservice apps but also delivers resilience and scaling to greatly improve system reliability and flexibility.
