Cloud Broadcasting - Resilience
In the last article on Cloud Broadcasting we looked at the concepts of “Cloud Washed” and “Cloud Born” and the considerations vendors must address when delivering true cloud systems. In this article, we look more closely at resilience and cloud system uptime.
To get the best uptime from a cloud-based system, software should be based on the HTTP (Hypertext Transfer Protocol) client-server model and accessed through a web browser. One of the reasons web browsers have become so popular is that the application software lives on the server, which is under the control of the service provider, making software upgrades easier and more reliable.
Service providers have more control over the back-end part of the software, such as database servers, and are able to spin up new instances and allocate resources to meet peak demand. Advances in web standards such as HTML5 and CSS give better graphics display and control handling.
Load Balancers
Cloud providers such as Amazon Web Services (AWS) take this model one step further and encourage the use of load balancers. These provide a single point of entry for HTTP traffic and work by splitting the messages between web servers. The load balancer keeps a record of TCP client-server connections so it knows where to send subsequent packets for each connection.
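The connection record described above can be sketched in a few lines of Python. This is a toy model rather than how any AWS load balancer is implemented: the `LoadBalancer` class, the server names and the client addresses are all hypothetical, and round-robin is assumed as the distribution policy.

```python
from itertools import cycle


class LoadBalancer:
    """Toy load balancer: round-robin for new connections, sticky for known ones."""

    def __init__(self, servers):
        self._next_server = cycle(servers)  # endless round-robin iterator
        self._connections = {}              # (client_ip, client_port) -> server

    def route(self, client_ip, client_port):
        """Return the server that should receive traffic for this connection."""
        key = (client_ip, client_port)
        if key not in self._connections:
            # First packet of a new connection: pick the next server in turn.
            self._connections[key] = next(self._next_server)
        return self._connections[key]

    def close(self, client_ip, client_port):
        """Forget a connection once the client has finished with it."""
        self._connections.pop((client_ip, client_port), None)


lb = LoadBalancer(["web-a", "web-b"])        # hypothetical server names
first = lb.route("203.0.113.7", 50000)       # new connection
second = lb.route("203.0.113.8", 50001)      # another new connection
again = lb.route("203.0.113.7", 50000)       # same connection as the first
print(first, second, again)
```

The first two calls land on different servers, while the third returns to the server already holding that client's connection.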
Load balancers provide another valuable function: they allow servers to be physically separated across locations, thus improving resilience. AWS achieves this through its High Availability (HA) infrastructure. Essentially, two instances are created behind a load balancer and each server is placed in a different availability zone (AZ), defined by AWS as a datacentre on a different flood plain to other datacentres.
AWS spreads its services across the globe, split into geographic regions, giving resilience and localization for improved network access. Each region is completely independent and consists of multiple AZs, and each zone can be thought of as a datacentre. Although the zones within a region are physically separated, they are connected by high-speed, low-latency networks.
Smooth Software Upgrades
The locations of datacentres are a closely guarded secret and are not generally known. A region may consist of more than two AZs; at the time of writing, Virginia in the USA has four and Frankfurt in Germany has two. AZs are identified by names such as us-west-1a and us-west-1b for Northern California. Load balancers split traffic equally between zones within a region, and multiple servers can be enabled in each zone.
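The naming convention is simply the region code followed by a single zone letter. As a small illustration, the sketch below splits AZ names into those two parts and groups zones by region. The helper names `split_az` and `group_by_region` are hypothetical, and the regular expression is an assumption based only on the name shapes quoted in this article.

```python
import re

# Assumed shape: lowercase words joined by hyphens, a number, then one zone letter.
AZ_PATTERN = re.compile(r"^(?P<region>[a-z]+(?:-[a-z]+)+-\d+)(?P<zone>[a-z])$")


def split_az(name):
    """Split an AZ name such as 'us-west-1a' into ('us-west-1', 'a')."""
    match = AZ_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognised AZ name: {name!r}")
    return match.group("region"), match.group("zone")


def group_by_region(az_names):
    """Map each region code to the list of zone letters seen for it."""
    regions = {}
    for name in az_names:
        region, zone = split_az(name)
        regions.setdefault(region, []).append(zone)
    return regions


zones = group_by_region(["us-west-1a", "us-west-1b", "eu-central-1a"])
print(zones)
```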
Another advantage of load balancers is that they provide a smooth process for software upgrades without any downtime. Servers are no longer upgraded in the traditional way. Instead, once a software release is available, a new server is spun up with the appropriate operating system, the new software is installed on it and the whole system is copied. Amazon refers to this copy as an Amazon Machine Image (AMI); creating a new server from this AMI will exactly clone the original.
Cloud Scaling
If we have a service running one instance in eu-central-1a and another in eu-central-1b, a third server running the new software could be spun up in eu-central-1a. Through the software dashboard, all incoming traffic to the original server in eu-central-1a is disabled, and once it has finished processing its current jobs it can be switched off. The same procedure is repeated for eu-central-1b, and when complete both original servers are deleted. The system has been upgraded without any downtime, a procedure called “rip and replace” in AWS terms.
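The sequence can be modelled as a short simulation. This is a sketch under stated assumptions, not AWS tooling: the `Server` record, the `rip_and_replace` function and the version strings are hypothetical, and draining is reduced to zeroing a job counter.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    zone: str
    version: str
    accepting: bool = True  # whether the load balancer sends it new traffic
    active_jobs: int = 0


def rip_and_replace(pool, new_version):
    """Replace each server in turn; return the lowest in-service count observed."""
    min_in_service = sum(1 for s in pool if s.accepting)
    for old in list(pool):  # iterate over a snapshot, since the pool is modified
        # 1. Launch the replacement in the same zone and put it in service.
        pool.append(Server(f"{old.name}-{new_version}", old.zone, new_version))
        # 2. Stop new traffic to the old server and let current jobs finish.
        old.accepting = False
        old.active_jobs = 0  # stand-in for waiting on real jobs to complete
        # 3. Delete the old server.
        pool.remove(old)
        in_service = sum(1 for s in pool if s.accepting)
        min_in_service = min(min_in_service, in_service)
    return min_in_service


pool = [
    Server("app-1", "eu-central-1a", "1.0", active_jobs=3),
    Server("app-2", "eu-central-1b", "1.0", active_jobs=1),
]
lowest = rip_and_replace(pool, "2.0")
print(lowest, [(s.zone, s.version) for s in pool])
```

Because each replacement is in service before its predecessor stops accepting traffic, the number of servers available never falls below the original two.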
AMIs form the basis of scaling within AWS. When a new server is needed, the application software simply spins up a new instance from the current AMI and then switches it online, making it available for use. Once user demand subsides, the application software simply deletes one of the server instances, leaving the vendor to pay only for the time the server was running.
Cloud Washed software cannot take advantage of this automation and instead relies on a developer or engineer to detect the peak demand, manually spin up new servers and enable them, and then remember to disable them once the peak has passed. Failing to do so will result in high cloud costs.
Cloud Born software is fully automated and will detect peak demand, spin up new servers and switch them off again, all without any human intervention. Usually, advanced monitoring and alarm systems are integrated into the software to make systems engineers aware of any changes. The cost of allocating additional resources is directly proportional to the demand placed on the system by its clients. Assuming the correct costing model has been adopted, the costs will be directly proportional to sales, with minimal overhead and setup costs.
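To see why automated scaling keeps cost in step with demand, consider the rough calculation below. All figures are invented for illustration: the hourly demand profile, the capacity of 200 requests per second per instance and the minimum of two instances (one per zone) are assumptions, as are the function names.

```python
import math


def desired_instances(requests_per_sec, capacity_per_instance, minimum=2):
    """Instances needed for the current demand, never fewer than the minimum."""
    needed = math.ceil(requests_per_sec / capacity_per_instance)
    return max(minimum, needed)


# Hypothetical demand in requests per second, sampled once an hour.
demand_by_hour = [100, 250, 900, 400, 100]
capacity = 200  # assumed requests per second one instance can serve

fleet = [desired_instances(d, capacity) for d in demand_by_hour]
scaled_hours = sum(fleet)                        # pay only for what each hour needs
static_hours = max(fleet) * len(demand_by_hour)  # sized for the peak all day

print(fleet)
print(scaled_hours, static_hours)
```

In this example the automatically scaled fleet uses 13 instance-hours, against 25 for a fleet left at peak size throughout, which is the situation a forgotten manual scale-down produces.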
In-built Monitoring
Users can easily transfer AMIs between zones in a region, allowing server instances to be launched quickly. However, if you need to move an AMI to another region, for example from Ohio to Singapore, then the transfer could take a few hours. By doing this, AWS is effectively discouraging users from moving AMIs between regions.
Load balancers are relatively intelligent and can detect whether an attached instance is healthy. If the server starts to drop packets, perhaps due to overloading or a software bug, the load balancer will detect this and stop sending it traffic. It will continue to test the server and resume sending messages once it recovers.
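The health-check behaviour can be sketched as a small state machine. The thresholds used here (two consecutive failures to take a server out of service, three consecutive successes to bring it back) are assumptions chosen for the example, and the `HealthChecker` class is hypothetical rather than part of any AWS API.

```python
class HealthChecker:
    """Tracks one server's health from a stream of probe results."""

    def __init__(self, unhealthy_after=2, healthy_after=3):
        self.unhealthy_after = unhealthy_after  # consecutive failures to eject
        self.healthy_after = healthy_after      # consecutive passes to restore
        self.healthy = True
        self._fails = 0
        self._passes = 0

    def record(self, probe_ok):
        """Record one probe result and return whether traffic should be sent."""
        if probe_ok:
            self._passes += 1
            self._fails = 0
            if not self.healthy and self._passes >= self.healthy_after:
                self.healthy = True
        else:
            self._fails += 1
            self._passes = 0
            if self.healthy and self._fails >= self.unhealthy_after:
                self.healthy = False
        return self.healthy


checker = HealthChecker()
probes = [True, False, False, True, True, True]
states = [checker.record(ok) for ok in probes]
print(states)
```

Requiring several consecutive results in each direction stops a single dropped probe from ejecting a server, and stops a server that is still struggling from being returned to service too early.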
Load balancers and high availability zones provide a simple, cheap method of improving resilience in cloud infrastructures. Cloud Born systems take advantage of this to meet peak user demands and improve performance without human intervention, thus reducing costs and improving response times. Cloud Washed solutions can still take advantage of these systems but will be slow and expensive due to the manual intervention of expensive humans.