IP Monitoring & Diagnostics With Command Line Tools: Part 12 - Pulling It All Together

When the distributed monitoring system is deployed and running, gather the results and present them on wall-mounted displays, desktop browsers, mobile phones or tablets.


Present the monitoring results in a useful and human-friendly fashion. Putting a display on the wall showing the current system status is straightforward. Drive the display with web pages that can also be viewed on desktops and mobile devices.

System Overview Displays

Implement the system status displays as web pages that automatically update the details on a regular basis. An inexpensive Raspberry Pi single-board computer with an HDMI output will drive a wall-mounted display screen. Auto-start the Raspberry Pi web browser with a preset starting page so the display comes up on its own after a reboot.

Similar web pages viewed on the support team desktop screens will support clickable widgets to call up more detailed information for each item. When the engineers visit the server room, they can observe the effects of their work on a tablet or rack-mounted console.

There are initially three basic kinds of display needed in an operation control centre:

  • Network diagram.
  • Status board.
  • Arrivals board.

There are many ways to visualise the measurement results data, especially if you want to drill down and analyse long-term trends. Add your own ideas for more diverse and useful displays.

Why Are The Symbolic Names So Important?

Every measurement is tagged with a hostname, measurement symbolic name and a timestamp.

When analysing measurements, the hostname is used as a filter to select one machine.

The timestamp is used to create time-windows, compare values against earlier measurements or ensure you have the latest recorded result.

The symbolic names propagate from the initial detection to the display manager. They must always be spelled consistently throughout because they are used to construct SQL queries, fetch cached data and merge results from several hosts.

The HTML widget elements in the display also use the symbolic names to create consistent ID values and embed metadata. The JavaScript code in the page can exploit that metadata to construct XHR requests that fetch new results from the caches or database and update the display with the latest data.
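As a minimal sketch of how these tags are used, assume each measurement reaches the browser as a plain object with host, name, timestamp and value fields. Those field names are illustrative assumptions, not something defined by the measurement probes. Selecting one machine and keeping only the latest result for each symbolic name might then look like this:

```javascript
// Hypothetical record shape: { host, name, timestamp, value }.
// timestamp is assumed to be seconds since the UNIX epoch.

// Keep only the most recent record for each host and symbolic name.
function latestByKey(measurements) {
  const latest = new Map();
  for (const m of measurements) {
    const key = m.host + "_" + m.name;
    const seen = latest.get(key);
    if (seen === undefined || m.timestamp > seen.timestamp) {
      latest.set(key, m);
    }
  }
  return latest;
}

// Use the hostname as a filter to select one machine.
function forHost(measurements, host) {
  return measurements.filter((m) => m.host === host);
}
```

The Map key uses the same host and symbolic name pairing as the element ID values, so a lookup result can be matched to a display item directly.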

Building The Network Diagram Screen

Draw a picture of your network in an illustrator app with each host node as a rectangle. Inside the rectangle, add placeholder text blocks with recognisable dummy strings to describe the measurements. Add a separate text block for each value you intend to update with new results. Now save the diagram as a Scalable Vector Graphic (SVG) file.

Open the SVG in a code editor to see the raw SVG code. Remove the unnecessary heading items. Look for your recognisable text strings. They may be inside <text> or <tspan> tags. Give each of those tags an ID value constructed from the {target-hostname} and {measurement-symbolic-name} separated by an underscore character (_). These ID values must each be unique within the page, so append a suffix if they appear more than once. Do not alter any other attributes on the tags.

<text id="NODE_NAME_DISK_SPACE">{percentage-value}</text>

Embed the SVG into your web page when you are done. The SVG is a first-class citizen in a web page and JavaScript interacts directly with the object model constructed from it.

Write the JavaScript to request the latest data from the database. Call the server with an XHR request to avoid reloading the page. Implement the SQL query in PHP and return a JavaScript Object Notation (JSON) formatted payload as a response. Parse the JSON result with JavaScript to extract the hostname, symbolic measurement name and the new value for each measurement.
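The request and parsing steps could be sketched as follows. The endpoint name and the JSON layout are assumptions made for illustration, so substitute whatever your own PHP script returns:

```javascript
// Hypothetical endpoint; the PHP script name and its JSON layout are
// assumptions for this sketch, not part of the design described here.
const RESULTS_URL = "/latest_results.php";

// Convert the JSON payload into an array of { host, name, value } items.
// Assumed payload:
//   [{"host":"NODE_NAME","name":"DISK_SPACE","value":"75%"}, ...]
function parseResults(jsonText) {
  const payload = JSON.parse(jsonText);
  if (!Array.isArray(payload)) {
    throw new Error("Expected a JSON array of results");
  }
  return payload.map((item) => ({
    host: String(item.host),
    name: String(item.name),
    value: String(item.value),
  }));
}

// Fetch the latest results without reloading the page (browser only).
function requestLatestResults(onResults) {
  const xhr = new XMLHttpRequest();
  xhr.open("GET", RESULTS_URL);
  xhr.onload = function () {
    if (xhr.status === 200) {
      onResults(parseResults(xhr.responseText));
    }
  };
  xhr.send();
}
```

Keeping the parsing in its own function means it can be checked without a browser or a running server.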

Iterate through the new results. Assemble the HTML Element ID in the script using the same rules as the diagram object and search for the object in the Document Object Model (DOM) using a getElementById() function call. The returned object has a textContent property. Store the new value there and the browser will update the display immediately. Here is a fragment of JavaScript to update a displayed item as an example:

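const myNewValue = "75%";
const myTargetHostName = "NODE_NAME";
const mySymbolicName = "DISK_SPACE";

// Build the element ID using the same rule as the diagram
const myTargetId = myTargetHostName + "_" + mySymbolicName;

// Locate the element and store the new value in it
const myTargetObject = document.getElementById(myTargetId);
myTargetObject.textContent = myNewValue;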
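To apply a whole batch of results rather than a single value, the same steps can be wrapped in a loop. This is a sketch that assumes each result is an object with host, name and value fields (illustrative names) and that skipping a missing element is acceptable:

```javascript
// Apply a batch of results to the page. Each result is assumed to be
// { host, name, value }; doc is normally the browser's document object.
function applyResults(results, doc) {
  let updated = 0;
  for (const result of results) {
    const targetId = result.host + "_" + result.name;
    const target = doc.getElementById(targetId);
    // Skip results with no matching element so that one missing ID
    // does not stop the rest of the update.
    if (target !== null) {
      target.textContent = result.value;
      updated += 1;
    }
  }
  return updated;
}
```

In the page, call it as applyResults(results, document). The returned count is a cheap way to spot ID values that have drifted out of step with the symbolic names.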

If you want to highlight the containing rectangle to indicate the host status, define the ID value to be just the host name:

<rect id="NODE_NAME" ... />

Use JavaScript to locate the host named rectangle object and change the fill property with a new colour to indicate the node status:

const myHostObject = document.getElementById("NODE_NAME");
myHostObject.style.fill = "red";
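One way to keep the colours consistent across the diagram is a small lookup from status to fill colour. The status names and colours below are hypothetical and only illustrate the idea:

```javascript
// Hypothetical status names and colours; adjust to your own conventions.
const STATUS_COLOURS = new Map([
  ["ok", "green"],
  ["warning", "orange"],
  ["failed", "red"],
]);

// Colour the rectangle whose ID is the host name.
// Returns false when the page has no rectangle for that host.
function showHostStatus(doc, hostName, status) {
  const rect = doc.getElementById(hostName);
  if (rect === null) {
    return false;
  }
  // Unknown status values fall back to grey rather than failing.
  rect.style.fill = STATUS_COLOURS.get(status) ?? "grey";
  return true;
}
```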

Encapsulate the whole update process in a JavaScript function and call it with a setInterval() timer to schedule it to run on a regular basis. Every minute is fine since that is the granularity of cron when it runs the measurement probes.
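A minimal scheduling sketch is shown here. The function name is illustrative, and running the first update immediately instead of waiting a full minute is a design choice rather than a requirement:

```javascript
// One minute, matching the granularity of cron.
const UPDATE_INTERVAL_MS = 60 * 1000;

// Run the update once straight away, then again on every interval.
// updateDisplay is whatever function fetches and applies the new results.
function startUpdates(updateDisplay, intervalMs = UPDATE_INTERVAL_MS) {
  updateDisplay();
  return setInterval(updateDisplay, intervalMs);
}
```

Keep the returned timer ID if you ever need to pause the display with clearInterval().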

Building The Status Board Screen

The status board is a web page whose layout is dynamically controlled by a database table. This layout steering table has the following columns:

Column | Description
Primary key ID | Used to create a unique HTML element ID on the page.
Host name | Required to filter results from the measurement cache in the database.
Symbolic process name | Identifies which result value to use as a value source.
Selected widget type | The type of widget display determines whether we only need the latest value for a numeric cell or a range of values to draw a small graph. Other formats are possible.
Left | Left position on screen.
Top | Top position on screen.
Width | Width of the widget container box.
Height | Height of the widget container box.
Background colour | The default background colour.

The page building logic requests the display controlling records and iterates through them. Each one provides the information needed to dynamically create a <div> element and position it on the page:

Set the ID of the <div> block container to a unique value:

<div id="widget_{primary-key-id}">

Construct a CSS style block from the database values and add it as a style="" attribute inside the opening <div> tag. The values shown here would be derived from the database query result:

style="position: absolute;
       top: {top-value}px;
       left: {left-value}px;
       height: {height-value}px;
       width: {width-value}px;
       background-color: {colour-value};"
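If the page is assembled in the browser rather than on the server, the same container can be created through the DOM. This sketch sets the individual style properties instead of assembling a style string, and the row field names are assumptions standing in for the columns of the layout steering table:

```javascript
// Hypothetical row shape from the layout steering table:
// { id, left, top, width, height, colour }
function createWidgetContainer(doc, row) {
  const div = doc.createElement("div");
  div.id = "widget_" + row.id;
  div.style.position = "absolute";
  div.style.top = row.top + "px";
  div.style.left = row.left + "px";
  div.style.height = row.height + "px";
  div.style.width = row.width + "px";
  div.style.backgroundColor = row.colour;
  return div;
}
```

Append the returned element to the page with document.body.appendChild() or to whichever parent element holds the status board.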

The inner content of the <div> block depends on the type of widget. A simple value can place a number inside a <span> block. A series of values could have a small table grid. You could insert an SVG to draw a graph or insert an image (<img>) tag to mimic a display indicator LED or other iconic symbol. Carefully factor the design of these widgets, and create a library of reusable code to draw them.

Use the host and symbolic names to identify values within the widgets and update them periodically like the network status diagram. The refresh logic can update graphs, pie-charts and progress-bars. If you design your widget collection properly, the same drawing code can be reused multiple times.
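One possible shape for such a widget library is a table of drawing functions keyed by widget type. The type names here are hypothetical, and a real library would add graph and indicator renderers alongside them:

```javascript
// Hypothetical widget type names. Each renderer fills the container
// from an array of values, with the latest value last.
const widgetRenderers = new Map([
  ["number", (container, values) => {
    container.textContent = String(values[values.length - 1]);
  }],
  ["list", (container, values) => {
    container.textContent = values.join(", ");
  }],
]);

// Draw one widget using the renderer selected by the steering table.
function renderWidget(type, container, values) {
  const render = widgetRenderers.get(type);
  if (render === undefined) {
    throw new Error("Unknown widget type: " + type);
  }
  render(container, values);
}
```

Adding a new kind of widget then means registering one more function, while the page building and refresh logic stay unchanged.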

Building The Arrivals Board Screen

An arrivals board is similar to the one you see at an airport. Use this to display the media processing queues. Track the workflow job status dispositions in a simple table grid. Each row represents one job running through the workflow. The columns indicate the various attributes of the jobs. The workflow manager can update a cache with progress information as the jobs run. That cached progress status can be acquired by the arrivals board update logic.

Here are some ideas for the columns you might want to implement:

Column | Description
Job name | Identifies the job.
Submitter | Identifies who submitted the job.
Type | You may be processing multiple kinds of jobs.
Current disposition | What stage of processing the job is currently at.
Status | Indicates whether the job is waiting, running, completed or failed.
Submit timestamp | When the job was submitted.
Processing started | When the job started processing.
Completion timestamp | When the job completed.
Location | Node name where the job is running.
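Redrawing the grid from the cached job list could be sketched like this. The field names are assumptions, one per column in the table above, and the dash shown for missing values is an arbitrary choice:

```javascript
// Hypothetical field names, one per column of the arrivals board.
const JOB_COLUMNS = [
  "jobName", "submitter", "type", "disposition", "status",
  "submitted", "started", "completed", "location",
];

// Turn one job record into an ordered list of cell strings.
// Missing values (for example, no completion time yet) show as "-".
function jobToCells(job) {
  return JOB_COLUMNS.map((column) => {
    const value = job[column];
    return value === undefined || value === null ? "-" : String(value);
  });
}

// Redraw the table body (a <tbody> element) from the cached job list.
function renderArrivals(tbody, jobs) {
  tbody.replaceChildren(); // remove the previous rows
  for (const job of jobs) {
    const row = tbody.insertRow();
    for (const text of jobToCells(job)) {
      row.insertCell().textContent = text;
    }
  }
}
```

Call renderArrivals() from the same kind of timer that refreshes the other displays.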



The Report Manager

Each morning, the central cortex gathers measurements from the caches and runs the daily analysis. Filter and process the results and deliver the daily, weekly, monthly and other reports automatically by email.

Collating the results into reports or aggregating them for display from a SQL database cache is straightforward. Use TCPDF with PHP to generate PDF reports and PHPMailer to dispatch them. Both of these libraries are open source and easy to use.

Conclusion

Very few code changes are needed to alter the behaviour when the measuring system is data-driven. This significantly reduces the maintenance overhead. Because things are controlled by data and configuration files, implementing a dashboard control surface is quite easy.

Create new measurement tools and drop them into one of the probe containers. The only new code to write is the kernel of each new measurement.

Modifying the layout and content of the status display requires only minimal changes to the SQL database table that steers the display generator. The dashboard can manage those changes.

The major key to flexibility in this design is the use of unique symbolic names for each measurement and how they are propagated through the entire monitoring complex.

In closing, here are some prime directives to bear in mind when designing your own monitoring solution:

  • Always look for opportunities to pass data values to modify script behaviour rather than duplicating code.
  • Go the extra mile in your code so your users do not have to perform complex actions.
  • Design for easy maintainability rather than brevity, and avoid obfuscated single lines of code.
  • Avoid namespace collisions with carefully designed file system structures.
  • Use defensive coding techniques to pre-empt problems.
  • Comment everything in the source code.
  • Design your data flow so that things only need to be defined and measured once.
  • Document everything thoroughly and keep it up to date with changes.
  • Provide contextual online help to the users where appropriate.
  • Read the UNIX man pages in full for a command before using it and look online for examples that illustrate how it is used.
