Future Technologies: Artificial Intelligence & The Perils Of Confirmation Bias

We continue our series considering technologies of the near future and how they might transform how we think about broadcast, with a discussion of the critical topic of training AI models and how this is potentially compromised from the outset by innate confirmation bias.

One of the points of demarcation between deep fakes and confirmation bias is that creating deep fakes using AI is often conscious, whereas the results of confirmation bias are often unconscious.

Another way of thinking about this is that creating deep fakes requires a purposeful action: it’s unlikely that somebody creating a video recording representing another person doesn’t realize that they are doing this. With unconscious bias, by definition, we don’t necessarily acknowledge our actions.

AI is not only a deeply technical subject but is also a discipline that goes to the core of human understanding, and much of it is built on statistics. Generally speaking, statistics falls into two camps: the first makes sense of a system through measurement, and the second predicts future outcomes, both using available data. The implication is that the more data we have available, the more accurate our statistical measurements and predictions. But this assumes that the available data is itself relevant, accurate and independent.

Anomalies Of Statistical Understanding

Statistics generates some interesting misunderstandings, the most amusing of these being the gambler’s fallacy, where the gambler assumes there is some kind of connection between events that are random and independent. Take tossing a fair coin, for instance: the probability of getting a head is ½, and the same for a tail. If, by some strange and interesting coincidence, the tossed coin falls on heads one hundred consecutive times, the probability of the next toss being a tail or a head doesn’t change. It is still ½. The consecutive events are independent and random, and this idea can be extrapolated to most forms of gambling, such as the roulette wheel.
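The independence of consecutive tosses can be checked with a short simulation. The following is a minimal sketch rather than a rigorous experiment: the function name, the streak length of five, the number of tosses and the seed are all assumptions chosen for illustration. It estimates how often a head follows a run of five or more heads.

```python
import random

def next_toss_after_streak(num_tosses=200_000, streak=5, seed=1):
    """Estimate P(heads) on the toss that follows `streak` or more consecutive heads."""
    rng = random.Random(seed)   # seeded so the run is repeatable
    run = 0                     # length of the current run of heads
    followers = 0               # tosses observed immediately after a qualifying run
    heads_after = 0             # how many of those tosses were heads
    for _ in range(num_tosses):
        heads = rng.random() < 0.5          # simulated fair coin
        if run >= streak:
            followers += 1
            heads_after += heads
        run = run + 1 if heads else 0
    return heads_after / followers

estimate = next_toss_after_streak()
print(f"P(heads | previous 5 tosses were heads) is roughly {estimate:.3f}")
```

The estimate should sit close to ½, because the coin has no memory of the run that came before.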

Before AI became mainstream, statisticians and scientists used methods of inference that provided classification and regression to try to predict events in the future. For example, take Ohm’s law, V=IR. Anybody looking at this through a regression lens may draw a comparison to the straight-line equation y=mx+c. Here, m is the gradient (R), y is the voltage (V), or dependent variable, and x is the current (I), or independent variable. The constant c disappears as no voltage is developed across a resistor without a current flow. When Ohm was formulating his experiments, he would have plotted his current and voltage measurements on a graph and found the best fit line through them. Using more modern techniques, we now use the method of least squares, which minimizes the sum of the squared errors between the line and the measurements, to determine the best fit line and hence a prediction. Using Ohm’s law, we can now predict the future voltage value across a resistor when we know the current that will pass through it.
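The regression view of Ohm’s law can be sketched in a few lines of Python using NumPy. The resistance of 100 ohms, the measurement noise and the range of currents below are invented for illustration and are not taken from any real experiment.

```python
import numpy as np

rng = np.random.default_rng(0)

true_resistance = 100.0                   # ohms, assumed for this illustration
current = np.linspace(0.01, 0.10, 10)     # amps, the independent variable x
noise = rng.normal(0.0, 0.05, current.size)
voltage = true_resistance * current + noise   # volts, the dependent variable y

# Least squares fit of y = m*x + c; for degree 1, polyfit returns [m, c].
m, c = np.polyfit(current, voltage, 1)

# Use the fitted line to predict the voltage at a current we did not measure.
predicted = m * 0.25 + c

print(f"gradient m (estimated R): {m:.2f} ohms")
print(f"intercept c: {c:.3f} volts")
print(f"predicted voltage at 0.25 A: {predicted:.2f} volts")
```

The gradient should land near the assumed resistance and the intercept near zero, which matches the observation that no voltage is developed without a current flow.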

One of the challenges of statistical regression, that is, trying to find the best fit line through a sea of datapoints, is that it relies on both domain expertise and a relatively low number of dimensions. Domain expertise should not be underestimated, especially when considering the skills shortage we have in broadcast television.

Reversing The Regression Process

AI, through machine learning (ML), flips this method on its head. Instead of relying on scientists to find the equation through domain expertise and determine the best fit line, we let the ML model do this for us through the concept of training. In essence, an ML model can be thought of as an architecture of interconnected neurons that have two kinds of parameters: weights and biases. The genius of ML is not just the neural-based model, but the method of learning that it uses, called backpropagation, or backprop, and the fact that it can work with hundreds or even thousands of dimensions, thus providing a much more generalized solution.

When training an ML model, the data is passed from the input to the output and the result is compared to a known measure, that is, the label (assuming supervised learning). Inevitably, a measurable difference will occur between the calculated output from the ML model and the actual observed data from the label. This error is then backpropagated through the model so that the weight and bias parameters in each neuron are updated. The data is once again passed through the network, a new error is calculated and backprop is used to update the neurons’ weights and biases. This process, known as training, continues until the model converges and the error of the calculation relative to the labels virtually disappears.
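The training loop described above can be sketched with a single neuron, which is far smaller than any practical model but follows the same cycle of forward pass, error calculation and parameter update. The labels here come from an invented relationship (y = 3x + 0.5), and the learning rate and number of passes are assumptions chosen so that this toy example converges.

```python
import numpy as np

rng = np.random.default_rng(0)

x = rng.uniform(-1.0, 1.0, size=200)   # input data
labels = 3.0 * x + 0.5                 # labels from an assumed "true" relationship

w, b = 0.0, 0.0                        # the neuron's weight and bias, untrained
learning_rate = 0.1

for epoch in range(500):
    output = w * x + b                 # forward pass: input to output
    error = output - labels            # difference between output and label
    grad_w = np.mean(2.0 * error * x)  # gradient of the mean squared error w.r.t. w
    grad_b = np.mean(2.0 * error)      # gradient of the mean squared error w.r.t. b
    w -= learning_rate * grad_w        # update the parameters against the gradient
    b -= learning_rate * grad_b

final_loss = np.mean((w * x + b - labels) ** 2)
print(f"w = {w:.4f}, b = {b:.4f}, final mean squared error = {final_loss:.2e}")
```

After enough passes, the weight and bias should settle close to 3 and 0.5, and the error relative to the labels becomes very small.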

Learning Through Backpropagation

Learning through backprop calculates the partial derivative of the error with respect to each weight and bias through application of the chain rule. Consequently, the act of learning is to look for a minimum of the model’s error for the applied data, ideally the global minimum. The great news is that anybody looking to use applied ML doesn’t need to know how backprop actually works, as there are multiple open-source software libraries available, such as Keras and PyTorch, which already do this. The ML architecture is based on the number of neurons, and each of these needs to have its gradients calculated with respect to the model’s error, which is why ML training takes so long. A model with millions of neurons is going to need each gradient calculated at every training point for each neuron. As GPUs are a massive array of parallel processors, they lend themselves beautifully to backprop processing as neurons can be mapped to individual GPU processors. Once training is complete, a generic model exists (assuming convergence) and the weights and biases don’t need to be changed again, and this is why inference is much quicker than training.
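To make the chain rule concrete, the sketch below works through backprop by hand for a deliberately tiny network of one hidden neuron and one output neuron. The tanh activation, the squared error and the parameter values are assumptions made for illustration; libraries such as Keras and PyTorch automate this calculation. The hand-derived gradients are compared with a numerical estimate as a sanity check.

```python
import numpy as np

def forward(params, x):
    w1, b1, w2, b2 = params
    h = np.tanh(w1 * x + b1)          # hidden neuron with tanh activation
    return w2 * h + b2, h             # linear output neuron

def loss_and_grads(params, x, label):
    """Squared error and its gradient with respect to each parameter."""
    w1, b1, w2, b2 = params
    y, h = forward(params, x)
    error = y - label
    loss = error ** 2
    dL_dy = 2.0 * error               # error with respect to the output
    dL_dw2 = dL_dy * h                # chain rule through the output neuron
    dL_db2 = dL_dy
    dL_dh = dL_dy * w2                # pass the error back to the hidden neuron
    dL_dz = dL_dh * (1.0 - h ** 2)    # derivative of tanh is 1 - tanh^2
    dL_dw1 = dL_dz * x
    dL_db1 = dL_dz
    return loss, np.array([dL_dw1, dL_db1, dL_dw2, dL_db2])

params = np.array([0.5, -0.2, 1.5, 0.1])   # arbitrary starting values
x, label = 0.8, 1.0
loss, grads = loss_and_grads(params, x, label)

# Numerical check: nudge each parameter and measure how the loss changes.
eps = 1e-6
numeric = np.zeros_like(params)
for i in range(params.size):
    up, down = params.copy(), params.copy()
    up[i] += eps
    down[i] -= eps
    numeric[i] = (loss_and_grads(up, x, label)[0]
                  - loss_and_grads(down, x, label)[0]) / (2 * eps)

print("backprop gradients: ", grads)
print("numerical gradients:", numeric)
```

Even in this toy case, every parameter needs its own gradient at every training point, which hints at why training a model with millions of parameters is so computationally expensive.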

The next genius of ML is that once the model is trained, it can process data it hasn’t seen before and provide an accurate output (assuming the inference data is in the same sample domain as the training data). How well the model performs on otherwise unseen data is a measure of the accuracy of a trained ML model. Through training, the converged, or trained, model provides a mapping between the input data and the output labels. In other words, the model has found a function that represents the data. This is why we don’t need domain expertise to find that function ourselves.
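The caveat about the sample domain can be illustrated with a simple stand-in for a trained model. In this sketch a polynomial fitted to samples of a sine wave plays the role of the model; the choice of function, the polynomial degree and the ranges are all assumptions for illustration. The fitted model is then tested on unseen points both inside and outside the range it was trained on.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Training": fit a degree-5 polynomial to samples of sin(x) taken between 0 and 3.
train_x = rng.uniform(0.0, 3.0, 200)
coefficients = np.polyfit(train_x, np.sin(train_x), 5)

def rms_error(x):
    """Root-mean-square error of the fitted model against the true function."""
    prediction = np.polyval(coefficients, x)
    return float(np.sqrt(np.mean((prediction - np.sin(x)) ** 2)))

unseen_in_domain = rng.uniform(0.0, 3.0, 200)       # new points, same range as training
unseen_out_of_domain = rng.uniform(6.0, 9.0, 200)   # new points, outside that range

in_domain_error = rms_error(unseen_in_domain)
out_of_domain_error = rms_error(unseen_out_of_domain)

print(f"error on unseen data inside the training range:  {in_domain_error:.5f}")
print(f"error on unseen data outside the training range: {out_of_domain_error:.2f}")
```

The model should cope well with unseen data from the range it was trained on, and badly with data from outside it.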

Data Accuracy Is Critical

This explanation is a very high-level overview of how an ML model learns, but even with this there are clearly some issues with how ML works in real-world applications. Predominantly, the model relies on accurate and representative data that has been correctly labelled. The use of labelled data is known as supervised learning. Another approach is unsupervised learning, where the model finds structure in the data without being given labels. This is an emerging area of ML that is finding applications in disciplines such as anomaly detection for network security. But for most applications found in broadcasting, including Gen-AI, we are still using supervised ML.

The huge elephant in the room for supervised ML is that it relies on labelled data to facilitate training. Who labels the data? Humans do. And humans suffer from unconscious bias.

Understanding why we are subject to unconscious bias is a constantly evolving area of psychology. Carl Jung had some impressive views on the workings of the human psyche, but psychology is a difficult subject to fathom as ethical scientists are limited in the measurements they can make on the human brain – assuming the physical brain represents the limits of our consciousness.

Unknown Decisions

Unconscious decisions represent the area of our decision making that we are not necessarily aware of. Why choose a blue shirt instead of a white one? Why choose tea over coffee? And why do some people tend more to the arts than the sciences and vice versa? The point is that there are many decisions and judgements we make every day of the week that we are not directly aware of, and these become biased when our thoughts tend to a specific belief system that is not mainstream. Unconscious decision making is just a consequence of the human condition. The real issue arises when that decision making becomes biased, that is, moves away from the social norms.

What do we mean by social norms? It depends: there are many cultures in the world with differing beliefs. Generally, we agree on many of these beliefs, as this helps solidify the basis of society, but on some we do not.

When labelling data, we run the risk of this unconscious bias permeating the ML model so that it in turn also becomes biased. This is a well-documented phenomenon and news reports are littered with examples of ML systems that demonstrate inherent bias. It’s important to remember that the model’s only source of truth is the labelled data that humans have provided for it. As far as we know, ML doesn’t have consciousness and is incapable of determining its own thought processes. In its most primitive form, it’s really just a huge look-up table, admittedly a very complicated look-up table, but nonetheless it cannot create its own understanding of society and is entirely reliant on the humans that provide the labelled training sets for it to learn.
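How a bias in the labels carries through to the model can be sketched with entirely synthetic data. Everything in the following example is hypothetical: a single score is the only feature that should matter, a second attribute called group is irrelevant to the task, and an imagined labeller applies a stricter pass mark to one group. A simple logistic regression, trained by gradient descent, stands in for a much larger ML model.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

score = rng.uniform(0.0, 1.0, n)    # the only feature that should matter
group = rng.integers(0, 2, n)       # hypothetical attribute, irrelevant to the task

# Hypothetical biased labeller: a stricter pass mark is applied to group 1.
threshold = np.where(group == 1, 0.7, 0.5)
labels = (score > threshold).astype(float)

# Logistic regression trained by gradient descent on the biased labels.
features = np.column_stack([score, group, np.ones(n)])
weights = np.zeros(3)
for _ in range(5000):
    prob = 1.0 / (1.0 + np.exp(-features @ weights))
    weights -= 0.5 * features.T @ (prob - labels) / n

def predict(score_value, group_value):
    """Model's estimated probability of a positive label."""
    z = weights @ np.array([score_value, group_value, 1.0])
    return 1.0 / (1.0 + np.exp(-z))

# Same score, different group: any gap comes only from the labels.
p_group0 = predict(0.6, 0)
p_group1 = predict(0.6, 1)
print(f"score 0.6, group 0: {p_group0:.3f}")
print(f"score 0.6, group 1: {p_group1:.3f}")
print(f"learned weight on the group attribute: {weights[1]:.2f}")
```

The model was never told to treat the two groups differently. It learns a negative weight on the group attribute purely because the labels it was given encode the labeller’s stricter pass mark.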
