This is part two of the What is: AI series. You can find part one here.
I’ve always found the term Machine Learning to be similar to Artificial Intelligence, in that on a surface level, it feels straightforward, but when you start thinking about it, it immediately gets a lot more slippery.
In part one of this series on AI, we spent a lot of time going over the historical background of why intelligence, and especially artificial intelligence, is so hard to define. So now we will approach the issue from the other end, by exploring the field of machine learning and finding out how a modern artificial intelligence actually works.
How Machines Learn
Remember the baby learning about sheep in the previous article? The human baby has a brain which it can use to store information about sheep, eyes it can use to tell its brain about the sheep it saw, and hands to draw the sheep it sees in its mind. AI has none of that. So how does it work? How do machines actually learn? The answer might not surprise you.
It’s math. It’s all just math.
Toy Example
Let’s say you have a wall of unknown height. You want an AI that can tell you how high your head will be above the ground when you stand on this wall.
Your mathematical formula would be something like this:
your height + wall height = height above ground
Now we need some data for training:
You are about 2 meters tall (units not important) and when you stand on the wall, you are about 5 meters above the ground.
2 + wall height = 5
According to the formula, and the current dataset, the wall is 3 meters tall.
Now for another data point, let’s say I am 1.5 meters tall, I am standing on the wall, and I am 3 meters above the ground.
1.5 + wall height (right now 3) = 3
That doesn’t work, does it. Which means that the wall height needs to be adjusted a bit:
1.5 + wall height (more or less 2) = more or less 3
And we can check your data point again:
2 + wall height (more or less 2) = more or less 5.
Well, it’s not as good as before. It’s certainly not perfect. But given the two data points we’ve looked at so far, it’s a decent compromise.
And now we do this for another 2 million people. With each new measurement, the AI will adjust the value of the wall height to find the best compromise. This part of the process, where the AI looks at data and adjusts its numbers, is called training.
Let’s say after 2 million iterations, the wall height the AI has settled on is 2.33 meters. The value hasn’t changed in about 200,000 data points, so it’s safe to say the AI has finished learning. Now here comes your friend Jack, who is 1.6 meters tall. Jack is scared of heights and doesn’t want to get on the wall. But that’s okay, because the AI can tell you that if he got on the wall, Jack would be about:
1.6 + 2.33 = 3.93 meters above the ground
And now AI has just helped us solve a problem.
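If you’d like to see the toy example as code, here is a minimal Python sketch. The article doesn’t say exactly how the AI finds its compromise, so I’m assuming the simplest rule I know: keep a running average of the wall height each data point suggests. The function names `train_wall_height` and `predict` are made up for this illustration.

```python
def train_wall_height(data):
    """Learn one parameter, the wall height, from (person_height, height_above_ground) pairs."""
    wall_height = 0.0  # initial guess before seeing any data
    for count, (person_height, height_above_ground) in enumerate(data, start=1):
        # What this single data point says the wall height should be.
        implied = height_above_ground - person_height
        # Nudge the current estimate toward it. Later points nudge less,
        # so the estimate ends up as the average of everything seen so far.
        wall_height += (implied - wall_height) / count
    return wall_height


def predict(person_height, wall_height):
    # The formula from the text: your height + wall height = height above ground.
    return person_height + wall_height


measurements = [(2.0, 5.0), (1.5, 3.0)]  # the two data points from the text
wall = train_wall_height(measurements)
print(wall)                # 2.25, the compromise between 3 and 1.5
print(predict(1.6, wall))  # Jack's predicted height above the ground
```

With only two measurements, the compromise is 2.25 meters, the "more or less 2" from above. Feed it 2 million measurements and the same loop keeps refining that one number.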
Beyond the Numbers
So that’s how machine learning works. It looks at all the information at its disposal, and calculates the mathematical values that fit the information best. Then, when it is presented with a new piece of information, it can use the chosen values to predict the expected result.
Of course this is easier in cases where we are dealing with numbers, like measuring the height of walls, or predicting risks involved in a surgery given a patient’s age, weight, and medical scores. Numbers are easy to use in math. But how does this work with texts, or pictures, or even video?
Quite easily, actually. Words can be translated into numbers. Each letter can be a number, or each word in a dictionary can be a number. And once you’ve transformed your words into numbers, you can just as easily feed them to an AI. This is a field known as Natural Language Processing, or NLP for short.
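Here is a small sketch of the "each word in a dictionary can be a number" idea. It is only an illustration under simple assumptions: words are split on spaces, lowercased, and numbered in the order they first appear. Real NLP systems use more refined schemes, and the names `build_vocabulary` and `encode` are my own.

```python
def build_vocabulary(texts):
    """Assign each distinct word a number, in order of first appearance."""
    vocabulary = {}
    for text in texts:
        for word in text.lower().split():
            if word not in vocabulary:
                vocabulary[word] = len(vocabulary)
    return vocabulary


def encode(text, vocabulary):
    # Replace each known word by its number; unknown words are skipped here.
    return [vocabulary[word] for word in text.lower().split() if word in vocabulary]


vocab = build_vocabulary(["the sheep eats grass", "the baby sees the sheep"])
print(encode("the baby eats", vocab))  # [0, 4, 2]
```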
Images can also become sets of numbers easily, and working with them is a subset of AI known as Computer Vision. In fact, digital photographs are already stored and shared as huge sets of numbers. Each pixel in the picture has a number for its position on the width, one for its position in the height, one for its color, another number for its brightness, and so on. To do the same with video, you just need another number indicating the second at which the pixel occurs.
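To make that concrete, here is a tiny made-up image written out as numbers. I’m assuming a common layout where each pixel is three color values (red, green, blue) from 0 to 255, and where a pixel’s position is simply its row and column in the grid. Actual file formats vary and usually compress the data.

```python
# A 2x2 image: each pixel is three numbers (red, green, blue), 0 to 255.
image = [
    [(255, 0, 0), (0, 255, 0)],      # top row: a red pixel, a green pixel
    [(0, 0, 255), (255, 255, 255)],  # bottom row: a blue pixel, a white pixel
]


def flatten(image):
    """Turn the grid of pixels into one long list of numbers."""
    return [channel for row in image for pixel in row for channel in pixel]


numbers = flatten(image)
print(len(numbers))  # 12: 2 rows x 2 columns x 3 color values
```

A single phone photo has millions of pixels, so the list an AI has to deal with gets long very quickly.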
Anatomy of an AI
Now in our toy formula above, we only had to learn one number, the wall height. But as you can imagine, for texts and especially pictures and video, there are a lot more numbers involved than that. Modern AIs often take several million, or even billions, of these numbers, called parameters. (GPT-3, a predecessor of the model behind ChatGPT, has 175 billion.) These parameters are always paired with a matching mathematical formula. In our case, this was the calculation:
your height + wall height = height above ground
This is called the model. Together, the model and the parameters make a modern AI.
Machine Learning Models
While parameters control the actual learning of an AI, what the AI will actually do is in the hands of the mathematical formula selected, that is to say the model. And selecting the model, now that is complicated. In fact, the field of machine learning on an academic level can simply be summed up as the discovery and study of models for the purpose of creating AI. And it’s not just Computer Science anymore. There’s a heavy overlap with the fields of mathematics and statistics.
These mathematical models take lots of different shapes and forms, each with many variants and levels of complexity. Different models achieve different goals in different ways, and work better with different types of data, like texts or pictures. They tend to be very large, and have lots of subsections that feed into each other and occasionally even loop back. Each of these subsections is known as a layer. You may have heard the term deep learning. Deep learning is just machine learning with a formula that has a lot of layers, and consequently is slow to learn and expensive to train. But oh how clever when the training is complete!
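As a rough picture of what "layers that feed into each other" means, here is a sketch with two tiny layers. Everything specific in it is an assumption for illustration: the shape of a layer (weighted sums plus an offset, with negative results clipped to zero), the names, and the parameter values, which I picked by hand instead of learning them.

```python
def layer(inputs, weights, biases):
    """One layer: each output is a weighted sum of the inputs plus a bias, clipped at zero."""
    outputs = []
    for row, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(row, inputs)) + bias
        outputs.append(max(0.0, total))
    return outputs


# Hypothetical parameters, chosen by hand rather than learned.
layer1 = {"weights": [[1.0, -1.0], [0.5, 0.5]], "biases": [0.0, 1.0]}
layer2 = {"weights": [[2.0, 1.0]], "biases": [-1.0]}


def model(inputs):
    # The output of the first layer is the input of the second.
    hidden = layer(inputs, layer1["weights"], layer1["biases"])
    return layer(hidden, layer2["weights"], layer2["biases"])


def count_parameters(*layers):
    total = 0
    for lyr in layers:
        total += sum(len(row) for row in lyr["weights"]) + len(lyr["biases"])
    return total


print(model([3.0, 1.0]))                 # [6.0]
print(count_parameters(layer1, layer2))  # 9
```

Even this two-layer toy already has 9 parameters. Add more layers and wider ones, and you can see how the count climbs into the millions.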
Neural Networks
One type of model is especially famous, and it deserves its own shout-out. Remember how we compared an AI being trained to a human baby learning? Neural networks are a type of machine learning model literally inspired by the structure of the human brain.
The human body feeds electrical signals from the nervous system into the brain, and neurons process this input to interpret it, but also evolve to learn from it.
Artificial neural networks feed numbers from data into a mathematical formula, and parameters process this input to interpret it, but also evolve to learn from it.
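Here is a sketch of a single artificial "neuron" doing both things: interpreting its input, and adjusting its parameters to learn from it. The update rule below (shift each parameter a little in the direction that shrinks the error) is one common choice that I am assuming for illustration; the names and the learning rate of 0.1 are made up.

```python
def neuron(inputs, weights, bias):
    # Weighted sum of the inputs plus a bias: the neuron's "interpretation".
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def learn_step(inputs, target, weights, bias, rate=0.1):
    """Adjust the parameters slightly so the output moves toward the target."""
    error = neuron(inputs, weights, bias) - target
    new_weights = [w - rate * error * x for w, x in zip(weights, inputs)]
    new_bias = bias - rate * error
    return new_weights, new_bias


weights, bias = [0.0, 0.0], 0.0  # start out knowing nothing
for _ in range(50):
    weights, bias = learn_step([1.0, 2.0], 3.0, weights, bias)

print(round(neuron([1.0, 2.0], weights, bias), 3))  # 3.0
```

After 50 small adjustments, the neuron answers 3.0 for the input it was shown, even though nobody told it which parameter values to use.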
They’re the same, see? Except that the human brain is still much faster, lighter, more compact, more energy-efficient, more adaptable, and all-around more multi-talented than our current clunky artificial models. As far as supercomputers go, human brains still rule.
Aims of AI
One of the fundamental questions faced by any AI engineer selecting a model for a new AI is what they aim to achieve. For machine learning models, these goals broadly fall into two groups: categorizing something existing, or creating something new.
One of the big buzzwords of the last few months in AI circles has definitely been Generative AI. For perhaps not so surprising reasons, its more common but decidedly less sexy counterpart, Discriminative AI, hasn’t had the same hype. In fact, the term is so rarely used that my spell-check keeps trying to correct it. But understanding the distinction is an important part of understanding AI today.
Discriminative AI
First of all, this is not as bad as it sounds. The term “discriminative” is used here in its original Latin meaning, discriminare, “to divide, separate”. Most AI that exists today is discriminative, in the sense that its job is to separate things into groups.
A common example of this is grouping emails into spam and non-spam categories. Once a discriminative machine learning model has been taught how to tell the difference between a wanted email and a spam email, it can be used to predict whether a new email is spam or not, and handle incoming emails accordingly.
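A toy version of such a spam sorter might look like the sketch below. It is deliberately naive and not how real spam filters work: it just counts how often each word appeared in spam versus wanted emails, and lets the words in a new email vote. The training emails are invented for the example.

```python
from collections import Counter


def train(examples):
    """Count how often each word shows up in spam and in wanted emails."""
    counts = {"spam": Counter(), "wanted": Counter()}
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts


def classify(text, counts):
    # Each word votes for the category in which it was seen more often.
    score = 0
    for word in text.lower().split():
        score += counts["spam"][word] - counts["wanted"][word]
    return "spam" if score > 0 else "wanted"


# Made-up training data, purely for illustration.
examples = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting notes for monday", "wanted"),
    ("lunch on monday", "wanted"),
]
counts = train(examples)
print(classify("claim your free prize", counts))  # spam
print(classify("notes for lunch", counts))        # wanted
```

Here the word counts play the role of the parameters, and the voting rule is the model.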
Another example is facial recognition. First the AI has to identify and separate the part of the image or video containing the face. Then it needs to compare the face against a database of existing faces, and see if there are any matches.
And of course chess artificial intelligences, and other gaming AIs, are all discriminative. After all, all they need to do is compare all possible moves at a given point in the game and, based on hundreds of thousands of previous games, choose the best one. Easy, right?
Discriminative AI might not be as buzz-worthy as generative AI, but it is very valuable in that it does the job of scanning and organizing massive amounts of information, a job most humans find very boring, and generally does it very well.
Generative AI
Generative AI is much more controversial than discriminative AI, because unlike its busy little sorting robot counterpart, generative AI often does things humans actually enjoy doing themselves. Generative AI, as its name indicates, generates new data based on patterns identified in existing data.
For example, after looking at millions of oil paintings and identifying the patterns that exist in them, a generative machine learning model can draw new oil paintings that perfectly match the style. After looking at millions of sheets of classical music, it can compose new classical music. It can write screenplays and novels. It can even make pictures and videos of people saying and doing things they never said or did. These are known as deepfakes, and they are hugely problematic, particularly for politicians and public figures.
Generative machine learning models have experienced huge advances in the past couple of years, and gained a lot of publicity with ChatGPT, a chatbot which looked at millions of lines of written conversation on the internet in order to learn to interpret questions and generate appropriate answers.
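To give a feel for "generating new data based on patterns in existing data", here is a very small text generator. It is a stand-in that is far simpler than anything behind ChatGPT: the only pattern it learns is which word was seen following which. The training sentence and the function names are made up for this sketch.

```python
import random
from collections import defaultdict


def learn_patterns(text):
    """Record which words were seen following each word."""
    followers = defaultdict(list)
    words = text.split()
    for current, following in zip(words, words[1:]):
        followers[current].append(following)
    return followers


def generate(followers, start, length, seed=0):
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    words = [start]
    # Keep picking a word that was seen after the previous one.
    while len(words) < length and followers.get(words[-1]):
        words.append(rng.choice(followers[words[-1]]))
    return " ".join(words)


corpus = "the sheep eats grass and the baby sees the sheep and the sheep sees the baby"
patterns = learn_patterns(corpus)
print(generate(patterns, "the", 8))
```

The output is a new sentence stitched together from word pairs that appeared in the training text. The generator has no idea what a sheep is. It only knows what tends to come next.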
Something I want to address right away, because I’ve seen this misunderstanding a lot, is that by itself, generative text AI does not fact-check. For example, if you ask an AI to cite a specific section of the tax code, it will not check the tax code. It will make something up that looks like that section of the tax code. This is known as a hallucination.
Once again, generative AI is not Google Search. It does not look up existing information. It creates new information it thinks you want to see. In the age of AI, it is vital to understand this. Otherwise you might end up like that lawyer who asked ChatGPT to write his court filing for him, only to discover that the AI made up a bunch of fake cases to support him.
In the AI’s defense, it was only trying to help.
The Overlap
Now the above may have given the impression that any machine learning model is either discriminative or generative, but it is worth noting that many models can, in fact, do both. Historically, discriminative AIs were more common, because it’s much easier to teach an AI to sort something that exists than to create something new. But with certain types of AI, going from discriminative to generative is merely a matter of configuration. After all, once you learn how to recognize a face in a picture, you have also gained some ability to draw a face yourself.
Some AI models even capitalize on this fact. One example would be Generative Adversarial Networks, or GANs for short. GANs actually train two machine learning models at once, one generative and one discriminative, by pitting them against each other. The generative model has to create a new piece of data, which is mixed in with real data, and the discriminative model has to try to pick out the fake. In other words, the AIs no longer just learn from humans, but also from each other.
No need to get out the pitchforks yet, though. They still do this under the supervision of a human.
What’s Next (On This Blog)
In this second part, we’ve now successfully reviewed machine learning and how modern AI actually works, and looked into the goals AI engineers try to achieve. But there is one more aspect of AI left for us to cover, and that is the data. Math is all well and good, but any AI, no matter how cleverly set up, lives and dies by the quality of the data that it is trained on. And so that is what we will end this series with.