Machine learning: a term I’m sure you’ve heard of (especially if you’re here). Recently, it’s been the big thing that comes with some even bigger claims, but has anyone really explained much about it to you? Here, we’re going to go over some basics of machine learning: what it is, why it’s important, and how it works.
What is Machine Learning?
Machine learning (ML) is a subfield of AI that focuses on making programs that improve themselves by learning. It allows a system to learn from experience instead of being explicitly programmed to do a task from the start. Rather than telling a computer specifically what it can and can't do with hard-coded rules, ML gets closer to how humans learn by giving a machine the tools to ingest large quantities of data and teach itself the rules, similarities, and relationships between pieces of information through experimentation.
Why Machine Learning?
Machine learning is useful for tasks whose rules can’t be written down as a small set of clear-cut, perfectly consistent conditions. Early breakthroughs in AI (like turning a computer into a checkers master) were manageable to program by hand because the rules and factors that the program had to account for were not only simple to define mathematically, but perfectly consistent. To carry on the example: while the number of moves you have to choose from grows exponentially, a checkers piece can only ever do a handful of things: move, not move, take, or be taken.
However, a task like telling the difference between pictures of checkers pieces, hockey pucks, and Frisbees quickly becomes impractical for a programmer to hand-write an AI for, because the number of factors needed to make an accurate assessment is vastly higher. You would not only have to define what each of the 3 things is and how they differ from each other in a way that a computer understands, but also account for the condition of the object, the angle and distance the picture was taken from, the visual context, and so on.
These are all things we as humans can generally do pretty reliably and intuitively. Unfortunately, intuition is still pretty hard for a computer, but ML gets a bit closer by providing a program with the tools and incentives to learn how to complete a task in a way that makes sense for a computer.
Components of an ML system
ML systems generally consist of 3 components: the dataset, the features, and the algorithm.
A dataset is a large collection of relevant data (e.g. photos, spreadsheets, video, sound, text, raw data, etc.) that the ML system uses as samples to learn from. A high-quality dataset is crucial to the machine learning process: imagine how hard it would be to learn how to read when all you have are smudged letters.
Features are the parts of each sample that the ML system should look at to learn from (for example, the size and colour of an object in a photo). A dataset can be labeled, meaning each sample comes tagged with the answer the system should learn to produce, or unlabeled, depending on how it will be used (more on that later). One thing to note is that the job of creating labeled datasets is being increasingly outsourced to you.
Yes: you. Thank you very much!
I’m sure you’re aware of CAPTCHAs: those little tests where you write out a distorted word or click on pictures of traffic lights to prove you’re a human every time you try to comment, make a purchase, or log in somewhere. Well, those answers you provide help generate labeled datasets for a litany of purposes, from digitizing books to teaching self-driving cars.
An algorithm is, very basically, a series of instructions that a computer uses to complete a task. In machine learning, the algorithm is the process that the program moves through to learn. As with anything, there are many different ways to approach a task, and some are more suitable than others in certain situations.
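To make the three components a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the measurements are made up, the feature names (diameter_cm, weight_g) are hypothetical, and the nearest-neighbour algorithm is just one simple choice among many.

```python
# Dataset: a handful of made-up, labeled samples.
dataset = [
    {"diameter_cm": 3.0, "weight_g": 5.0, "label": "checkers piece"},
    {"diameter_cm": 3.2, "weight_g": 6.0, "label": "checkers piece"},
    {"diameter_cm": 7.6, "weight_g": 165.0, "label": "hockey puck"},
    {"diameter_cm": 7.5, "weight_g": 160.0, "label": "hockey puck"},
    {"diameter_cm": 27.0, "weight_g": 175.0, "label": "frisbee"},
    {"diameter_cm": 27.5, "weight_g": 170.0, "label": "frisbee"},
]

# Features: the parts of each sample the system is told to look at.
FEATURES = ("diameter_cm", "weight_g")


def extract_features(sample):
    """Turn one sample into a tuple of numbers, one per feature."""
    return tuple(sample[name] for name in FEATURES)


def squared_distance(a, b):
    """Squared straight-line distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Algorithm: answer with the label of the most similar known sample.
def nearest_neighbour(samples, query):
    closest = min(
        samples,
        key=lambda s: squared_distance(extract_features(s), query),
    )
    return closest["label"]


# An object 7.4 cm wide weighing 158 g is closest to the hockey puck samples.
print(nearest_neighbour(dataset, (7.4, 158.0)))
```

Swapping any one of the three parts (more samples, different features, a different algorithm) changes what the system can learn, which is why all three matter.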
Types of algorithms
Algorithms can be categorised by the way they approach learning. Below are some examples of different widely used algorithmic approaches to machine learning.
Supervised learning is what it sounds like: a programmer guides the ML system toward a desired outcome.
They do this by providing a dataset in which each sample is labeled with the answer that they would like the ML system to learn to produce. The program then learns patterns from those samples and is corrected whenever its answer disagrees with the label, until it can perform its task with a satisfactory degree of accuracy.
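As a rough illustration of that learn-and-correct loop, here is a small sketch using a perceptron, one of the simplest supervised algorithms. The data points, the learning rate, and the function names are all assumptions made up for this example.

```python
# Labeled dataset (made up): each entry is (features, label).
labeled_data = [
    ((1.0, 1.0), 0), ((2.0, 1.5), 0), ((1.5, 2.0), 0),
    ((4.0, 4.5), 1), ((5.0, 4.0), 1), ((4.5, 5.0), 1),
]


def predict(weights, bias, features):
    """Answer 1 if the weighted sum of the features is positive, else 0."""
    activation = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if activation > 0 else 0


def accuracy(weights, bias, data):
    """Fraction of samples whose prediction matches the label."""
    correct = sum(predict(weights, bias, x) == y for x, y in data)
    return correct / len(data)


def train_perceptron(data, learning_rate=0.1, max_epochs=1000):
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in data:
            error = label - predict(weights, bias, features)
            if error != 0:
                # The label "corrects" the model: nudge it toward the answer.
                mistakes += 1
                weights = [
                    w + learning_rate * error * x
                    for w, x in zip(weights, features)
                ]
                bias += learning_rate * error
        if mistakes == 0:
            break  # every training sample is now classified correctly
    return weights, bias


weights, bias = train_perceptron(labeled_data)
# An accuracy of 1.0 means no training sample is misclassified.
print(accuracy(weights, bias, labeled_data))
```

Note that the "correction" here comes from the labels in the dataset rather than from a person watching over each step, which is how it usually works in practice.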
Unsupervised learning is the flip side of supervised learning: a programmer provides an unlabeled dataset and lets the ML program search for recurring patterns independently. Unsupervised learning is good for data analytics, as it can sometimes recognise patterns in massive datasets that a human couldn’t. This is for 2 main reasons: 1) an ML algorithm can analyse massive datasets much faster than a human, and 2) because it isn’t told what to look for, it is less steered by the assumptions that a human would approach the problem with (although it can still pick up biases that are present in the data itself).
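One common way to search for patterns in unlabeled data is clustering. The sketch below uses k-means, a widely used clustering algorithm, on a few made-up 2D points. The points, the choice of 2 clusters, and the naive starting centroids are assumptions for illustration only.

```python
def squared_distance(a, b):
    """Squared straight-line distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise average of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def k_means(points, k, iterations=10):
    centroids = list(points[:k])  # naive start: the first k points
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the middle of its points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = mean_point(members)
    return assignments, centroids


# No labels anywhere: just raw points that happen to form two groups.
points = [
    (1.0, 1.0), (1.5, 2.0), (2.0, 1.0),
    (8.0, 8.0), (9.0, 8.0), (8.0, 9.0),
]
cluster_ids, centroids = k_means(points, k=2)
print(cluster_ids)  # [0, 0, 0, 1, 1, 1]
```

The program is never told what the two groups mean. It only discovers that they exist, and it is still up to a human to interpret them.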
Semi-supervised learning
Semi-supervised learning is a mixture of supervised and unsupervised learning: the ML program is fed a dataset that is a mixture of labeled and unlabeled samples. Semi-supervised learning is useful not only for focusing the results compared to fully unsupervised learning, but also as a cost-saving method: procuring properly labeled datasets often requires a skilled human, and the larger the dataset, the less feasible it becomes to label all of it.
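There are many ways to mix labeled and unlabeled samples. One simple approach is self-training, where the program labels the unlabeled samples it is most sure about and then treats those guesses as if they were real labels. The sketch below does this with one-dimensional made-up numbers and a nearest-neighbour rule. The data, the "small" and "large" labels, and the function name are all hypothetical.

```python
def self_train(labeled, unlabeled):
    """Spread labels outward from a few labeled values to unlabeled ones."""
    labeled = list(labeled)      # copy so the caller's list is untouched
    remaining = list(unlabeled)
    pseudo_labels = {}
    while remaining:
        # Find the unlabeled value that sits closest to any labeled value.
        _, value, label = min(
            (
                (abs(candidate - known), candidate, known_label)
                for candidate in remaining
                for known, known_label in labeled
            ),
            key=lambda item: item[0],
        )
        # Trust that guess and reuse it as a label in later rounds.
        pseudo_labels[value] = label
        labeled.append((value, label))
        remaining.remove(value)
    return pseudo_labels


# Only two samples were labeled by a human; the rest are unlabeled.
labeled = [(1.0, "small"), (9.0, "large")]
unlabeled = [1.5, 2.2, 3.0, 8.1, 7.4, 6.8]

pseudo = self_train(labeled, unlabeled)
print(pseudo[3.0], pseudo[6.8])  # small large
```

Two human-made labels end up covering eight samples, which is the cost saving in miniature. The catch is that an early wrong guess gets reused, so real systems are usually pickier about which guesses they trust.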
Reinforcement learning differs from the aforementioned types of machine learning in that it does not rely on static datasets. Instead, the computer is given an “environment” to learn in that provides positive and negative reinforcement for good and bad actions, and it’s the computer’s job to maximise the good and minimise the bad. This type of learning is similar to how humans learn: through trial and error.
Good reinforcement learning tries to balance both exploitation (using learned methods of completing a task that are proven to work) and exploration (experimenting with untested methods to potentially uncover a more efficient or beneficial process) in order to learn.
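A classic toy setting for this balance is the "multi-armed bandit": a row of slot machines with unknown payout rates. The sketch below uses an epsilon-greedy strategy, which explores a random machine a small fraction of the time and otherwise exploits the machine that looks best so far. The payout rates, the 10% exploration rate, and the function name are assumptions picked for illustration.

```python
import random


def run_bandit(reward_probabilities, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy learning on a simulated multi-armed bandit."""
    rng = random.Random(seed)        # seeded so runs are repeatable
    n_arms = len(reward_probabilities)
    estimates = [0.0] * n_arms       # learned guess of each arm's payout
    pulls = [0] * n_arms             # how often each arm has been tried
    for _ in range(steps):
        if rng.random() < epsilon:
            # Exploration: try a random arm, even one that looks bad.
            arm = rng.randrange(n_arms)
        else:
            # Exploitation: use the arm with the best estimate so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])
        # The environment hands back a reward of 1 (win) or 0 (nothing).
        reward = 1.0 if rng.random() < reward_probabilities[arm] else 0.0
        pulls[arm] += 1
        # Running average: shift the estimate toward the new reward.
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
    return estimates, pulls


# The true payout rates are hidden from the learner.
estimates, pulls = run_bandit([0.2, 0.5, 0.8])
print(estimates)
print(pulls)
```

With no exploration at all, the learner can get stuck on the first machine that ever pays out. With too much, it wastes pulls on machines it already knows are poor.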
Reinforcement learning is a good approach to take in noisy and dynamic environments such as the physical world, as it can reinforce, through trial and error, behaviours that we would otherwise find difficult to define for a machine. One example of where reinforcement learning is used is training self-driving cars to be road-safe before they are used in public.
While this is without a doubt just the tip of the iceberg, you’ve hopefully now got the gist of the whats, whys, and hows when it comes to machine learning.