I talked recently about how choosing the right algorithm involves a lot of trial and error. Using the Algorithm Cheat Sheet provided by Microsoft is a great start, but having a basic understanding of how the various algorithms work will certainly help guide you towards the one that is right for your situation.
There are three basic types of ML algorithms: Supervised, Unsupervised, and Reinforcement Learning. Azure ML works with Supervised and Unsupervised. Reinforcement Learning is primarily used in robotics to teach the robot how to do something. Self-driving cars are a great example of where reinforcement learning is used.
Supervised ML is by far the most popular, and that is reflected in the fact that all but one algorithm in Azure ML is Supervised. Supervised algorithms are “predictive”. They are used to predict a future outcome based on historical data. With this type of algorithm, you must be clear about what you want to learn and how to go about learning it. You supervise the process. You get to choose what data is most important. Because you are only guessing as to what data is most important, you would run several experiments using a variety of permutations.
Supervised algorithms fall into three subsets: Classification, Regression, and Anomaly Detection. You would choose a Classification model if you are trying to choose between different things: Red or Blue; aquatic or non-aquatic; Small, Medium or Large; a team in a tournament.
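To make the idea of choosing between categories concrete, here is a minimal sketch of a nearest-centroid classifier in plain Python. It is not one of the Azure ML algorithms; the training data, the feature names (length and weight), and the helper functions are all hypothetical and only illustrate how labeled history can be used to pick Small, Medium or Large for a new item.

```python
from statistics import mean

# Hypothetical labeled history: (length_cm, weight_kg) -> size label.
training = [
    ((10.0, 1.0), "Small"),
    ((12.0, 1.5), "Small"),
    ((30.0, 5.0), "Medium"),
    ((32.0, 6.0), "Medium"),
    ((60.0, 20.0), "Large"),
    ((65.0, 22.0), "Large"),
]

def fit_centroids(rows):
    """Average the feature vectors of each label to get one centroid per class."""
    by_label = {}
    for features, label in rows:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*points))
        for label, points in by_label.items()
    }

def classify(centroids, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def dist2(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: dist2(centroids[label]))

centroids = fit_centroids(training)
prediction = classify(centroids, (28.0, 4.5))
print(prediction)
```

The "training" step here is just averaging, which is about as simple as a classifier gets; real algorithms differ mainly in how they draw the boundary between classes.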
Regression algorithms are used when you need to predict a number or value. This could be a sale percentage, a stock price, or the number of units sold.
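As a rough illustration of predicting a number, the sketch below fits a straight line by ordinary least squares in plain Python. The advertising-spend and units-sold figures are made up for the example, and `fit_line` is a hypothetical helper, not an Azure ML module.

```python
from statistics import mean

# Hypothetical history: advertising spend (in $1000s) vs. units sold.
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
units = [12.0, 14.0, 16.0, 18.0, 20.0]

def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    intercept = y_bar - slope * x_bar
    return slope, intercept

slope, intercept = fit_line(spend, units)

# Predict units sold for a spend level that is not in the history.
predicted_units = slope * 6.0 + intercept
print(predicted_units)
```

The output is a value on a continuous scale rather than a label, which is the difference between Regression and Classification.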
Anomaly Detection algorithms are used when you are looking for the outliers in a set of data. This is when you have a large set of data where most of it is as you would expect or want, but you want to find out where you can expect trouble. This type of algorithm is commonly used in fraud detection. The algorithm will learn what “normal” looks like, and then separate out the data points that do not follow that normal pattern.
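One very simple way to picture "learning what normal looks like" is a z-score check: measure the mean and spread of known-good data, then flag anything too far from it. The sketch below assumes made-up transaction amounts and an arbitrary threshold of three standard deviations; production fraud detection uses far richer models.

```python
from statistics import mean, stdev

# Hypothetical transaction amounts known to be legitimate.
normal_amounts = [20.0, 22.5, 19.0, 21.0, 23.0, 18.5, 20.5, 22.0]

# "Learning normal" here is just measuring the center and the spread.
baseline_mean = mean(normal_amounts)
baseline_std = stdev(normal_amounts)

def is_anomaly(amount, threshold=3.0):
    """Flag amounts more than `threshold` standard deviations from the mean."""
    z_score = abs(amount - baseline_mean) / baseline_std
    return z_score > threshold

# New, unlabeled transactions to screen.
new_amounts = [21.0, 19.5, 250.0, 22.0]
flagged = [amount for amount in new_amounts if is_anomaly(amount)]
print(flagged)
```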
Unsupervised algorithms are “descriptive” models. They are used to find patterns in what looks like random data. In this model, you have no specific target in mind and you do not single out any features as particularly important. For example, a retailer could use an unsupervised algorithm to figure out what combinations of products are often purchased together. A medical authority may use one to see which diseases are likely to occur when one specific disease is present.
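The retailer example can be sketched with nothing more than counting. The code below tallies how often each pair of products shows up in the same basket; there is no target column and no label, only the raw baskets. The basket contents are invented, and this pair counting is a stand-in for the idea, not the clustering algorithm Azure ML provides.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets; note there is no label or target column.
baskets = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"beer", "chips"},
    {"bread", "butter", "chips"},
    {"beer", "chips", "salsa"},
]

# Count every unordered pair of products that appears in the same basket.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# The most frequent pairs are the pattern the data "describes".
top_pairs = pair_counts.most_common(2)
print(top_pairs)
```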
Another consideration when choosing an algorithm is how linear your data is. If you try to fit a straight line to a non-linear data set, like traffic congestion over the course of a day, you are not going to get an accurate picture of traffic patterns. When you are dealing with non-linear data with peaks and valleys, like rush hour and 3 AM traffic, you need to choose an algorithm that will not reduce the problem down to an average. Having said that, linear algorithms are still a great place to start. They are simple and fast to train, so they can give you a quick overview of your data. In addition, if you are trying too hard to be accurate, you may end up over-fitting. This is where your model is tuned so closely to your limited set of data points that you end up describing the random error or noise instead of the underlying relationships. Sometimes a set of data that is best described by a linear function does not look linear because of the outliers in the data. The outliers may cluster, giving it a non-linear feel.
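To see what "reducing the problem down to an average" looks like, the sketch below fits a straight line to a made-up day of hourly traffic volumes with two rush-hour peaks. The numbers and the `fit_line` helper are assumptions for illustration only. The fitted line barely distinguishes 3 AM from 8 AM, even though the actual volumes are far apart.

```python
from statistics import mean

# Hypothetical hourly traffic volume for one day (index = hour, 0 to 23),
# with peaks around the morning and evening rush hours.
hours = list(range(24))
volume = [5, 4, 3, 3, 5, 15, 40, 80, 95, 70, 45, 40,
          42, 40, 45, 60, 85, 98, 80, 50, 30, 20, 12, 8]

def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return slope, y_bar - slope * x_bar

slope, intercept = fit_line(hours, volume)

def line(hour):
    """Traffic volume predicted by the straight-line model."""
    return slope * hour + intercept

# Compare the line with reality at a quiet hour and at both rush hours.
for hour in (3, 8, 17):
    print(f"{hour:02d}:00 actual={volume[hour]:5.1f} line={line(hour):5.1f}")

mean_abs_error = mean(abs(line(h) - v) for h, v in zip(hours, volume))
print(f"mean absolute error: {mean_abs_error:.1f}")
```

A quick linear fit like this is still useful as a first look: a large error is itself a hint that a non-linear algorithm is worth trying.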
When setting up your experiment, you get to play with the various parameters that affect the algorithm’s behavior. Choosing things like error tolerance or number of iterations can have an enormous effect. The more parameters you choose, the more trial and error you introduce to the experiment. Azure ML has a parameter sweeping feature that automatically tries all parameter combinations (you choose the granularity). Keep in mind that the time required for a full sweep increases exponentially with the number of parameters, because every parameter you add multiplies the number of combinations that have to be trained.
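The growth in combinations is easy to see with a small grid. The sketch below enumerates every combination of three parameters with three candidate values each; the parameter names and values are hypothetical and are not the settings of any particular Azure ML module.

```python
from itertools import product

# Hypothetical parameter grid: three parameters, three candidate values each.
grid = {
    "learning_rate": [0.01, 0.1, 1.0],
    "iterations": [10, 100, 1000],
    "tolerance": [1e-3, 1e-4, 1e-5],
}

def sweep(grid):
    """Yield one dict per combination of parameter values (a full grid sweep)."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))

combos = list(sweep(grid))

# Each combination means one full training run.
print(len(combos))
print(combos[0])
```

With three values per parameter, three parameters already mean 3 × 3 × 3 = 27 training runs, and a fourth parameter would push that to 81. A coarser granularity or fewer swept parameters keeps the run count manageable.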