A Developer’s Guide To Machine Learning
Machine learning is now so commonplace – not least because of its application to the vastly growing world of big data – that some of the most ubiquitous advertisements on YouTube, amid tantalizing glimpses of sunny holiday destinations, are promotions for machine learning courses.
Stanford University defines machine learning as the science of getting computers to act without being explicitly programmed. Designers of the university’s course taught by Andrew Ng, co-founder of the Google Brain deep learning project, note that in the past decade machine learning has provided the basic underpinnings of self-driving cars, along with practical speech recognition, effective web search, and a vastly improved understanding of the human genome. They add: “Machine learning is so pervasive today that you probably use it dozens of times a day without knowing it. Many researchers also think it is the best way to make progress towards human-level artificial intelligence.”
Key Machine Learning Applications
Machine learning is used in many applications, such as image recognition, whether identifying faces to unlock a smartphone or recognizing characters for language-related tasks such as translation. It is also employed in speech recognition, from transcribing spoken words in audio recordings into text to using voice commands to dial a telephone, adjust car controls while driving or operate home appliances.
Machine learning is becoming increasingly important in medical diagnostics, both to establish the type of illness a patient is suffering from and to predict how it will develop. The financial world has also benefited, using machine learning to create automated trading strategies, particularly for extremely high-speed trading and the near-simultaneous trading of large numbers of financial instruments, as in arbitrage hedge fund strategies.
Historically, machine learning software was written in a way that specified all possible options for the computer in making decisions. Modern machine learning differs by building a structure into the program that allows the computer to make decisions for which it has not been explicitly programmed, by “feeding” it a large volume of data from which the program can learn directly, rather than using an equation specified by the developer.
Feeding data into generic machine learning algorithms teaches the model to operate on that data without task-specific programming. It is a highly dynamic process that continues to evolve over time: the computer’s learning improves progressively the more data it absorbs, which matters all the more as big data grows so rapidly. The process starts with the computer guessing at conclusions and then fine-tuning those guesses against the right answer or outcome.
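To make that guess-and-fine-tune loop concrete, here is a minimal sketch in Python; the data points and learning rate are invented for illustration, and real systems use far more data and more sophisticated algorithms.

```python
# Minimal illustration of the "guess, then fine-tune" loop:
# fit y = w * x + b to example data by gradient descent.
# The data points and learning rate are invented for illustration.

data = [(1.0, 3.1), (2.0, 4.9), (3.0, 7.2), (4.0, 8.8)]  # (x, y) pairs

w, b = 0.0, 0.0          # initial guess for the parameters
learning_rate = 0.01

for step in range(5000):
    grad_w = grad_b = 0.0
    for x, y in data:
        guess = w * x + b          # the program's current guess
        error = guess - y          # how far off the right answer it is
        grad_w += 2 * error * x    # gradient of the squared error w.r.t. w
        grad_b += 2 * error        # ... and w.r.t. b
    # Fine-tune the parameters against the error.
    w -= learning_rate * grad_w / len(data)
    b -= learning_rate * grad_b / len(data)

print(f"learned line: y = {w:.2f} * x + {b:.2f}")
```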
The learning process can be of two types: supervised and unsupervised learning. Choosing the right approach for a particular problem is crucial, though it can be a somewhat tedious process. The choice will vary depending on whether the outcome needs to be a specific prediction or whether a large set of data needs to be analyzed for common markers not known in advance.
Supervised Learning
Predictive analytics have received a significant boost from machine learning technologies. Supervised learning, for instance, finds patterns in paired input and output data and then starts making its own predictions. During the process, a developer defines which features the program will use and what kind of output is expected. The techniques it uses are either classification, when the expected output is a discrete category such as spam or not spam, or regression, which is used to generate a continuous response such as a stock price prediction or a weather forecast.
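A minimal sketch of the two techniques, using scikit-learn as one common library choice; the toy data below is invented for illustration.

```python
# Classification vs. regression in scikit-learn.
# The toy data is invented for illustration; any labeled dataset works.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])  # input feature

# Classification: the expected output is a discrete category (0 or 1).
labels = np.array([0, 0, 0, 1, 1, 1])
clf = LogisticRegression().fit(X, labels)
print(clf.predict([[2.5], [5.5]]))   # discrete answers, e.g. [0 1]

# Regression: the expected output is a continuous value.
values = np.array([1.9, 4.1, 6.0, 8.1, 9.9, 12.2])
reg = LinearRegression().fit(X, values)
print(reg.predict([[7.0]]))          # a continuous answer, roughly [14.]
```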
Unsupervised Learning
In unsupervised learning, the software finds patterns based only on input data. This type of learning is better suited to analyzing raw data where it is not clear in advance what kind of answers will emerge. For instance, the raw data might contain a large amount of information about hospital patients, and the analysis could surface links that were not previously apparent to doctors.
It is not a given that all of the conclusions will actually be useful or actionable – for instance, that more patients got sick on a Tuesday than on any other day – but the analysis can surface far more connections than any human analyst could spot unaided.
Most unsupervised learning techniques are a form of cluster analysis, in which data is sorted into groups on the basis of shared or similar characteristics that the algorithm discovers for itself.
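As a minimal sketch of cluster analysis, assuming k-means from scikit-learn as the clustering method and invented two-dimensional points:

```python
# Cluster analysis with k-means: grouping data by similarity alone.
# The 2-D points are invented for illustration.
import numpy as np
from sklearn.cluster import KMeans

points = np.array([
    [1.0, 1.2], [0.8, 1.0], [1.1, 0.9],   # one natural group
    [8.0, 8.2], [8.3, 7.9], [7.8, 8.1],   # another natural group
])

# No output labels are given: the algorithm groups the rows
# purely by the similarity of their input features.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print(kmeans.labels_)           # e.g. [0 0 0 1 1 1]
print(kmeans.cluster_centers_)  # the centre of each discovered group
```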
Neural Network Machine Learning
A key component of machine learning is the neural network, a decision-making engine consisting of numerous simple, highly interconnected processing elements that process information through their dynamic responses to external inputs.
“Neural networks are one of the most beautiful programming paradigms ever invented,” says scientist and programmer Michael Nielsen, author of the book Neural Networks and Deep Learning. Neural networks are inspired by the architecture of the brain and make it possible for the computer to learn from observational data using deep learning.
“Until 2006, we didn’t know how to train neural networks to surpass more traditional approaches, except for a few specialized problems,” Nielsen says. “What changed was the discovery of techniques for learning in so-called deep neural networks. Today, deep neural networks and deep learning achieve outstanding performance on many important problems in computer vision, speech recognition, and natural language processing.”
Neural networks use a set of equations to process input data and produce some form of analysis or judgement. They are fitted to training data that describes the intended behavior of a system, with separate data held back to check whether the program is successful. The analysis improves gradually as training adjusts the parameters of the network.
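As a minimal sketch of those ideas, assuming a single hidden layer, the XOR problem as invented training data, and plain NumPy: the forward pass is the “set of equations”, and each training step adjusts the network’s parameters (its weights and biases) to shrink the error.

```python
# A tiny neural network (one hidden layer) trained on the XOR problem.
# The architecture, data and learning rate are invented for illustration.
import numpy as np

rng = np.random.default_rng(0)

# Training data: inputs and the intended behavior (XOR of the two inputs).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# The parameters of the network: two layers of weights and biases.
W1, b1 = rng.normal(size=(2, 4)), np.zeros((1, 4))
W2, b2 = rng.normal(size=(4, 1)), np.zeros((1, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(20000):
    # Forward pass: the set of equations that produces a judgement.
    hidden = sigmoid(X @ W1 + b1)
    output = sigmoid(hidden @ W2 + b2)

    # Backward pass: nudge every parameter to reduce the squared error.
    d_output = (output - y) * output * (1 - output)
    d_hidden = (d_output @ W2.T) * hidden * (1 - hidden)
    W2 -= 0.5 * hidden.T @ d_output
    b2 -= 0.5 * d_output.sum(axis=0, keepdims=True)
    W1 -= 0.5 * X.T @ d_hidden
    b1 -= 0.5 * d_hidden.sum(axis=0, keepdims=True)

print(output.round(2))  # approaches [[0], [1], [1], [0]] as training proceeds
```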
Open Source Libraries
Developers build models with open-source software libraries for dataflow programming across a range of tasks, including TensorFlow, Torch and Caffe. TensorFlow was developed for internal use by the Google Brain team and was opened up to developers generally in late 2015; Torch is based on the Lua programming language; and Caffe, which stands for Convolutional Architecture for Fast Feature Embedding, was originally developed at the University of California, Berkeley, is written in C++ and offers a Python interface.
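As a hedged sketch of what working with such a library looks like, here is a minimal model definition using TensorFlow’s Keras API; the layer sizes and input shape are arbitrary choices for illustration.

```python
# A minimal model definition with TensorFlow's Keras API.
# Layer sizes and the input shape are arbitrary illustrative choices.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),                     # ten input features
    tf.keras.layers.Dense(32, activation="relu"),    # hidden layer
    tf.keras.layers.Dense(1, activation="sigmoid"),  # binary judgement
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])
model.summary()
# model.fit(train_inputs, train_labels, epochs=10) would then train it
# on data supplied by the developer.
```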
In recent years, Python has become the language of choice for machine learning, particularly for developers with a computer science background. It has overtaken Java in popularity, though Java was formerly nearly as popular and continues to benefit from the wide availability of open-source material with Java application programming interfaces. R remains the third most popular language, but its take-up has been expanding rapidly and it appears poised to overtake Java. Other options include C++ and the newcomers Scala and Julia.
The fundamental goal of machine learning is to generalize beyond the examples in the training set. One of the most common mistakes is testing the program on its own training data, which can give an unrealistic sense of success; done improperly, the result can be no more accurate than guesswork. A simple safeguard is to set some data aside from the beginning and use it only to test the chosen classifier – an algorithm that implements classification – at the final stage.
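A minimal sketch of that safeguard, using scikit-learn’s train_test_split on synthetic data invented for illustration:

```python
# Hold data back for testing so success is measured on examples
# the classifier has never seen. The dataset here is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                # 200 examples, 5 features
y = (X[:, 0] + X[:, 1] > 0).astype(int)      # label depends on two features

# Set some data aside from the beginning...
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

clf = LogisticRegression().fit(X_train, y_train)

# ...and use it only to test the chosen classifier at the final stage.
print("train accuracy:", clf.score(X_train, y_train))
print("test accuracy: ", clf.score(X_test, y_test))
```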
With machine learning one of the fastest-growing fields in the technology world, the number of jobs opening up for experienced developers is mushrooming. Machine learning is revolutionizing ecommerce and IT: companies and services such as Google, Tinder, Google Maps, Snapchat, OvalMoney and Netflix all use it to make the best possible judgement about what their customers may want to buy, watch, do or say next. As the technology becomes more sophisticated, investment in the field – and in data science and big data as a whole – is set to accelerate in the years to come.