
Testing of machine learning systems – The new must-have skill in 2018


“The key to learning is feedback. It is nearly impossible to learn anything without it.” ― Steven D. Levitt, Think Like a Freak

Machine learning, put very simply, is the practice of building applications that make predictions using models learned from data. Building systems that predict is hard, and validating them is even harder.

A good illustration comes from the medical industry, where machine learning systems analyze DNA to identify candidates for research treatments. The input data in this case are DNA sequences, a data set that is constantly refined as the medical industry gathers more data. The outputs are the DNA sequences of the patients who will benefit most from the research treatments.
These systems are based on learning algorithms, whose behavior changes over time with the training data and the number of iterations. Hence, from a testing standpoint, the results for a fixed set of inputs will change as the system learns more about those inputs.

Traditional testing techniques are based on fixed inputs and fixed outputs. Testers are hard-wired to believe that, given inputs x and y, the output will be z, and that this will stay constant until the application itself is changed. This is not true of machine learning systems. The output is not fixed. It changes over time as the model on which the system is built evolves and is fed more data. This forces the testing professional to think differently and to adopt test strategies that are very different from traditional testing techniques.
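To make the contrast concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the toy threshold "model", the synthetic data, and the `MIN_ACCURACY` acceptance level are all hypothetical and stand in for a real system. It shows the same learning procedure producing a different model when it is fed more data, and a test that checks a quality threshold on held-out data instead of one exact output.

```python
import random

def make_data(n, rng):
    """Synthetic labelled samples: class 0 centred at 0.0, class 1 at 3.0."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        value = rng.gauss(3.0 if label else 0.0, 1.0)
        data.append((value, label))
    return data

def fit_threshold(samples):
    """Toy 'model': a cut-off halfway between the two class means."""
    zeros = [v for v, y in samples if y == 0]
    ones = [v for v, y in samples if y == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2

def predict(threshold, value):
    return 1 if value >= threshold else 0

def accuracy(threshold, samples):
    hits = sum(predict(threshold, v) == y for v, y in samples)
    return hits / len(samples)

rng = random.Random(0)                 # fixed seed so the run is repeatable
held_out = make_data(500, rng)         # never used for training
small_model = fit_threshold(make_data(40, rng))
large_model = fit_threshold(make_data(2000, rng))

# Hypothetical acceptance level; a real project would agree on its own.
MIN_ACCURACY = 0.80

for name, model in [("small", small_model), ("large", large_model)]:
    acc = accuracy(model, held_out)
    print(f"{name}: threshold={model:.3f} accuracy={acc:.3f}")
    # The learned threshold differs between models, so the test asserts
    # on a quality bound rather than on one exact predicted value.
    assert acc >= MIN_ACCURACY
```

The design choice worth noting is the assertion: it stays valid as the model is retrained, whereas a check such as "input 1.4 must give output 0" could start failing the moment the threshold shifts.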

Below are the activities that I believe will be essential for testing machine learning systems:

1. Developing training data sets: This refers to a data set of examples used for training the model. In this data set, you have the input data with the expected output. This data is usually prepared by collecting data in a semi-automated way.

2. Developing test data sets: This is a subset of the collected data, held out from training, that is intelligently built to cover as many input combinations as practical and to estimate how well your model is trained. The model will be fine-tuned based on the results of the test data set.

3. Developing validation test suites based on algorithms and test datasets. Taking the DNA example, test scenarios include categorizing patient outcomes based on DNA sequences and creating patient risk profiles based on demographics and behaviors.

4. The key to building validation suites is to understand the algorithm. An algorithm here is a set of calculations that creates a model from the training data. To create a model, the algorithm analyzes the data provided, looks for specific patterns, and uses the results of this analysis to develop optimal parameters for the model. The model is refined as the number of iterations and the richness of the data increase.
Some algorithms in popular use are regression algorithms, which predict one or more continuous numeric variables such as return on investment. Another example is association algorithms, which find correlations between the attributes of a data set. These are used for portfolio analysis in capital markets. A further illustration from digital applications is sequence algorithms, which predict customer behavior based on a series of clicks or paths on a digital platform.

5. Communicating test results in statistical terms. Testers are traditionally used to expressing the results of testing in terms of quality, such as defect leakage or severity of defects. Validation of models based on machine learning algorithms will produce approximations and not exact results. The testing community will need to determine the level of confidence, within a certain range, for each outcome and report it clearly.
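As one way to picture the last activity, the sketch below reports a test result as a range with a confidence level instead of a single pass/fail figure. It uses the Wilson score interval for a proportion, which is my choice for illustration and not something the activities above prescribe. The counts (180 correct out of 200 test cases) are hypothetical numbers, not results from a real system.

```python
import math

def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 corresponds to roughly 95% confidence.
    Returns (lower, upper) bounds on the true success rate.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    p = successes / trials
    denom = 1 + z ** 2 / trials
    centre = (p + z ** 2 / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / trials + z ** 2 / (4 * trials ** 2)) / denom
    )
    return centre - half_width, centre + half_width

# Hypothetical test run: 180 of 200 held-out cases predicted correctly.
correct, total = 180, 200
low, high = wilson_interval(correct, total)
print(
    f"Observed accuracy {correct / total:.1%}; "
    f"95% confidence interval {low:.1%} to {high:.1%}"
)
```

A report phrased this way tells stakeholders both how well the model did on the test data and how much that figure could move with a different sample, which a bare defect count cannot express.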

In summary, software testing will be one of the most critical factors that determine the success of a machine learning system. Just like the models that we test, the hypothesis that holds true today may change tomorrow. The must-have skills that the test professional will need are critical thinking, an engineering mindset, and constant learning.