Machine learning is the study of algorithms that computers use to perform a specific task without being explicitly programmed. It makes computers more like humans in their ability to learn. It is one of the most exciting technologies in active use today, and it is used in more places than one might expect.
Importance of Machine Learning
The number of organizations building software has been increasing rapidly, and so has the effort of testing and releasing code across all their environments, frameworks etc. Test automation gives software organizations speed, but it only automates repetitive tasks.
Machine learning algorithms learn from data how to achieve a task and how to improve their performance. With ML, a goal can be set and the machine left to figure out how to achieve it, much faster and with less human intervention and fewer human mistakes.
Machine Learning in Testing
ML algorithms learn from data how to achieve tasks, and they continuously improve their performance based on experience. In general, the word “Testing” in relation to machine learning models primarily means testing the model’s performance, measured in terms such as the accuracy or precision of the model.
From a QA perspective, machine learning models can also be tested in much the same way as conventionally developed software.
Based on the requirements, data is sourced, the model is designed, and the training code (including data processing) is implemented. The model is then trained, and its performance is measured on validation and test data sets.
The model can be optimized during training. If the model’s performance is not good enough, more training data can be acquired and the same implementation can be retrained.
Data can be collected continuously, and the performance of the model can be improved with more training data. ML can therefore also be interpreted as data-driven development.
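The workflow above relies on keeping training, validation and test data apart. The following is a minimal sketch of such a split, not a prescribed implementation: the function name `split_dataset`, the 70/15/15 proportions and the placeholder rows are all assumptions made for illustration.

```python
import random


def split_dataset(rows, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle the rows and split them into train, validation and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # whatever is left is held out for testing
    return train, val, test


# Placeholder data: 100 row ids standing in for real labelled examples.
rows = list(range(100))
train, val, test = split_dataset(rows)
print(len(train), len(val), len(test))
```

The model is fitted on `train`, tuned on `val`, and measured once on `test`, so the test figure is not influenced by choices made during training.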
What are the Blackbox Testing Techniques for Machine Learning Models?
Some of the techniques that could be used to perform black box testing on Machine Learning models are:
- Model Performance
- Dual Coding
- Coverage guided fuzzing
- Metamorphic testing
- Comparison with simplified, linear models
- Testing with different data slices
Model Performance:
The model under test is validated using test data or a new set of data, and its performance on metrics such as accuracy and recall is compared with the pre-determined performance of the model that has already been built and moved to production.
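As a rough illustration of this check, the sketch below computes accuracy, precision and recall from predictions and compares them with a baseline. The labels, the predictions and the baseline values are made up for the example, and the helper names `evaluate` and `meets_baseline` are hypothetical.

```python
def evaluate(y_true, y_pred, positive=1):
    """Compute accuracy, precision and recall for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def meets_baseline(metrics, baseline):
    """True if every metric named in the baseline is at least as good."""
    return all(metrics[name] >= minimum for name, minimum in baseline.items())


# Made-up test labels and predictions from the model under test.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
metrics = evaluate(y_true, y_pred)

# Assumed figures for the model already in production.
baseline = {"accuracy": 0.70, "recall": 0.70}
print(metrics, meets_baseline(metrics, baseline))
```

A test fails when `meets_baseline` returns `False`, which signals that the new model is worse than the production one on at least one agreed metric.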
Metamorphic Testing:
One or more properties are identified that represent the metamorphic relationship between input and output pairs, that is, how the output should change when the input is changed in a known way. In metamorphic testing, a test case that succeeds leads to another test case that can be used for further testing of the ML model.
Example of test plan:
- Given: a person who is male and a smoker. We want to determine the likelihood of this person suffering from the disease at the age of 30 years.
- The likelihood must increase by more than 5% when the age is increased by 5 years.
- The likelihood should increase by more than 15% when the age is increased by 10 years.
Test engineers must work with data scientists to understand model details such as the type of learning, the algorithm etc. Test plans can be automated using scripts.
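The test plan above can be scripted along the following lines. This is only a sketch under stated assumptions: `predict_risk` is a hypothetical stand-in for the real model (a made-up logistic formula, not a disease model), and the percentages are interpreted as relative increases in the predicted likelihood.

```python
import math


def predict_risk(age, is_male, is_smoker):
    """Hypothetical stand-in for the model under test (made-up coefficients)."""
    z = -6.0 + 0.08 * age + 0.5 * is_male + 1.0 * is_smoker
    return 1.0 / (1.0 + math.exp(-z))


def relative_increase(model, base, age_delta):
    """Relative change in predicted likelihood when only the age is increased."""
    before = model(**base)
    after = model(**{**base, "age": base["age"] + age_delta})
    return (after - before) / before


def run_test_plan(model):
    """Check each metamorphic relation; returns {age increase: passed?}."""
    base = {"age": 30, "is_male": 1, "is_smoker": 1}
    # (age increase in years, minimum relative increase in likelihood)
    relations = [(5, 0.05), (10, 0.15)]
    return {delta: relative_increase(model, base, delta) > minimum
            for delta, minimum in relations}


results = run_test_plan(predict_risk)
print(results)
```

The test never needs to know the correct likelihood for a 30-year-old. It only checks how the prediction moves when the age changes, which is what makes the approach usable without labelled expected outputs.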
Dual Coding
The basic idea behind dual coding is to build different models based on different algorithms. Each of these models is given the same input data set, and the predictions from the models are compared.
Suppose models are built with several algorithms, such as a neural network, an SVM and a random forest. If all of them reach an accuracy of about 90% but the random forest reaches 95%, the random forest is selected. During testing, all the models are preserved and the same input is fed into each of them.
If the majority of the remaining models agree on a prediction and it does not match the prediction of the random forest model, a defect can be raised in the defect tracking system.
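A minimal sketch of this comparison follows. The three "models" are toy threshold functions standing in for a trained random forest, neural network and SVM, so their names and behaviour are assumptions made purely for illustration.

```python
from collections import Counter


def find_disagreements(primary, others, inputs):
    """Return inputs where most of the other models contradict the primary model."""
    defects = []
    for x in inputs:
        primary_pred = primary(x)
        votes = Counter(model(x) for model in others)
        majority_pred, count = votes.most_common(1)[0]
        # Only flag a defect when a strict majority agrees on a different answer.
        if count > len(others) / 2 and majority_pred != primary_pred:
            defects.append((x, primary_pred, majority_pred))
    return defects


# Toy stand-ins for trained models (hypothetical decision rules).
def random_forest_stub(x):
    return int(x > 5)


def neural_network_stub(x):
    return int(x > 4)


def svm_stub(x):
    return int(x >= 5)


defects = find_disagreements(
    primary=random_forest_stub,
    others=[neural_network_stub, svm_stub],
    inputs=range(10),
)
print(defects)
```

Each tuple in `defects` holds the input, the primary model's prediction and the majority prediction, which is the information a tester would attach to the defect report.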
Coverage Guided Fuzzing
In this technique, the data fed into the machine learning model is planned so that all feature activations are exercised. It can be applied to models built with neural networks, decision trees, random forests etc.
Example: Consider a model built with a neural network algorithm. The basic idea is to come up with data sets or test cases that result in the activation of every neuron in the network. The feedback obtained from the model is used to guide further fuzzing.
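The loop below sketches the idea on a toy scale. The four-neuron "network", its weights, the Gaussian mutation and the iteration count are all assumptions chosen for illustration. Real fuzzing tools track coverage over far larger models.

```python
import random

# Hypothetical toy network: 2 inputs, 4 hidden ReLU neurons as (weights, bias).
HIDDEN = [
    ((1.0, 0.0), -1.0),
    ((-1.0, 0.0), -1.0),
    ((0.0, 1.0), -1.0),
    ((0.0, -1.0), -1.0),
]


def active_neurons(x):
    """Return the indices of hidden neurons whose ReLU output is positive."""
    active = set()
    for index, (weights, bias) in enumerate(HIDDEN):
        pre_activation = sum(w * v for w, v in zip(weights, x)) + bias
        if pre_activation > 0:
            active.add(index)
    return active


def fuzz(seed_inputs, iterations=2000, seed=0):
    """Mutate inputs at random, keeping only those that activate new neurons."""
    rng = random.Random(seed)
    corpus = list(seed_inputs)
    covered = set()
    for x in corpus:
        covered |= active_neurons(x)
    for _ in range(iterations):
        parent = rng.choice(corpus)
        child = tuple(v + rng.gauss(0.0, 1.0) for v in parent)
        newly_covered = active_neurons(child) - covered
        if newly_covered:  # feedback from the model guides further fuzzing
            corpus.append(child)
            covered |= newly_covered
    return corpus, covered


corpus, covered = fuzz([(0.0, 0.0)])
print(f"covered {len(covered)} of {len(HIDDEN)} neurons using {len(corpus)} inputs")
```

The inputs kept in `corpus` form a small test set in which each entry activated at least one neuron that earlier inputs had not.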
What are the Skills needed to Test ML models?
Some of the skills listed below may be required to play a vital role in testing AI or machine learning models.
- Knowledge of Machine Learning concepts and related algorithms
- General data analysis skills
- Scripting knowledge in one or more scripting languages
- Knowledge of terminology such as precision, recall, accuracy etc.
- Knowledge of features, feature importance etc.
Machine Learning Algorithms, Examples and Testing Use Cases
Supervised Learning
In this type of learning, the machine is provided with the correct answers in advance through labelled training data, which acts as an answer key for the ML algorithm to learn from.
Supervised learning is used in predictive models such as regression and classification, where the relationship between input and output is clearly understood. It can be applied to many predictive scenarios in software testing.
Testing Machine Learning use case
Predicting the Risk in a Release:
Customers’ needs can be better understood through predictive analytics, which plays a vital role in software testing. Predictive analytics adjusts in production based on the results that are fed back into the software.
Research and development managers need to know the potential risks involved in a release. With this information they can decide whether or not to push it into production.
The outcome that is worth predicting is the risk score per release.
The first step is to check whether relevant data is available: number of commits, number of tests and their results, code coverage, number of builds, number of releases, number of user stories, day of the week of the release, number of solved bugs in the release etc.
In the above case, a linear regression model can be trained on this data to predict the risk score for each release.
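To make the regression step concrete, here is a minimal sketch that fits a straight line by ordinary least squares. It uses a single input, the number of failed tests per release, purely to keep the example short. The figures are invented, and a real model would use several of the features listed above.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predicted risk score for a release with x failed tests."""
    return slope * x + intercept


# Invented history: failed tests per past release and the risk score assigned to it.
failed_tests = [0, 1, 2, 3, 4]
risk_scores = [10, 22, 28, 41, 49]

slope, intercept = fit_line(failed_tests, risk_scores)
print(f"predicted risk for 5 failed tests: {predict(slope, intercept, 5):.1f}")
```

How accurate such a prediction is depends on how well the historical data reflects real release risk, so the fitted model should itself be checked against held-out releases.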
Reinforcement Learning
This type of learning is all about learning to make decisions sequentially. In other words, the output depends on the state of the current input, and the next input depends on the output of the previous input.
In reinforcement learning, each decision depends on the ones made before it.
Example: Consider an agent (a robot) and a reward with many hurdles in between. The agent must find the best possible way to reach the reward.
The goal of the robot is to get the diamond and avoid the fire.
Picture a grid containing a diamond, the agent and fire. The robot learns the possible paths and chooses the path that gives it the reward with the fewest hurdles. Each right step adds to the reward and each wrong step subtracts from it. The total reward is calculated when the robot reaches the diamond.
In Reinforcement learning:
- The input should be an initial state from which the model will start.
- There are many possible outputs, as there is a variety of solutions to a problem.
- Training is based on the input. The model returns a state, and the user decides whether to reward or punish the model based on its output.
- The model learns continuously.
- The best solution is decided based on the maximum reward.
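The diamond-and-fire example can be sketched with tabular Q-learning, one common reinforcement learning algorithm. The source does not name an algorithm, so that choice is an assumption, as are the 3x3 grid layout, the reward values and the learning parameters.

```python
import random

SIZE = 3
START, FIRE, DIAMOND = (0, 0), (1, 1), (2, 2)
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    d_row, d_col = ACTIONS[action]
    row = min(max(state[0] + d_row, 0), SIZE - 1)  # walls keep the agent on the grid
    col = min(max(state[1] + d_col, 0), SIZE - 1)
    nxt = (row, col)
    if nxt == DIAMOND:
        return nxt, 10.0, True    # reaching the diamond ends the episode
    if nxt == FIRE:
        return nxt, -10.0, True   # stepping into the fire ends it with a penalty
    return nxt, -1.0, False       # every other step costs a little


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values with epsilon-greedy Q-learning."""
    rng = random.Random(seed)
    q = {((r, c), a): 0.0 for r in range(SIZE) for c in range(SIZE) for a in ACTIONS}
    for _ in range(episodes):
        state = START
        for _ in range(50):  # cap the episode length
            if rng.random() < epsilon:
                action = rng.choice(list(ACTIONS))  # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=20):
    """Follow the highest-valued action from the start state."""
    state, path = START, [START]
    for _ in range(max_steps):
        action = max(ACTIONS, key=lambda a: q[(state, a)])
        state, _, done = step(state, action)
        path.append(state)
        if done:
            break
    return path


path = greedy_path(train())
print(path)
```

The agent is never told the route. It only receives rewards and penalties, and the path it ends up following is the one with the highest learned total reward.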
Neural Networks and Deep Learning
Neural networks are algorithms inspired by the brain. They are made of neurons that pass data between them.
Deep learning applies deep neural network technologies that solve problems through layers of abstraction.
Unsupervised Learning
In this type of learning, the machine is trained using information that is neither classified nor labelled, and the algorithm is allowed to act on that information without guidance. The task of the machine is to group unsorted information according to similarities and patterns without any prior training on the data.
Example:
Suppose the machine is given an image containing both cats and dogs that it has never seen before. The machine has no idea about the features of dogs and cats, so it cannot recognize them as such. It can, however, categorize them according to their similarities, patterns and differences.
Unsupervised Learning is classified into two categories:
Clustering: Discovers the inherent groupings in data such as grouping customers by purchasing behaviour.
Association: Discovers rules that describe large portions of the data, such as people who buy A also tend to buy B.
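As an illustration of clustering, the sketch below groups customers by monthly spend using a very small one-dimensional k-means. The spend figures are invented, and the fixed starting centroids are a simplification chosen so the result is repeatable. Library implementations usually start from random centroids.

```python
def kmeans_1d(values, k=2, iterations=10):
    """Group numbers into k clusters (k >= 2) around their nearest centroid."""
    lowest, highest = min(values), max(values)
    # Deterministic start: spread the initial centroids evenly between min and max.
    centroids = [lowest + (highest - lowest) * i / (k - 1) for i in range(k)]
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for value in values:
            nearest = min(range(k), key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Invented monthly spend per customer, with no labels attached.
monthly_spend = [12, 15, 14, 90, 95, 100]
centroids, clusters = kmeans_1d(monthly_spend)
print(clusters)
```

No customer was labelled as a low or high spender in advance. The two groups emerge only from how close the values are to each other.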
Machine Learning example through Salesforce
- Download the Text to Speech app from Salesforce AppExchange.
- Once it is installed, select the MindsLab app.
- Click the TTS (Text to Speech) tab.
- Select English as the language.
- Enter the input text and click Go.
- The text is converted into speech (audio).
- The audio can also be downloaded in mp3 format, and it will then appear under the Documents tab.
- Click on the document name to view it.
- The Email document button sends the document via email.
Conclusion
Organisations are still contemplating whether to embrace machine learning in their testing practices. Once the initial investment to set up an AI system in test automation is made, organisations will be able to generate greater testing rewards for less money.
A good way forward is to retrain people while monitoring AI bots and their results. Machine learning can reduce repetitive tasks and can be used to deliver top-quality products to market. AI-bot-based testing adapts easily and works around new paths and features in the product. Testers need to think differently about the future of testing rather than worry about their careers becoming obsolete.