State of AI applied to Quality Engineering 2021-22
Section 2: Design

Chapter 5 by Capgemini

Another view on test case optimization

Business ●○○○○
Technical ●●●●○

Machine learning algorithms can help build an automated suite of test cases that is prioritized (e.g. with Random Forest) and deduplicated (e.g. with TF-IDF or LSA), achieving a high defect yield per test case execution. More sophisticated algorithms exist for classification and deduplication, but they require high-end computing and GPUs for model training and deployment. In our experience, the more straightforward models described here suffice in terms of ease of use and accuracy.

Agile methodology improves risk visibility early in a project's lifecycle, and as a result the importance of regression testing[1] in agile software development grows sharply. As software grows, so do the test suite and the effort required to maintain it, and manual optimization becomes more difficult. Duplicates creep into the test suite over time, and the test cases must be prioritized for execution. Performing deduplication and prioritization by hand is impractical.

When an engagement with a company lasts a long time, test management activities run the risk of becoming cumbersome. Attrition, growth, and organizational changes all affect the productivity of the testing team. Frequently, newcomers begin writing test cases without checking whether they already exist, then leave after a few years, and the cycle repeats. As duplicates accumulate, maintaining test assets becomes increasingly difficult. Manually removing duplicates is a time-consuming and unsatisfying task. How can we expedite this activity?

Similarly, the test suites of some of our clients, leading banks in the United States and Canada, contain close to 50,000 test cases on average. A subject matter expert (SME) needs weeks or months to go through the test cases and prioritize them. How can we improve the speed and accuracy of test case prioritization while still keeping pace with agile velocity?

This chapter discusses how to address both of these fundamental problems through the use of machine learning algorithms.

Solution overview

Our platform, SmarTEST, implements the entire process of test case prioritization and deduplication. The overall flow is summarized below using a typical scenario and the workflow that was implemented.

Figure 1: Workflow of SmarTEST explained with a scenario

Let us walk through a concrete scenario from one of our customers, a leading bank in the United States. Prior to implementing SmarTEST, the client had a total of 2,400 regression test cases. Executing all of them uncovered 120 defects. The defect yield, that is, the number of defects found per test case executed, expressed as a percentage, was therefore 5% (120/2,400), which is quite low. The cycle time for executing all 2,400 regression test cases was 15 days with 'X' resources.

Test case prioritization

Using the Random Forest machine learning classifier, SmarTEST analyzes multiple data dimensions and classifies test cases as high, medium, or low priority. The algorithm is trained on historical test case data with multiple parameters, including the complexity of the test case, the number of steps in the test case, the defect count, the details of how the test case was executed, and the severity of the defect. The trained model can then be used to determine the priority of test cases. When run against the 2,400 test cases, it classified 800 as high priority, 1,000 as medium priority, and 600 as low priority. After deduplication of the high priority test cases (the procedure is described in the Test case deduplication section), the number came down to 700. Executing these high priority test cases identified 119 defects. Testing with the optimized regression test cases thus increased the defect yield to 17% (119/700), making each executed test case more effective. The cycle time was reduced to 10 days with the same number of resources, a reduction in overall test cycle time of roughly 30%.
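
The training and classification steps described above can be sketched with scikit-learn's RandomForestClassifier. This is a minimal illustration and not SmarTEST's actual implementation: the numeric feature encoding, the handful of training rows, and the hyperparameters are all hypothetical, chosen only to mirror the parameters listed in the text.

```python
from sklearn.ensemble import RandomForestClassifier

# Hypothetical numeric encoding of the parameters named in the text:
# [complexity (1-5), number of steps, defect count,
#  failed executions in past cycles, highest defect severity (0-4)]
historical_features = [
    [5, 30, 6, 4, 4],
    [4, 25, 5, 3, 4],
    [4, 22, 4, 3, 3],
    [3, 15, 2, 1, 2],
    [3, 12, 2, 1, 2],
    [2, 14, 1, 1, 2],
    [1, 5, 0, 0, 0],
    [1, 4, 0, 0, 0],
    [2, 6, 0, 0, 1],
]
# Priority labels assigned in past cycles (made-up sample).
historical_priority = ["high"] * 3 + ["medium"] * 3 + ["low"] * 3

# Train the classifier on the historical test case data.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(historical_features, historical_priority)

# Classify test cases that have not been prioritized yet.
new_test_cases = [
    [5, 28, 5, 4, 4],
    [1, 5, 0, 0, 0],
]
predicted_priority = model.predict(new_test_cases)
for features, priority in zip(new_test_cases, predicted_priority):
    print(features, "->", priority)
```

In practice the training set would be the full execution history rather than nine rows, and the model would be retrained as new results arrive.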

Even after test cases are prioritized into appropriate buckets, the number of test cases in the high priority bucket may remain large. We encountered this situation with many other customers, where the high priority bucket contained up to 25,000 test cases after prioritization. This is a large number for a regression cycle, given the factors discussed previously. To address such situations, we developed the concept of the Quantitative Risk Index: a numeric value between 0 and 1 that is calculated from the same parameters used as the classifier's input. A Quantitative Risk Index of 1 indicates that the probability of the test case failing is extremely high, making it a necessary test case to execute.
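
The text does not give the formula for the Quantitative Risk Index, so the sketch below takes the index values as given and only illustrates one way they might be used: ranking a large high priority bucket and keeping the test cases most likely to fail. The function name, the capacity parameter, and the sample values are assumptions made for illustration.

```python
def select_by_risk(risk_index, capacity):
    """Return up to `capacity` test case ids, highest risk index first."""
    for test_id, risk in risk_index.items():
        if not 0.0 <= risk <= 1.0:
            raise ValueError(f"risk index for {test_id} must be in [0, 1]")
    ranked = sorted(risk_index, key=risk_index.get, reverse=True)
    return ranked[:capacity]


# Hypothetical risk index values for a high priority bucket.
bucket = {"TC-101": 0.95, "TC-102": 0.40, "TC-103": 0.72, "TC-104": 0.10}
print(select_by_risk(bucket, capacity=2))  # ['TC-101', 'TC-103']
```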

We also manually verified the accuracy of the test case prioritization method described above and obtained an accuracy of approximately 92 percent, indicating the method's suitability. Figure 2 shows an example of the result screen that is displayed after test case prioritization is completed.

The benefits of this 2-step method are listed below:

  • Increased test speed by 30%, saving time and maintaining agile velocity. In the above case study, the cycle time was reduced from 15 days to 10 days by executing only the high priority test cases, which identified more than 99% of the defects found by executing all 2,400 test cases.
  • Improved test execution decision-making. The solution prescribes the critical test cases that must be executed within constrained timelines.
  • Maximum prediction accuracy achieved through continuous learning and re-calibration of machine learning models.
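
The figures quoted in the case study can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text; the function and variable names are illustrative. Note that the reduction from 15 to 10 days works out to about one third, which the text reports as 30%.

```python
def defect_yield(defects_found, test_cases_executed):
    """Defects found per executed test case, as a percentage."""
    return 100.0 * defects_found / test_cases_executed


baseline_yield = defect_yield(120, 2400)    # full regression suite
optimized_yield = defect_yield(119, 700)    # deduplicated high priority cases
defects_retained = 100.0 * 119 / 120        # share of defects still found
cycle_time_saving = 100.0 * (15 - 10) / 15  # reduction in cycle time

print(f"baseline yield:    {baseline_yield:.1f}%")
print(f"optimized yield:   {optimized_yield:.1f}%")
print(f"defects retained:  {defects_retained:.1f}%")
print(f"cycle time saving: {cycle_time_saving:.1f}%")
```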

Figure 2: Results screen from test case prioritization

Test case deduplication

We remove duplicate test cases from test suites using two natural language processing (NLP) based algorithms:

  1. Term Frequency – Inverse Document Frequency (TF-IDF)
    TF-IDF is a technique for quantifying the importance of a word within a document relative to a collection of documents. We use TF-IDF to vectorize parameters such as the test case name and description and compare them to the same parameters in other vectorized test cases. The comparison is made using cosine similarity, which we refer to as the similarity index. The similarity index ranges from 0 to 1. A similarity index of 0 indicates that the test cases have nothing in common, while a similarity index of 1 indicates that they match completely. Any value in between indicates how similar the test case is to the one being compared. When the similarity index is very close to 1, we refer to the test cases as near-similar. We can specify a comparison threshold for the similarity index: if the similarity index of two test cases is greater than the threshold, they are considered duplicates. This method can be used to remove the majority of duplicates.
    TF-IDF is a very straightforward method for deduplication that is successful in the majority of cases. However, it has the disadvantage of requiring the same words to appear in both test cases in order to achieve a high cosine similarity score. It disregards the sentence's context and meaning entirely. As an example, consider two test cases, one labeled 'Go to the website' and the other labeled 'Navigate to the website.' A human being can see that both test cases mean the same thing. If we compare them using TF-IDF, however, we obtain a low score and may end up keeping both test cases.
  2. Latent Semantic Analysis (LSA)
    This is where LSA clearly outperforms TF-IDF. LSA attempts to capture hidden concepts by leveraging the context surrounding the words. We have implemented test case deduplication for a number of clients and have observed that approximately 15%-25% of the test cases in a pack are identified as duplicates and removed in each case. In the case study discussed in the Test case prioritization section, running the deduplication algorithm identified 100 duplicate test cases among the high priority ones alone, which amounted to about 13% (100/800). A heat map plot of the test case similarities gives an indication of the number of duplicates. Figures 3a and 3b show heat map plots generated using TF-IDF and LSA, respectively.
    The x and y axes of this heat map plot contain the test case numbers. If two test cases are similar, the value is 1, which is indicated by a yellow color; if they are dissimilar, the value is 0, which is indicated by a blue color. In Figure 3a, we see a strong yellow line running along the diagonal. This is because each point on the diagonal compares a test case with itself and thus has a similarity score of 1. Off the diagonal, the heat map reveals smatterings of yellow, which indicate that duplicates exist. In Figure 3b, which depicts the LSA heat map, we observe the same yellow diagonal, since each test case is again compared with itself. Additionally, we see a strong presence of yellow off the diagonal. This is because, as previously stated, LSA attempts to capture the underlying context of the sentences.
Figure 3: Heat map plots for TF-IDF (3a) and LSA (3b)
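
The TF-IDF comparison described in item 1 can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the SmarTEST implementation: tokenization is a simple lowercase whitespace split, the smoothed IDF formula is one common variant, and the threshold of 0.9 and the sample test case texts are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse {term: weight} TF-IDF vector per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (an assumed variant): log((1 + N) / (1 + df)) + 1
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0
           for t, df in doc_freq.items()}
    return [{t: count * idf[t] for t, count in Counter(tokens).items()}
            for tokens in tokenized]


def cosine(a, b):
    """Cosine similarity of two sparse vectors (the 'similarity index')."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_duplicates(docs, threshold=0.9):
    """Return (i, j, score) for pairs whose similarity exceeds the threshold."""
    vectors = tfidf_vectors(docs)
    pairs = []
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            score = cosine(vectors[i], vectors[j])
            if score > threshold:
                pairs.append((i, j, score))
    return pairs


# Hypothetical test case texts (e.g. name and description concatenated).
test_cases = [
    "Verify login with valid credentials",
    "Verify login with valid credentials",
    "Go to the website",
    "Navigate to the website",
]
for i, j, score in find_duplicates(test_cases):
    print(f"test cases {i} and {j} look like duplicates (score {score:.2f})")
```

With this strict threshold only the two identical test cases are flagged. The 'Go to the website' and 'Navigate to the website' pair is not, because the two texts differ in their key verb, which is the limitation the text describes.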

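
LSA is commonly implemented as a truncated singular value decomposition of the TF-IDF matrix, with cosine similarity then computed in the reduced space. The sketch below follows that common recipe with NumPy; whether SmarTEST does the same is not stated in the text. The number of retained dimensions (n_topics), the tokenization, and the sample texts are assumptions. The resulting matrix is the kind of data a heat map such as Figure 3b would display, and the helper at the end counts off-diagonal pairs above a threshold.

```python
import numpy as np


def lsa_similarity(docs, n_topics=2):
    """Pairwise cosine similarity of documents in a reduced LSA space."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for tokens in tokenized for t in tokens})
    column = {t: i for i, t in enumerate(vocab)}

    # Term counts: one row per document, one column per term.
    counts = np.zeros((len(docs), len(vocab)))
    for row, tokens in enumerate(tokenized):
        for t in tokens:
            counts[row, column[t]] += 1

    # Smoothed TF-IDF weighting (an assumed variant).
    doc_freq = (counts > 0).sum(axis=0)
    idf = np.log((1 + len(docs)) / (1 + doc_freq)) + 1.0
    tfidf = counts * idf

    # Truncated SVD: keep only the strongest latent dimensions.
    u, s, _ = np.linalg.svd(tfidf, full_matrices=False)
    k = min(n_topics, len(s))
    reduced = u[:, :k] * s[:k]

    # Cosine similarity between the reduced document vectors.
    norms = np.linalg.norm(reduced, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    unit = reduced / norms
    return unit @ unit.T


def count_duplicate_pairs(similarity, threshold):
    """Count off-diagonal pairs (i < j) whose similarity exceeds the threshold."""
    rows, cols = np.triu_indices(similarity.shape[0], k=1)
    return int((similarity[rows, cols] > threshold).sum())


# Hypothetical test case texts.
docs = [
    "Go to the website",
    "Navigate to the website",
    "Go to the login page",
    "Navigate to the login page",
]
similarity = lsa_similarity(docs, n_topics=2)
print(np.round(similarity, 2))
print("pairs above 0.9:", count_duplicate_pairs(similarity, 0.9))
```

On a corpus this small the latent dimensions carry little meaning; the approach relies on a large test case pack in which related words occur in similar contexts.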

The following are the advantages of test case de-duplication:

  • Identify and remove duplicate test case pairs to create an optimized and unique test case pack.
  • Determine similarity between test cases based on three parameters: test case name, test case description, and test case steps.
  • Approximately 15%–25% of identical and near-identical test cases in the overall test case pack are identified and removed.

About the authors

Venkatesh Babu

Venkatesh Babu is a technology leader with 22+ years of experience in JEE, .NET, Mobile, Cloud, Automation, SMAC, IoT, RPA, AI/ML, and Digital technologies, who has architected, designed, and delivered enterprise solutions for global clients. He works in the Research & Innovation Group and is passionate about emerging technologies. His focus areas include Cloud, Artificial Intelligence & Machine Learning, IoT, Gamification, Gesture Recognition, Augmented Reality, Blockchain, Big Data, Microservices, Design thinking, Solution Architecture & Consulting, Application Integration, Product Engineering, Wearables, Crowdsourcing, and Technology evangelization.

Dr. Raghav Menon

Raghav Menon is passionate about AI/ML, Natural Language Processing, Speech Processing, and Image and Signal Processing. He has a PhD in Signal Processing and several years of research and development experience. At Capgemini, he previously worked with the Cognitive Document Processing team. He is currently attached to the Analytics COE team at Capgemini, where he looks into the application of AI/ML algorithms to software testing, among other areas of application of AI/ML. He has several publications in the areas of AI/ML, Speech and Signal Processing. His last assignment was with the United Nations Global Pulse Labs in Stellenbosch, South Africa.

About Capgemini

Capgemini is a global leader in partnering with companies to transform and manage their business by harnessing the power of technology. The Group is guided every day by its purpose of unleashing human energy through technology for an inclusive and sustainable future. It is a responsible and diverse organisation of 325,000 team members in nearly 50 countries. With its strong 55-year heritage and deep industry expertise, Capgemini is trusted by its clients to address the entire breadth of their business needs, from strategy and design to operations, fueled by the fast-evolving and innovative world of cloud, data, AI, connectivity, software, digital engineering and platforms. The Group reported 2021 global revenues of €18 billion.
