State of AI applied to Quality Engineering 2021-22
Section 4.2: Automate & Scale

Chapter 5 by Capgemini

Experimenting with GPT3 to scale quality engineering

Business ●○○○○
Technical ●●●●○


After successfully transitioning Software Testing to Quality Engineering, we are now industrializing the latter through the use of predictive and prescriptive techniques powered by advanced artificial intelligence and machine learning algorithms.

Many of us have delivered a double-digit number of waterfall programs to date, and it is no secret that they take months or even years to complete. Around 20-25 years ago we ran large engagements and built only a few times a year, which left enough room to perform quality assurance at the conclusion of development. We spent months, even quarters, writing test cases and chasing successful KPIs. Today, in 2021, builds occur multiple times per month, if not per day. We have incorporated sophisticated tools and programming, yet the overall process has remained largely the same. Looking forward, it is obvious that we must step up: intelligent automation should be implemented at scale at each stage of quality engineering; the process should move from waterfall to real-time Agile; and it should adopt a theme: be like water, devoid of all shapes and forms yet agile, adaptable, and resilient. The obvious question is: how? The answer lies in powerful AI and machine learning algorithms that not only automate steps intelligently but also predict and prescribe which modules, phases, and features are more prone to error, and that provide self-learning techniques to eliminate manual intervention and human bias in the quality engineering process.

This chapter discusses a real-world example of Quality Engineering (QE) at Scale using Artificial Intelligence for one of the world's largest financial institutions.

Quality Engineering at Scale using AI

Quality engineering has shifted rapidly from person to process, tester to script, automation to intelligent automation, and manual feedback to self-learning scripts in order to address the scale and time-to-market challenges of delivering solutions in today's fast-paced environment.

Figure: Diagram 1

Let’s explore the topics above through a real example and understand the benefits. We start with the Develop and Deploy phases and conclude with the Monitor & Recalibrate phase.

Develop and Deploy

Quality engineers devote considerable time to identifying the appropriate set of objects and writing a large number of test scripts and test cases. Our case involves a regionally prominent bank that was developing a loan origination and disbursement process. The business wanted an application that would automatically generate scores for leads on a daily basis: customer information was constantly being updated, and the criteria for flagging a lead as pursue or not-pursue could change as additional details became available, altering the probability of sanctioning the loan. Changing data points and features meant changing test cases and test scripts, which created havoc for the team. Artificial intelligence techniques enabled us to generate test scripts autonomously based on both baseline and newly added features. How does it work?

  1. Identify key topics from use cases and requirement documents
  2. Analyze application code using advanced deep learning-based techniques
  3. Generate test cases for data pipelines and modules
  4. Generate test scripts based on acceptance criteria and KPIs
  5. Predict likely bugs and their severity, and prescribe suitable solutions

GPT-3, with 175 billion parameters, was the backbone of all these experiments. The generated test cases matched the requirements and application code quite well.
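As a minimal illustration of how such test-case drafts can be generated, the sketch below calls GPT-3 through the 2021-era OpenAI completions API; the prompt template, the "davinci" engine choice, and the helper names are illustrative assumptions rather than the exact pipeline used on this engagement.

```python
# Hypothetical sketch: drafting test cases from a requirement with GPT-3
# via the (2021-era) OpenAI completions API.
import openai

openai.api_key = "YOUR_API_KEY"  # in practice, injected from a secret store

PROMPT_TEMPLATE = """You are a quality engineer.
Requirement:
{requirement}

Write numbered functional test cases (title, preconditions, steps, expected result):
"""

def generate_test_cases(requirement: str, max_tokens: int = 400) -> str:
    """Ask GPT-3 to draft candidate test cases for a single requirement."""
    response = openai.Completion.create(
        engine="davinci",              # GPT-3 base model (175 billion parameters)
        prompt=PROMPT_TEMPLATE.format(requirement=requirement),
        max_tokens=max_tokens,
        temperature=0.2,               # keep the drafts conservative and repeatable
    )
    return response.choices[0].text.strip()

if __name__ == "__main__":
    req = ("Leads must be re-scored daily; a lead is flagged as pursue or "
           "not-pursue based on the updated probability of loan sanction.")
    print(generate_test_cases(req))
```

In our experience, drafts like these still warrant engineer review before being accepted into the regression suite.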
Every day, newly arriving data brings another challenge: ingesting it and integrating it with the other pieces of the puzzle to deliver business insights in time. To handle this challenge, we used CI/CD pipelines for data ingestion covering:

  1. Build
  2. Test
  3. Release
  4. Deploy
  5. Validation and Compliance

A CI/CD pipeline is the sequence of steps that must be performed to deliver a new version of the software.
With CI/CD pipelines in place, we no longer need to wait for the final ingestion module before integrating with the rest of the project, so the latest data and code are integrated at all times.
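As a rough illustration (not the bank's actual pipeline definition), the five stages above can be orchestrated as a gated sequence in which each step must succeed before the next one runs; the stage commands below are placeholders.

```python
# Illustrative gated CI/CD pipeline for the data-ingestion build: each stage
# must succeed (exit code 0) before the next one runs. Commands are placeholders.
import subprocess
import sys

STAGES = [
    ("build",                   ["make", "build"]),
    ("test",                    ["pytest", "tests/"]),
    ("release",                 ["make", "package"]),
    ("deploy",                  ["make", "deploy"]),
    ("validation & compliance", ["python", "validate_compliance.py"]),
]

def run_pipeline() -> None:
    for name, command in STAGES:
        print(f"--- stage: {name} ---")
        result = subprocess.run(command)
        if result.returncode != 0:
            # Fail fast: a broken ingestion build never reaches production.
            sys.exit(f"Stage '{name}' failed; aborting pipeline.")
    print("Pipeline complete: the latest data and code are integrated.")

if __name__ == "__main__":
    run_pipeline()
```

In practice the same gating is usually expressed in the CI tool's own pipeline definition; the point is simply that every ingestion change passes build, test, release, deploy, and validation automatically.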

Figure: Diagram 2

Now, through AI algorithms, test scripts and test cases are scheduled to run and be accepted automatically as new data arrives.
An AI algorithm analyzed test cases and test logs to identify the most failure-prone functions, each with a probability score, so that appropriate action could be taken and self-healing techniques automated in the future. The AI-based program enables round-the-clock operation of the lead-scoring application, while testers can execute tests whenever required without depending on application execution. Above all, this happens in real time, which improves precision and accuracy.
For this use case, AI algorithms have been integrated into the various stages of the software lifecycle.
Self-learning and self-healing pipelines enabled models to become more intelligent with each consumption and decision.
At the lead-generation stage, given the large number of data sources – structured and unstructured – it is not feasible to ingest and feed everything in order to determine whether intelligent decision making is occurring; doing so would also be a lengthy, expensive, and time-consuming exercise. By analyzing historical project data and QA costs, AI systems were able to forecast the cost of quality across the software lifecycle and the priority of each feature.
AI was extended to predict which modules or components of the software code are likely to contain more defects, assisting in the definition of more effective testing strategies, automating test design, and increasing test coverage. AI also analyzed production data to determine which procedures and functions are used most frequently. Based on this insight, we predicted which changes might be required in specific modules, reducing the time and effort needed to design and execute regression tests that could otherwise have delayed rolling out the solution for final decision making.
By mining patterns from the previous release's defects, root causes, severities, and discovery phases, we identified the defects most likely to occur in the next release and automated the optimization of the relevant test cases and test scripts.
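A minimal sketch of this kind of defect-prediction step is shown below, assuming historical defect records with per-module features such as code churn, complexity, and past defect counts; the column names, file names, and the choice of a gradient-boosting classifier are illustrative assumptions rather than the exact model used on the engagement.

```python
# Hypothetical defect-prediction sketch: rank modules by their probability of
# containing defects in the next release, based on historical release data.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Assumed schema: one row per module per past release.
history = pd.read_csv("release_history.csv")            # illustrative file name
features = ["lines_changed", "complexity", "past_defects", "test_coverage"]
X, y = history[features], history["had_defect"]          # 1 = defect found after release

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = GradientBoostingClassifier().fit(X_train, y_train)
print(f"Hold-out accuracy: {model.score(X_test, y_test):.2f}")

# Score the modules of the upcoming release and prioritize regression testing.
upcoming = pd.read_csv("next_release_modules.csv")       # illustrative file name
upcoming["defect_probability"] = model.predict_proba(upcoming[features])[:, 1]
print(upcoming.sort_values("defect_probability", ascending=False)
              [["module", "defect_probability"]].head(10))
```

Ranked output like this is what lets the team concentrate test design and regression effort on the handful of modules most likely to fail.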

Monitor & Recalibrate phase

We saw firsthand how we achieved QE at scale during development and design by implementing AI techniques. Once the code was deployed to production, another significant challenge awaited us: how frequently should we monitor our solution to ensure that it continues to deliver the expected results in accordance with business objectives? And when do we need to update our solutions with new data, features, or techniques, or even decide to rebuild or retire them?

All of these decisions are certain to require considerable effort and create a reliance on experts. Once multiple solutions are implemented across multiple business processes, countries, and regions, this model becomes unscalable. What, then, is the solution? Consider our case, where we deployed to production and then used AI to automate the actions above.
Refer to the following architecture diagram to see how we enabled auto-monitoring, auto-recalibration, and approval-based training of production solutions.

Figure: Diagram 3

The architecture described above is hosted on the AWS cloud. The first step is inference from the deployed solution: it publishes predicted results and compares them with ground truth (real-world data) to determine accuracy. If accuracy degrades, the solution may need to be re-trained or re-calibrated, but obtaining ground-truth data has its own process time and cannot be done ahead of the actual date/time, posing a clear risk of solution depreciation and failure to meet the expected KPIs. To overcome this, we used statistical techniques to analyze newly arrived data sets. Several techniques were used, including the Kolmogorov-Smirnov (KS) statistic, the Population Stability Index (PSI), the Z-proportion test, and the Wasserstein distance, to alert us to significant changes in the input-data distributions of influential features that would otherwise go unnoticed. All of these tasks were automated and scheduled using DevOps pipelines.
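To make the drift check concrete, the sketch below compares a reference (training-time) sample with newly arrived data for a single influential feature using the KS test, PSI, and Wasserstein distance; the thresholds, bucketing scheme, and synthetic data are common rules of thumb and stand-ins, not values taken from the engagement.

```python
# Illustrative data-drift check for one influential feature: compare the
# reference (training-time) distribution with newly arrived production data.
import numpy as np
from scipy import stats

def population_stability_index(expected, actual, buckets: int = 10) -> float:
    """PSI computed over quantile buckets of the reference distribution."""
    edges = np.quantile(expected, np.linspace(0, 1, buckets + 1))
    # Clip both samples into the reference range so every value falls in a bucket.
    exp_counts, _ = np.histogram(np.clip(expected, edges[0], edges[-1]), edges)
    act_counts, _ = np.histogram(np.clip(actual, edges[0], edges[-1]), edges)
    exp_pct = np.clip(exp_counts / len(expected), 1e-6, None)
    act_pct = np.clip(act_counts / len(actual), 1e-6, None)
    return float(np.sum((act_pct - exp_pct) * np.log(act_pct / exp_pct)))

# Synthetic stand-ins for the feature values (e.g. applicant income).
reference = np.random.default_rng(0).normal(50_000, 12_000, 5_000)
incoming  = np.random.default_rng(1).normal(55_000, 15_000, 5_000)

ks_stat, ks_pvalue = stats.ks_2samp(reference, incoming)
psi = population_stability_index(reference, incoming)
w_dist = stats.wasserstein_distance(reference, incoming)

print(f"KS={ks_stat:.3f} (p={ks_pvalue:.4f}), PSI={psi:.3f}, Wasserstein={w_dist:.1f}")
if ks_pvalue < 0.05 or psi > 0.2:   # common rule-of-thumb alert thresholds
    print("Significant drift detected: flag the solution for re-calibration review.")
```

A scheduled DevOps job running checks like this per feature is what allows re-calibration to be triggered long before delayed ground-truth labels arrive.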

Another difficulty we encountered was managing multiple versions of the same solution in production, with the requirement of dynamically redirecting inputs to the correct version. We used A/B testing on the AWS platform to validate the accuracy of each model on a specific dataset and then auto-redirected data to the best-performing model to generate results for final decision making, as shown below:

Figure: Diagram 4
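The chapter does not name the specific AWS service used for this routing, but one plausible realization is Amazon SageMaker production variants: score each deployed model version on the same labelled dataset, then shift the endpoint's traffic weight toward the better performer. The endpoint and variant names below are placeholders, and the accuracies are assumed to come from an upstream evaluation step.

```python
# Hypothetical A/B routing sketch using Amazon SageMaker production variants:
# pick the model version with the best validation accuracy and send it most
# of the live traffic. All names are placeholders.
import boto3

sagemaker = boto3.client("sagemaker")

def choose_winner(accuracies: dict) -> str:
    """Return the variant with the highest accuracy on the validation set."""
    return max(accuracies, key=accuracies.get)

def redirect_traffic(endpoint_name: str, winner: str, loser: str) -> None:
    """Send 90% of traffic to the winning variant and 10% to the other."""
    sagemaker.update_endpoint_weights_and_capacities(
        EndpointName=endpoint_name,
        DesiredWeightsAndCapacities=[
            {"VariantName": winner, "DesiredWeight": 0.9},
            {"VariantName": loser,  "DesiredWeight": 0.1},
        ],
    )

if __name__ == "__main__":
    # Accuracies would come from scoring each variant on the same hold-out data.
    accuracies = {"lead-scoring-v1": 0.87, "lead-scoring-v2": 0.91}
    winner = choose_winner(accuracies)
    loser = next(v for v in accuracies if v != winner)
    redirect_traffic("lead-scoring-endpoint", winner, loser)
    print(f"Routing most traffic to {winner}")
```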

About the author

Jatinder Kumar Kautish

Jatinder K Kautish is a Director in the Artificial Intelligence Offering, an L3 Certified Chief Architect and IAF Certified, based in Hyderabad, India. He is a regular industry speaker at leading conferences and academic institutes. He received three AI Innovation Awards in 2020-21 and serves on an advisory panel for the Confederation of Indian Industry (Govt. of India) for 2020-21. He has a passion for positioning AI technology at the core of every business and converting solutions into established offerings. Outside of work, you’ll likely find him mentoring academic and NGO technology projects, writing and reciting poems, training ambitious folks in Punjabi folk dance (Bhangra), or enjoying long drives.

About Capgemini

Capgemini is a global leader in partnering with companies to transform and manage their business by harnessing the power of technology. The Group is guided every day by its purpose of unleashing human energy through technology for an inclusive and sustainable future. It is a responsible and diverse organisation of 325,000 team members in nearly 50 countries. With its strong 55-year heritage and deep industry expertise, Capgemini is trusted by its clients to address the entire breadth of their business needs, from strategy and design to operations, fueled by the fast-evolving and innovative world of cloud, data, AI, connectivity, software, digital engineering and platforms. The Group reported global revenues of €18 billion in 2021.

Get the Future You Want | www.capgemini.com
