State of AI applied to Quality Engineering 2021-22
Section 4.2: Automate & Scale

Chapter 6 by Worksoft

Self-healing RPA for Quality Engineering

Business ●●○○○
Technical ●●●○○

Intelligent automation is now a mainstream technology in Quality Engineering (QE) and Robotic Process Automation (RPA), yet its full potential is largely unrealized.

This is primarily because automation tends to be brittle in the face of application and process changes. Another reason limiting automation's return on investment is that automation is implemented in silos and is not intelligently integrated across the development and deployment lifecycle. Addressing these two issues can vastly expand the scale of automation in QE and improve the business outcomes associated with its use. QE and RPA can be used in a symbiotic way to maximize value.

A startling 30-50% of automation projects fail. Intelligent automation experts like Shail Khiyara agree: “It’s in the complex processes that bots become fragile” [1]. The problem does not stem from a lack of interest or investment in automation, but from the inability to derive sustained benefits that are commensurate with the level of investment that companies make when implementing automation.

Automation breaks due to infrastructure-related issues, software reliability, data changes and a host of other issues. And when a bot breaks in production, critical operations and end user experience are compromised.

Why is this so hard, and why do so many automation projects fail outright or only achieve underwhelming results? There are three primary reasons:

  1. The inability to achieve critical mass
  2. The inability to keep the automation working as underlying applications and associated business processes change continuously
  3. The inability to break the silos that exist between QE and RPA efforts and teams

Let’s look at each of these problems separately.


[1] Webinar “Connective RPA – Different by Design”, November 18, 2020 by Worksoft

Reason 1: The Critical Mass Problem

Most companies treat automation in an ad-hoc manner. People identify tasks that intuitively seem like good candidates for automation and then go about automating them on their own. The excitement of initial success then leads others to follow suit, finding additional tasks that are low-hanging fruit for automation. The automation seems to work for a few days, until the application or underlying task changes and it has to be rebuilt. Meanwhile, the IT organization has to contend with uncontrolled bots running amok and is forced to implement restrictions, take them over, or shut them down outright.

During this ad-hoc process, people come to the wrong conclusion that automation can only be applied to stable processes (or applications) and bots cannot be very complicated. This is self-defeating because the investment in infrastructure, software, training and consulting services is not justified to achieve these simple results.

Forrester’s Craig LeClair describes the limitations as a rule of three 5s [2]. First, viable bots should make fewer than five decisions. Anything beyond that gets complex and difficult to maintain and requires some type of embedded rules management.

Second, fewer than five applications should be involved in the entire robotic process. LeClair says the talent of software robots is that they act like humans, or digital workers, but going beyond five applications may break the bot. Finally, an RPA process should involve fewer than 500 clicks. He says anything beyond that is probably not repeatable.
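The three limits described above can be written as a simple screening check. This is a minimal sketch only: the `ProcessCandidate` fields and the function name are hypothetical, and "less than" is read strictly, so a process with exactly five decisions or applications is flagged.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProcessCandidate:
    """Hypothetical summary of a process being considered for RPA."""
    name: str
    decisions: int
    applications: int
    clicks: int


def rule_of_five_violations(candidate: ProcessCandidate) -> List[str]:
    """Return the names of the limits that the candidate exceeds."""
    violations = []
    if candidate.decisions >= 5:       # fewer than five decisions
        violations.append("decisions")
    if candidate.applications >= 5:    # fewer than five applications
        violations.append("applications")
    if candidate.clicks >= 500:        # fewer than 500 clicks
        violations.append("clicks")
    return violations
```

A candidate with an empty result passes the rule; anything else is, by this heuristic, likely to be hard to maintain with a conventional bot.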

But these limitations create real barriers to ROI. For automation to be successful long term, you need to build lots of it and use it to solve complex problems that are meaningful to the business. QE and RPA automation cannot simply turn into glorified macros that we had in the 1980s.

[2] Forrester Report – Use The Rule Of Five To Find The Right RPA Process

Reason 2: The Bot Fragility Problem

Companies use anywhere from a few dozen to a few hundred applications to run their businesses. A good percentage of these applications are consumed from the cloud in a SaaS model. It’s a fact of life that these applications change frequently and the rate of change itself is accelerating.

Customers of packaged applications such as SAP, Oracle EBS, Salesforce, and Workday run many interconnected applications and face frequent releases and updates throughout the year, plus even more frequent patches and fixes for bugs and security concerns. All of this persistent and pervasive change results in a constantly evolving application environment. And that doesn’t even take into account the corporate initiatives and internal development efforts aimed at increasing efficiency and maintaining competitiveness.

Along with rapid application developments, digital business processes themselves are constantly evolving to keep pace with the changing competitive, business, and regulatory environments in which global organizations typically operate.

These changes in turn break automation, requiring frequent updates or re-writes. Updating or rewriting automation can be costly, particularly if you use code-based automation that requires legions of trained software engineers to maintain.

Developing and maintaining bots is prohibitively expensive in the midst of all this change. According to Deloitte, SMEs can pay from $4,000 to $15,000 for a single bot. Forrester reports that 51% of enterprises with RPA programs have fewer than 10 bots in production.

Figure: Reasons why bots break

Companies once again solve these problems in wholly inefficient ways by:

  • Hiring expensive consultants or “automation experts” – although some amount of expertise is necessary, this in and of itself will not solve the problem
  • Building additional complex frameworks on top of bad automation products – if the underlying automation is fragile, no number of frameworks on top will help
  • Limiting the amount of automation that is built by restricting who can build automation and making it hard to use and deploy
  • Resigning themselves to the false premise that automation is just expensive to build and maintain

Reason 3: The Automation Silo Problem

In the majority of businesses, the disciplines of QE and RPA are managed by separate teams using distinct solutions. Careful application of RPA to the overall quality assurance process could yield significant benefits, cost savings, and improved quality outcomes. Conversely, applying quality engineering principles to bot development, so that bots are tested and improved frequently, would produce more resilient production automation.

A better, more holistic overall approach is required to address the critical mass, fragility and automation silo problems from a technology and people perspective. We discuss one such approach in this chapter that includes self-healing automation and the use of RPA to streamline QE processes.

Modeling an Enterprise Application: The Object-Action Framework

Bots interact with applications either via the application front end (browser, desktop app, mobile app, terminal emulator, etc.) or via APIs. The application can be broken down into components as shown below:

Figure: Application components

App – Application name, e.g., Workday, Salesforce, etc.
Version – Version of the application being used
Window – Web page or application screen
Object – An element on the screen, e.g., ‘Post



Figure: Example automation


In the example above, we have an HR application that we want to automate. The automation steps then become:

  • Go to ‘Report Multi-Day Leave’ [Window] page
  • Select [Action] ‘Personal Time Off’ in the ‘Leave Type’ [Object] dropdown
  • Press [Action] the ‘Post Leave’[Object] button

From the pattern above, the automation goes to an application window (‘Report Multi-Day Leave’ page), finds an object (‘Leave Type’ dropdown or ‘Post Leave’ button) and performs an action (select ‘Personal Time Off’ from ‘Leave Type’ dropdown or click ‘Post Leave’ button).

Each of the objects has a type: ‘button’, ‘dropdown’, ‘table’, etc. Depending on the type of object, an action can be performed: ‘click’, ‘select’, ‘verify’, etc. All windows (screens) and elements of an application can be modeled in this way. Some objects are more complicated than others, and therefore have more complex actions, but the pattern of interaction is the same.

The model is always separate from the automation. The automation steps reference the model that is stored separately. The beauty of this approach is that if an object changes, then only the model definition of the object needs to be updated. If that object is used by a thousand automation steps, then all of them get updated automatically.
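The separation between model and automation can be illustrated with a small sketch. All class names, keys, and locator strings here are hypothetical and chosen for illustration only; they are not taken from any product. Steps hold only a key into the model, so changing one model entry changes what every step that references it resolves to.

```python
from dataclasses import dataclass


@dataclass
class ObjectDefinition:
    """One entry in the application model (field names are illustrative)."""
    window: str
    name: str
    object_type: str  # e.g. 'button', 'dropdown'
    locator: str      # how the object is found on screen


# Which actions each object type supports.
ACTIONS_BY_TYPE = {
    "button": {"click", "verify"},
    "dropdown": {"select", "verify"},
}


@dataclass(frozen=True)
class Step:
    """An automation step references the model by key and never stores a locator."""
    object_key: str
    action: str
    value: str = ""


model = {
    "leave_type": ObjectDefinition("Report Multi-Day Leave", "Leave Type", "dropdown", "id=leaveType"),
    "post_leave": ObjectDefinition("Report Multi-Day Leave", "Post Leave", "button", "id=postLeave"),
}

steps = [
    Step("leave_type", "select", "Personal Time Off"),
    Step("post_leave", "click"),
]


def resolve(step, model):
    """Look up the step's object and check that the action is valid for its type."""
    definition = model[step.object_key]
    if step.action not in ACTIONS_BY_TYPE[definition.object_type]:
        raise ValueError(f"{step.action!r} is not valid for a {definition.object_type}")
    return definition.window, definition.locator, step.action, step.value


# The application changes the button's id: only the model entry is edited,
# and every step that references 'post_leave' resolves to the new locator.
model["post_leave"].locator = "id=submitLeave"
```

The steps list is never touched when the application changes, which is the property the paragraph above relies on.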

The trick now is to find good definitions to consistently recognize application objects and update those definitions automatically when the application changes.

Modern object recognition techniques

Recognizing objects on an application screen depends on the technology used to construct the application. In web applications, one can use the DOM (Document Object Model), HTML elements, JavaScript APIs, built-in application IDs, or elements in the frameworks (AngularJS, UI5 from SAP, etc.) used to generate the web pages. In desktop applications built using .NET or Java, introspection techniques that rely on the accessibility features of the underlying technology can be used to identify and interact with objects.

Computer vision is also emerging as a viable technology to recognize objects on a screen. Sometimes a combination of techniques can be used as fail-safe mechanisms in case one technique does not work consistently.
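Combining techniques as fail-safes can be pictured as an ordered chain of strategies, where the next one is tried only if the previous one finds nothing. The sketch below is an assumption about how such a chain might look: the screen is simulated as a list of attribute dictionaries, and the two strategy functions are hypothetical. A vision-based strategy would be one more function in the list and is not implemented here.

```python
from typing import Callable, Dict, List, Optional, Tuple

Element = Dict[str, str]
Strategy = Callable[[List[Element], Element], Optional[Element]]


def by_id(screen: List[Element], wanted: Element) -> Optional[Element]:
    """Match on the technical id, if the stored definition has one."""
    wanted_id = wanted.get("id")
    if wanted_id is None:
        return None
    return next((e for e in screen if e.get("id") == wanted_id), None)


def by_label_and_type(screen: List[Element], wanted: Element) -> Optional[Element]:
    """Fall back to the visible label combined with the object type."""
    return next(
        (e for e in screen
         if e.get("label") == wanted.get("label") and e.get("type") == wanted.get("type")),
        None,
    )


def find_object(
    screen: List[Element], wanted: Element, strategies: List[Strategy]
) -> Tuple[Optional[Element], Optional[str]]:
    """Try each strategy in order; return the match and the name of the strategy used."""
    for strategy in strategies:
        match = strategy(screen, wanted)
        if match is not None:
            return match, strategy.__name__
    return None, None
```

Returning the name of the strategy that succeeded makes it possible to log how often the primary technique fails, which is useful when deciding which definitions need attention.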

For more complex elements, composite objects can be created that combine more than one object and are recognized as a single logical object on which consistent actions can be performed.

Object definitions can be assigned to any object, whether simple or complex. Examples of items that can be assigned a common object definition include: Username Edit Box, Password Edit Box, Login button, Reset Password link, Sales Organization Edit Box, or the Table of Items on a Sales Order.

Having a database of object types and definitions allows new objects to be evaluated against those definitions. Guided classification machine learning techniques can automatically match new objects on the screen and classify them into the appropriate object type based on previous matches stored in the database.

Patterns in a specific application can also be used to train the model for the peculiarity of that application. Partial matching techniques can be used to locate objects even if a full match is not found.
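Partial matching can be approximated with a simple attribute-overlap score. This is a deliberately naive stand-in for the machine learning techniques mentioned above, not a description of them; the attribute names and the 0.6 threshold are assumptions made for the example.

```python
def similarity(stored, candidate):
    """Fraction of the stored attributes that the candidate reproduces exactly."""
    if not stored:
        return 0.0
    same = sum(1 for key, value in stored.items() if candidate.get(key) == value)
    return same / len(stored)


def best_partial_match(stored, candidates, threshold=0.6):
    """Return (candidate, score) for the closest candidate, or (None, score) if too dissimilar."""
    scored = [(similarity(stored, c), c) for c in candidates]
    score, best = max(scored, key=lambda pair: pair[0], default=(0.0, None))
    if score >= threshold:
        return best, score
    return None, score


stored_definition = {"type": "button", "label": "Post Leave", "id": "postLeave", "css": "btn primary"}
on_screen = [
    {"type": "button", "label": "Cancel", "id": "cancel", "css": "btn"},
    {"type": "button", "label": "Post Leave", "id": "submitLeave", "css": "btn primary"},
]
match, score = best_partial_match(stored_definition, on_screen)
```

Here the second element differs only in its id, so it is accepted even though no full match exists. A real implementation would weight attributes differently and learn from earlier matches.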

Sophisticated object recognition is based on previous experience and real-world data collected in the field to make it more robust. This experience and collected data cannot be easily duplicated in a short time.

Extensibility: making the model field configurable for a specific application

Extensibility is a concept that allows object recognition to be tuned to a specific application. Applications have their own way of generating screens/pages and UI objects based on underlying frameworks that their developers use to generate the application user interface.

This means that the model can be deliberately trained to recognize specific patterns in the user interface of the application and then interpret those patterns as specific object types and actions that can be done on them. This training can be accomplished without writing any code, but rather by including definition files that instruct the model on how to understand the application screens or pages it encounters.
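A definition file of this kind might look like the JSON below, which maps patterns in an element's attributes to object types and permitted actions. The file layout, the application name "ExampleHR", and the `xh-` class prefixes are all invented for this sketch; the chapter does not specify a format.

```python
import json
import re

# Hypothetical definition file: no code changes are needed to add a rule.
DEFINITION_FILE = """
{
  "application": "ExampleHR",
  "rules": [
    {"attribute": "class", "pattern": "^xh-combo", "object_type": "dropdown",
     "actions": ["select", "verify"]},
    {"attribute": "class", "pattern": "^xh-btn", "object_type": "button",
     "actions": ["click", "verify"]}
  ]
}
"""


def load_rules(text):
    """Parse the definition file and pre-compile each rule's pattern."""
    rules = json.loads(text)["rules"]
    return [
        (r["attribute"], re.compile(r["pattern"]), r["object_type"], r["actions"])
        for r in rules
    ]


def classify(element, rules):
    """Return (object_type, actions) for the first rule that matches, else None."""
    for attribute, pattern, object_type, actions in rules:
        if pattern.search(element.get(attribute, "")):
            return object_type, actions
    return None
```

Because the rules live in data rather than code, tuning recognition for a new application framework means shipping a new file, which matches the "without writing any code" claim above.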

This explicit training is very useful because it ‘sets the stage,’ so to speak, with a baseline on which the implicit training and pattern recognition can build, which greatly speeds up the process.

Using both explicit and implicit training ensures that object recognition is more consistent and therefore resilient to changes in the underlying application. For example, object recognition across multiple entirely different versions of an application can now be automated without requiring the automation builder to make any changes when they transition from one version of the application to another.

Updating application models dynamically

In the previous sections we described how to model an application into its constituent windows, objects and actions. We further explained how objects can be consistently recognized and acted upon. Once this is done, it then becomes possible to update those definitions dynamically by recognizing changes in an application screen, matching those changes to the stored definitions, and then updating the definitions incrementally and continuously as minor changes are observed.

This is done while the automation is executing. Instead of just carrying out the requested actions, the automation engine now ‘looks around’ every time it lands on a screen or a page to see if new objects have been added or if its previous definitions of existing objects need to be updated. Using pattern recognition and partial matching techniques, it can identify objects that are ‘similar’ to its stored definitions and update them to match the changes it observes. This ‘looking around’ can be done each time the automation executes, or by executing the automation in a separate ‘update mode’ to avoid possible performance hits. The preferred method is to perform it each time the automation executes.

When automatic matching is not possible, the execution engine can prompt the user to provide a ‘guided match’ by explicitly assigning an observed object to a previous one. This helps train the model for the next time to become better and better as more executions are carried out.
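The update loop described above, including the fallback to a guided match, can be sketched as follows. This is an illustrative assumption, not the engine's actual algorithm: definitions and screen elements are plain dictionaries, the scoring is a simple attribute overlap, and `ask_user` is a hypothetical callback standing in for the prompt shown to the user. Discovering entirely new objects is left out for brevity.

```python
def score(stored, seen):
    """Share of attributes (from either side) on which the two descriptions agree."""
    keys = set(stored) | set(seen)
    if not keys:
        return 0.0
    return sum(stored.get(k) == seen.get(k) for k in keys) / len(keys)


def heal(definitions, screen, threshold=0.5, ask_user=None):
    """Update stored definitions in place from the current screen; return a log."""
    log = []
    for name, stored in list(definitions.items()):
        if stored in screen:
            continue  # exact match, nothing to heal
        best = max(screen, key=lambda seen: score(stored, seen), default=None)
        chosen, how = None, "unresolved"
        if best is not None and score(stored, best) >= threshold:
            chosen, how = best, "auto"          # similar enough: update silently
        elif ask_user is not None:
            chosen = ask_user(name, screen)     # guided match supplied by a person
            how = "guided" if chosen is not None else "unresolved"
        if chosen is not None:
            definitions[name] = dict(chosen)    # the new definition is used next run
        log.append((name, how))
    return log
```

Each guided answer is stored as the new definition, so the same question should not need to be asked on the next execution.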

Process Capture: Solving the Process Change Problem

The previous sections focused on automation resiliency to handle application changes. But what if the process itself changes? In this section, we will discuss techniques that can be used to make the automation resilient to business process changes.

Figure: Complex business processes can have a number of process variants and activity variants and the ability to recognize those patterns is key.

As shown in the figure above, a business process consists of a sequence of business activities, each of which is in turn a sequence of steps. Business processes can have variants, each variant being an ordered set of business activities. Business activities in turn can have variants, each variant being an ordered set of steps required to perform the business activity.

Software can be used to capture business activities by recording all the actions that an end user performs against an application to complete a task (e.g., entering a sales order in SAP). The business activity and its associated set of steps then form a reusable component in the automation.
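The process, activity, and variant hierarchy described above maps naturally onto a few small data structures. The sketch below uses hypothetical class names and represents each step as a plain string; `capture` shows how a recorded step sequence could be stored as a reusable activity variant.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ActivityVariant:
    steps: List[str]  # ordered steps that perform the activity


@dataclass
class BusinessActivity:
    name: str
    variants: List[ActivityVariant] = field(default_factory=list)


@dataclass
class ProcessVariant:
    activities: List[str]  # ordered names of business activities


@dataclass
class BusinessProcess:
    name: str
    variants: List[ProcessVariant] = field(default_factory=list)


def capture(activity: BusinessActivity, recorded_steps: List[str]) -> ActivityVariant:
    """Store a recorded step sequence as a variant unless an identical one exists."""
    for variant in activity.variants:
        if variant.steps == recorded_steps:
            return variant
    variant = ActivityVariant(list(recorded_steps))
    activity.variants.append(variant)
    return variant
```

Recording the same sequence twice returns the existing variant, so repeated captures by business users do not inflate the model.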

Use case study

A sustainable technologies leader chose to implement automated testing for their ERP and other systems across their 30 locations worldwide to accommodate aggressive timescales for going live. They engaged scalable automation to accelerate business process discovery, reduce test development costs, and save time overall.

“We actually had the Worksoft solution up and running within a week, because the platform is very lean in terms of infrastructure and very rich in functionality,” said the ERP Project Delivery and Solution Governance Manager. “Prior to that, we were looking for each deployment to take at least two months of manual regression testing. But now, we can have it done, tested within a week, signed off and approved, ready for deployment.” 

Since process understanding is critical to effective test automation, the platform also enabled them to automatically document their processes 300% faster, establishing a foundation for automation data while meeting quality assurance and compliance requirements.

“Process Capture is probably the hidden gem of the Worksoft solution itself because it allows you to get the business users to actually document as per the normal day-to-day operations to capture those processes,” he said. “We’re able to actually use the tool to capture the processes and then automate them as part of the next cycle. And the real value of that is by actually shifting it to the left, we’re able to catch those issues at an earlier stage before they impact production.”

Now imagine that a different set of steps needs to be performed for the same business activity. When that happens, the software captures the new steps and matches them to the previously stored set of steps for that business activity. The partial matching techniques here are similar to those used to update the object definitions that we discussed previously. Instead of updating an object definition in the application, we update an automation component associated with a particular business activity. The next time the process automation is run, the new set of steps is substituted for the old one, automatically fixing the automation.

Of course, this can get complicated depending on the business activity variants and on whether to match a particular variant or record a new one, but here again guided classification machine learning techniques can be used to identify the right ‘like’ activity to be replaced.
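The decision between replacing an existing variant and recording a new one can be sketched with sequence similarity. This example uses Python's standard `difflib.SequenceMatcher` as a simple stand-in for the machine learning techniques mentioned above; the function names, step strings, and 0.6 threshold are assumptions for illustration.

```python
from difflib import SequenceMatcher


def closest_variant(stored_variants, captured_steps, threshold=0.6):
    """Return (index, ratio) of the most similar stored variant, or (None, ratio)."""
    best_index, best_ratio = None, 0.0
    for index, steps in enumerate(stored_variants):
        ratio = SequenceMatcher(None, steps, captured_steps).ratio()
        if ratio > best_ratio:
            best_index, best_ratio = index, ratio
    if best_ratio < threshold:
        return None, best_ratio
    return best_index, best_ratio


def heal_activity(stored_variants, captured_steps, threshold=0.6):
    """Substitute the captured steps for the closest variant, or add a new variant."""
    index, _ = closest_variant(stored_variants, captured_steps, threshold)
    if index is None:
        stored_variants.append(list(captured_steps))  # no 'like' variant: record a new one
        return len(stored_variants) - 1
    stored_variants[index] = list(captured_steps)     # replace the old steps
    return index
```

A captured sequence that only adds or changes a step or two scores close to an existing variant and replaces it, while an unrelated sequence falls below the threshold and is stored as a new variant.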

Putting it all together: self-healing automation

Application models combined with business process models can be used very effectively to solve the automation resiliency problem. One addresses application variability and the other process variability. Both of these are required simultaneously to be effective.

Capturing business processes either explicitly or implicitly allows business users to be involved in creating and updating automation in a way that they understand. It does not put a ‘technical’ burden on end users to provide process information that is extremely valuable. On the other side, as applications change, the automation ‘learns’ new ways to identify objects to interact with. Both of these use previous data models and continuous insights to update and evolve, constantly ensuring that automation does not break, no matter how often things change.

Using RPA to Enhance QE and Vice-Versa

AI driven RPA can be used very effectively for QE use cases. QE use cases in turn can result in better RPA. Starting with a use case, we will examine some specific areas to consider.

Use case

Enterprises are using change-resilient automation for a variety of use cases. A leading manufacturing client reports that the reusability of object-driven automation is really useful due to the ability to repurpose and reuse scripts built for testing in production and in other processes.

The senior QA manager says “If you do it properly, you can reuse your test scripts in production or in other processes. That's where the RPA part comes in and that's where the regression testing comes in. And it’s not just bing, bang go and do a whole lot of things for one project and move forward and never use that automation again. That’s wasted time, energy and costs. We reuse automation constantly and it works really well.”

The company uses robotic process automation for a wide variety of multi-layered functions, including user experience management, documentation of custom development, master data creation for reporting, and most recently user verifications.

“We've just recently started doing verification of our roles in SAP and audit verification of our roles for the RPA side. And it's like 10,000 checks. We're using RPA to do the checks because we don't think that the human can do all those checks. In those tedious repetitions that you're going to miss things. So give that to a robot to do.”

Process Intelligence – Figuring out what to test

RPA efforts always begin with process understanding. This can be effectively used for automated testing. By merging data from back-end systems with end user behavior, a mix of techniques such as process mining, task mining, and other AI-based approaches can re-create process flows, process variants, and process documentation. This information is invaluable when constructing test scenarios. Instead of using process intelligence separately and only for RPA, this can be very effectively used for QE if done in the right way. For instance, the same team and solution that powers RPA may also power the QE lifecycle, thereby automating both pre- and post-production processes, effectively killing two birds with one stone.

Setting up systems, configurations and data

RPA use cases can be applied to an array of pre-production scenarios. For example, most QE processes require:

  1. Setting up and configuring test systems – This can be done by RPA bots very effectively
  2. Loading appropriate test data from various sources – Once again RPA bots can be designed to do this by gathering information from various sources and ensuring that systems are pre-populated with this information. Sometimes this can be done by using the system itself to create the appropriate information that is consumed in a test, such that the test creates what is necessary for it to run and then uses it to validate the required business rules

Understanding root causes of process failures

Intelligent RPA bots can be used to ‘read’ and ‘analyze’ test results and use AI-based pattern matching to understand root causes of problems or systemic failure patterns, which can then be highlighted and addressed quickly across complex application and business process landscapes.
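As a rough illustration of surfacing systemic failure patterns, the sketch below groups failed test results by a normalized error signature. It uses plain regular expressions rather than any AI technique, and the result record layout (`status`, `error`) is an assumption for this example.

```python
import re
from collections import Counter


def signature(message):
    """Replace volatile details (quoted values, numbers) so similar failures group together."""
    message = re.sub(r"'[^']*'", "'<value>'", message)
    message = re.sub(r"\d+", "<n>", message)
    return message.strip().lower()


def failure_patterns(results):
    """Count failed results per signature, most frequent first."""
    counts = Counter(signature(r["error"]) for r in results if r["status"] == "failed")
    return counts.most_common()


results = [
    {"test": "t1", "status": "failed", "error": "Timeout after 30 s waiting for 'Post Leave'"},
    {"test": "t2", "status": "failed", "error": "Timeout after 45 s waiting for 'Save'"},
    {"test": "t3", "status": "passed", "error": ""},
    {"test": "t4", "status": "failed", "error": "Order 4711 not found"},
]
patterns = failure_patterns(results)
```

In this sample, the two timeouts collapse into one signature, which points to a shared cause such as a slow environment rather than two unrelated defects.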

Better automation through re-use (QE – RPA and RPA – QE)

If we take things to their logical conclusion, there should really be no difference in automation that is built for RPA and for QE purposes. The same component, for example ‘creating an order entry’ could be used in production to do the work via RPA and be used for testing whether an order was successfully created based on the desired business outcomes. Having the same team as well as the same solution do this ensures that any RPA automation candidate is thoroughly tested in pre-production before being deployed. If changes occur in the application or process, that bot component will automatically go through testing and will be replaced in production with the new version.
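The idea of one component serving both purposes can be sketched as follows. The in-memory "system" dictionary and all function names are hypothetical stand-ins; a real component would drive the application through its user interface or APIs.

```python
def create_order(system, customer, item, quantity):
    """Shared component: performs the business activity against whichever system it is given."""
    order_id = len(system["orders"]) + 1
    system["orders"][order_id] = {"customer": customer, "item": item, "quantity": quantity}
    return order_id


def test_create_order():
    """QE use: run the component against a test system and check the business outcome."""
    test_system = {"orders": {}}
    order_id = create_order(test_system, "ACME", "widget", 3)
    assert test_system["orders"][order_id] == {"customer": "ACME", "item": "widget", "quantity": 3}


def run_bot(production_system, requests):
    """RPA use: the same component does the work in production."""
    return [create_order(production_system, *request) for request in requests]
```

Because both callers share `create_order`, any change to the component is exercised by the test before the bot uses it in production.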

About the author

Shoeb Javed

Shoeb Javed is chief strategy and product officer of Worksoft Inc., where he is responsible for business strategy, product management, product marketing, technology alliances, and M&A evaluations. He works with CIOs, business leaders, and digital transformation professionals from the world’s Global 5000 corporations to meet their strategic goals and drive success. With Shoeb Javed’s help, these companies are adopting process lifecycle automation as the new industry standard and, in doing so, achieving accelerated project timelines, reduced costs, and improved operational efficiencies. During his career, Shoeb Javed has successfully led the development of next-generation enterprise software, digital media, and converged telecommunications solutions. Prior to Worksoft, Shoeb Javed was CTO of Variview Technology, and he has also held leadership roles at Ericsson, M68 Technologies, Vesta Broadband Services, and Intelect Network Technologies. Shoeb Javed holds an MS in Electrical Engineering from the University of Hawaii, and a BS in Electronic Engineering from the University of Nagpur, India.

About Worksoft

Worksoft provides Connective Automation for the world’s leading global enterprises, automating the full lifecycle of a business process from process intelligence to testing to RPA. Our codeless automation empowers business users and IT to accelerate automation and arms organizations with process data insights to prioritize automation efforts and extend the value into RPA for maximum efficiency and scalability. With Worksoft, enterprises can speed project timelines and ensure data-driven quality for their complex end-to-end business applications, including SAP, Oracle, Salesforce, Workday®, SuccessFactors, ServiceNow, and more. Recognized by leading Global Systems Integrators as the market’s choice for large-scale continuous enterprise automation, Worksoft is embedded into their ERP practices to enable their Agile, DevOps, and SAFe methodologies and accelerate digital transformation.
