State of AI applied to Quality Engineering 2021-22
Section 2: Design

Chapter 3 by Worksoft

The power of process intelligence

Business ●●●○○
Technical ●●○○○


How do you choose what to automate, and when? For most organizations, information and insight live in disparate sources, and true knowledge requires more than any single one of them. When it comes to building a deep and dynamic understanding of business processes, multiple data sources are necessary to gain actionable insights, and the only way forward is to combine these sources with AI/ML.

The Process of Processes Problem

It takes an acceptance of the imperfect to manage the processes that manage processes. Process mining, task mining, test automation, and robotic process automation (RPA) all approach process efficiency differently. Each has strengths in its methods, and each has discrete weaknesses that one must accept when implementing it.

  • Process mining is excellent at revealing the truth underneath processes but lacks visibility into the human steps leading up to committed states of data.
  • Task mining excels at capturing human-driven variations but is unable to dive beneath the variants to reveal hidden pain points and system inefficiencies.
  • Test automation provides essential development stability but is obscured from what is really happening in production.
  • RPA maximizes a workforce by building a digital one but is fragile if unsupported.

Utilizing the strengths of each approach to compensate for the shortcomings of another will significantly improve overall process understanding and management performance. However, before we can profit from these combined data sources, they must first be aligned.

A United Framework

Ensuring that an activity captured by task mining aligns with the committed data for the same activity from process mining, with the corresponding automated test activity, and with the matching RPA process result is much easier said than done. Gaining holistic insight into processes is best accomplished by ingesting the data into a centralized database and then intelligently aligning the disparate process data sets using the appropriate AI/ML algorithms.

A number of factors need to be considered when extracting data from multiple tools. Event-level terminology is one: each tool defines events according to its own process improvement priorities. A data strategy for the broader collection of activities that compose business processes must also be developed; although the various tools may share a common larger structure, converting the data into as similar a composition as possible aids alignment. To overcome these two challenges, the Token Set Ratio and Levenshtein Distance algorithms are our heroes.

Token Set Ratio algorithm

Alignment begins as data is processed. AI/ML is used to group the data into like sets, and as similarities are discovered, the different data sets begin to align for future use in analysis. Activity names are analyzed first to find activities and processes with a high likelihood of similarity. For example, the following activity names all refer to the same activity across different data sets:

  • Create outbound delivery with reference to sales order
  • Create outbound delivery without order reference
  • OutboundDeliveryERP WithReferenceToSales OrderCreateRequest Confirmation_In
  • VL01N

The Token Set Ratio algorithm can do the initial work of aligning these named activities. When applied, it will weigh the following words from the example as strong matches: Outbound, Delivery, Sales, and Order. Additional similarities will be found and weighed accordingly through the use of “Reference” and “Ref.” As the remaining words do not appear in many of the activity names, they will carry less weight. From this comparison of common and specific activity terms, an average similarity score is calculated for each activity, which can then be used to decide whether it should be associated with the other activities. Based on a previously defined similarity threshold, the first three activity names will have a very high similarity score and thus be grouped together for analysis.

The fourth entry, VL01N, will not match at all and will appear as a unique activity. To assist, a reference table of friendly names can be used to aid the matching algorithm. The Token Set Ratio algorithm can reference a “VL01N = Create Outbound Delivery with Reference to Sales Order” entry, so VL01N will also receive a high similarity score when processed. The end result of this example is that all four activities are grouped together under the same activity name.
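The matching described above can be sketched with Python's standard library alone. This is a minimal, illustrative implementation of a Token Set Ratio comparison, not Worksoft's production algorithm: the friendly-name entry for VL01N comes from the text, while the CamelCase normalization rule and the exact ratio formula are assumptions made for the sketch.

```python
# Minimal Token Set Ratio sketch using only the Python standard library.
import re
from difflib import SequenceMatcher

# Friendly-name lookup; the VL01N mapping is the one cited in the text.
FRIENDLY_NAMES = {
    "VL01N": "Create Outbound Delivery with Reference to Sales Order",
}

def tokens(text: str) -> set[str]:
    """Normalize CamelCase, punctuation, and case into a set of tokens."""
    text = FRIENDLY_NAMES.get(text, text)
    text = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", text)  # split CamelCase
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def ratio(a: str, b: str) -> float:
    """Character-level similarity on a 0-100 scale."""
    return SequenceMatcher(None, a, b).ratio() * 100

def token_set_ratio(a: str, b: str) -> float:
    """Compare the sorted token intersection against each full token set."""
    ta, tb = tokens(a), tokens(b)
    inter = " ".join(sorted(ta & tb))
    s1 = (inter + " " + " ".join(sorted(ta - tb))).strip()
    s2 = (inter + " " + " ".join(sorted(tb - ta))).strip()
    return max(ratio(inter, s1), ratio(inter, s2), ratio(s1, s2))

activities = [
    "Create outbound delivery with reference to sales order",
    "Create outbound delivery without order reference",
    "OutboundDeliveryERP WithReferenceToSales OrderCreateRequest Confirmation_In",
    "VL01N",
]
for name in activities[1:]:
    print(f"{name!r}: {token_set_ratio(activities[0], name):.0f}")
```

Run on the four example names, the second and third entries score highly against the first, and VL01N joins the group once the friendly-name table resolves it to its full activity name.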

Levenshtein Distance algorithm

The next level of alignment involves taking chains of grouped activities and categorizing them into appropriate business processes. While naming algorithms such as the Token Set Ratio can be used at the business process name level to assist in this goal, greater accuracy can be obtained by using a string-matching process. To illustrate, let’s look at the two order-to-cash business process flows shown in the figure below:

These two process flows are intended to accomplish essentially the same function, completing an order-to-cash process. The first flow shows a total of five steps:

  1. Create Purchase Order
  2. Purchase Orders by Vendor
  3. Enter Incoming Invoices
  4. Create Outbound Delivery
  5. Create Sales Order

The second flow is similar, but has distinct variations in the process and a total of six steps:

  1. Create Purchase Order
  2. Purchase Orders by Vendor
  3. Sales Order Item Price Increase
  4. Enter Incoming Invoices
  5. Create Outbound Delivery
  6. Create Sales Order
String matching process

Conducting a string-matching process on the chain of activities will help you identify process variants, which is critical.

Levenshtein Distance can be used to calculate a similarity score between the two chains. This is done by measuring the minimum number of operations needed to change Flow A into Flow B and vice versa. In this example, the difference between Flow A and Flow B is the Sales Order Item Price Increase activity: Flow A would need this activity added to match Flow B, and conversely Flow B would need this activity removed to match Flow A. The distance between the two flows is one activity. Thus, applying the Levenshtein Distance algorithm yields a high similarity score and the desired classification of the flows as related business processes.

Method to Method AI/ML

These applications of AI/ML, along with additional methods, prove very effective in aligning process data from multiple method sources. For instance, algorithms can improve the match between test automation, RPA, and task mining data by examining the individual steps taken within each operation. Greater similarity between activity steps, combined with activity name similarity, means an even greater likelihood that the activities are the same. The additional matching results can then inform the broader AI/ML methods, producing better groupings at higher levels across all process data.
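The blending of step-level and name-level evidence might be sketched as a weighted score. The function below is purely illustrative: the 0.4/0.6 weights and the grouping threshold are assumptions for the sketch, not values from the text.

```python
# Hypothetical blend of name-level and step-level similarity (both on 0-100).
def combined_score(name_sim: float, step_sim: float,
                   name_weight: float = 0.4) -> float:
    """Weighted average of activity-name and step-by-step similarity."""
    return name_weight * name_sim + (1 - name_weight) * step_sim

# Two records whose names match only moderately but whose recorded steps
# match strongly can still clear a grouping threshold.
GROUPING_THRESHOLD = 80.0  # illustrative
score = combined_score(name_sim=72.0, step_sim=95.0)
print(score > GROUPING_THRESHOLD)
```

The design point is simply that step-level agreement can rescue a borderline name match, which is how the additional matching results feed the higher-level groupings.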

Achieving Process Enlightenment

Once the Token Set Ratio and Levenshtein Distance algorithms are applied and the data is aligned:

  • Process mining capabilities may help distinguish between system and user problems, illustrate poorly understood and reported processes, and identify activities occurring both inside and outside the enterprise ecosystem. Frequency and ROI data from process mining can help drive which activities to add next to test automation. RPA performance metrics, RPA ROI, and RPA target ranking can also be strengthened by process mining.
  • Task mining data boosts process mining data by layering together a complete as-is picture of current production processes. The step-by-step activities captured by task mining can also be used as blueprints for building automated tests and RPA processes, including uncommon paths and edge-case flows, creating additional stability through the extension and reuse of actual production tasks.
  • Test automation combined with process mining and task mining data can capture the before and after ROI gains of automation implementation. This combination can also track automated test coverage as well as be used as a foundation for future RPA, lowering performance risks and thus building more confidence in a digital workforce.
  • RPA data can supplement process mining by allowing for comparisons of digital workforce output to manual processes. This information can then be combined with task mining data to prioritize the next set of RPA goals based on stability and future ROI. RPA run failures can feed back into test automation to assist in building more continuous stability.

By uniting the data from each method, weaknesses fade, giving rise to a deeper, more comprehensive understanding of process, and direction emerges, guiding and shaping a roadmap for future optimizations. Unified data is the foundation of true process intelligence.

Process Enlightenment

When combined, the four quadrants of process data supplement one another to achieve process enlightenment.

Process Understanding Magnified

To better illustrate the benefits of joining the data, let’s trace a high-value experience. A leading manufacturing organization begins its process improvement journey by using a process mining tool to identify process landmarks, analyze process deficiencies, and gather evidence of what is really happening in the committed data. After the deployment of a task mining solution, its data is combined with the process mining data to gain near-complete transparency over current processes. With nowhere to hide, knowledge gaps, poor documentation, and other process inefficiencies become clearer targets for improvement. While current processes are being improved, a prioritization list is forming.

By analyzing the frequency, duration, and cost of production processes, along with the different process variants occurring for similar processes, a list of test automation targets is being built – prioritized by ROI. A test automation product is implemented, showing quick turnaround gains by reusing the same step-by-step process data captured from the task mining solution.

With a broad and comprehensive set of test automation coverage, built from actual production understanding and practices, this manufacturing organization confidently begins building their digital workforce. Utilizing the same ROI metrics for process and task mining that were used to prioritize test automation, the company now incorporates test automation data to ensure that RPA goals are also completely supported by rigorous automated tests.

Since no process optimization effort is ever complete, the organization drives continuous improvement through the insights gained from the shared process data. By continuously tracking process flows, defining and prioritizing new test automation goals, and confidently extending RPA, the company has created a 360-degree world of holistic process knowledge: true process intelligence. None of this would be possible without the use of AI/ML.

The Viability of Visibility

Whether the data comes from process mining, task mining, test automation, or RPA, combining and aligning it through AI/ML is not only possible but essential when evolving to an advanced process intelligence approach. Achieving complete visibility of enterprise processes is attainable and sustainable as long as one has the tools necessary to align the data and surface insights from it. The insights anchored by AI/ML can shape the speed, effectiveness, and extent of the enterprise’s return on investment.


About the author

Chris Bodam


Chris Bodam is the Director of Product Development at Worksoft. He has spent the last ten years building big data analytic APIs, designing interactive performance analysis tools, and streamlining development processes leveraging scrum and test automation.

About Worksoft

Worksoft provides Connective Automation for the world’s leading global enterprises, automating the full lifecycle of a business process from process intelligence to testing to RPA. Our codeless automation empowers business users and IT to accelerate automation and arms organizations with process data insights to prioritize automation efforts and extend the value into RPA for maximum efficiency and scalability. With Worksoft, enterprises can speed project timelines and ensure data-driven quality for their complex end-to-end business applications, including SAP, Oracle, Salesforce, Workday®, SuccessFactors, ServiceNow, and more. Recognized by leading Global Systems Integrators as the market’s choice for large-scale continuous enterprise automation, Worksoft is embedded into their ERP practices to enable their Agile, DevOps, and SAFe methodologies and accelerate digital transformation.
