State of AI applied to Quality Engineering 2021-22
Section 4.1: Automate & See

Chapter 1 by Tricentis

Making in-sprint UI Test Automation a reality

Business ●●●●○
Technical ●○○○○

In this chapter, we’ll share how vision-based UI test automation finally makes in-sprint test automation a reality for DevTest teams building bespoke/custom applications. To conclude, we’ll consider the prospect of autonomous testing: how it’s likely to play out and what to watch for.

Before we start, I want to emphasize that even the most sophisticated and effective AI-driven testing does not, will not, and cannot replace testers. It was never intended to. AI-driven testing elevates the role of the tester by enabling the tester to focus on the challenging analytical and investigative work that attracted them to the profession in the first place. It also helps the business reduce risks while accelerating application delivery. Ultimately, it’s all about bringing vital information to light so teams can rapidly release amazing products that grow the business.

In the introduction to this section - which we encourage you to read before this chapter - we shared the perspective of many people that UI testing is relatively intuitive but tedious. The next generation of automation is deep-learning-driven automation that can see and use the UI like a human would—independent of the underlying technology. Human behavior is simulated using various AI and machine learning strategies — for example, deep convolutional neural networks combined with advanced heuristics — to deliver stable, self-healing, platform-agnostic UI automation.

At Tricentis, we have been researching how to put this kind of deep-learning-driven automation into practice: observing and interacting with the user interface the way a human would, regardless of the underlying technology.

One such implementation of this technology is Tricentis Vision AI. With this approach, UI elements are identified based on their appearance rather than their technical properties. It makes no difference whether a single UI element is redesigned, or the entire application is rewritten in a new technology. Like a human, the automation will simply adapt and figure it out. By utilizing machine learning to perceive and steer a UI in the same way that a human user would, you can ensure that your automation is as adaptable as the human brain. If you can see it, Vision AI has the capability to automate it. This might be an app that utilizes now-deprecated technologies, an app that utilizes new technologies, or an app that you access remotely. You can even begin automating tests with mockups or whiteboard designs.
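To illustrate the difference between locating an element by appearance and locating it by technical properties, here is a minimal sketch. It is not Vision AI's API: the `Detection` record, the `find_by_appearance` helper, and all coordinates are hypothetical stand-ins for whatever a visual detector might report about a screenshot.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Detection:
    """One control found on a screenshot (hypothetical detector output)."""
    kind: str     # visual category, e.g. "button" or "textbox"
    text: str     # visible caption read from the screen; may be empty
    x: int        # left edge in pixels
    y: int        # top edge in pixels
    width: int
    height: int

    def center(self) -> Tuple[float, float]:
        """Point that a simulated click would target."""
        return (self.x + self.width / 2, self.y + self.height / 2)


def find_by_appearance(
    detections: List[Detection], kind: str, text: str
) -> Optional[Detection]:
    """Return the first detection with the given kind and visible caption."""
    wanted = text.strip().lower()
    for detection in detections:
        if detection.kind == kind and detection.text.strip().lower() == wanted:
            return detection
    return None


# The same login screen before and after a redesign (made-up coordinates).
old_screen = [
    Detection("textbox", "", 100, 80, 200, 30),
    Detection("button", "Login", 100, 130, 80, 30),
]
new_screen = [
    Detection("button", "Login", 300, 400, 120, 40),
    Detection("textbox", "", 300, 340, 240, 40),
]

for screen in (old_screen, new_screen):
    button = find_by_appearance(screen, "button", "login")
    print(button.center() if button else "not found")
```

The lookup never mentions an element ID, a DOM path, or a UI framework, so the same test step keeps working when the button moves or the screen is rebuilt in another technology, as long as it still looks like a "Login" button.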

This is accomplished through the use of intelligent object detection technology to distinguish user interface elements. While this is a novel technique in software testing, firms like Tesla use it to detect objects (other cars, pedestrians, signs, stoplights, and trees) for self-driving cars. Why would you take self-driving vehicle technology and apply it to test automation? Because UI automation needs the same combination of speed and accuracy. A self-driving automobile must perceive objects correctly in real time. Any delay will result in accidents, and may even result in death. Using the same fast and precise approach to intelligent object detection as self-driving automobiles, Vision AI processes 40 frames per second (vs. 1.8 fps with other tools and 24 fps with the human eye).
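To put these frame rates in perspective, the time available to process one frame is the reciprocal of the rate. The short calculation below uses only the three figures quoted above; the function name is illustrative.

```python
def frame_budget_ms(frames_per_second: float) -> float:
    """Return the time available to process one frame, in milliseconds."""
    if frames_per_second <= 0:
        raise ValueError("frames_per_second must be positive")
    return 1000.0 / frames_per_second


# Frame rates quoted in the text above.
rates = {"Vision AI": 40.0, "human eye": 24.0, "other tools": 1.8}

budgets = {name: frame_budget_ms(fps) for name, fps in rates.items()}
for name, milliseconds in budgets.items():
    print(f"{name}: {milliseconds:.1f} ms per frame")
```

At 40 fps each frame is handled in 25 ms, whereas 1.8 fps corresponds to roughly 556 ms per frame, which is too slow to follow a UI that changes while the test is running.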

In developing Vision AI, we took intelligent object detection technology and adapted it to detect controls and understand user interfaces. Rather than checking for pedestrians, signs, and stoplights, it can look for dropdowns, tables, lists, and menus; in fact, any control that a person can identify. Detecting the controls, however, is only one aspect of the problem. We must also read the screen in real time. This is where a novel approach to optical character recognition (OCR) comes into play. OCR has been around for decades, yet it is still somewhat slow. Even with industry-leading OCR, reading a screen takes seconds, but navigating UIs like a person requires real-time character recognition that reacts in milliseconds. That is why we created an entirely new category of optical character recognition powered by AI.

However, as with self-driving automobiles, simply seeing and understanding things is insufficient. You also need to drive. Vision AI was never intended to be a “test automation tool,” but rather a “test automation engine.” As such, it must be integrated with something capable of performing basic functions such as test data management and test case design. Consider Vision AI as a way to boost the performance of your existing tool suite rather than requiring an entirely new set of tools. With this additional layer of intelligence, your automation is intelligent enough to operate through the great majority of UI changes: the kind of changes that invariably trip up traditional automation but would not trouble a human.

The real value to quality engineering

Here are just a few of the ways this AI-driven approach adds real value to your quality engineering practice:

You can build UI automation before a UI even exists

Vision AI can take a simple definition (like a textual requirement definition, a mockup, or even a whiteboard drawing) and generate a running automated test case from it.

Your tests can withstand app modernization

When you modernize your application, the automation provided by Vision AI remains intact. And I’m not referring to a simple change, such as altering some technical identifiers. I’m referring to significant changes such as switching from one technology to another or migrating your JavaScript library from an older jQuery UI to Angular Material Design.

Figure: Automation provided by Vision AI

Citrix? Customizations? Old tech? New tech? No problem.

If you can see it, Vision AI can automate it. Vision AI works on any visual interface. Because it relies only on what is rendered on screen, it works with Citrix and Remote Desktop, even when you have no direct connection to the machine being automated and are only viewing it through, for example, an RDP session. It works on interfaces built with technology that is no longer current, such as out-of-date/deprecated versions of Gupta (we recognize that these technologies are no longer fashionable, but they are still in use today). It also works with cutting-edge technologies that (most) automation tools do not yet support, such as Flutter, Blazor, and Electron.

Figure: Vision AI is compatible with any visual interface.

It’s so easy, your grandmother could do it

With this AI-driven approach, test automation is really, REALLY easy. In fact, one of our guiding principles for the project was “So easy, your grandmother could do it.” Since Vision AI sees interfaces the same way that humans do, you can define the automation just like you would explain it to another human.

Example: Leading Swiss Bank


Challenges:

  • Mobile development required rigorous testing that was limited to manual execution
  • Traditional tools struggled to automate mobile platforms due to complex application structure and missing technical resources to set up the automation
  • Management of physical devices for mobile testing posed difficult logistical challenges

Solution:

  • Robust tool to access and automate complex mobile applications and ability to learn and automate easily and quickly
  • Global access to diverse physical devices for manual and automated testing
  • Run tests across different environments and end-to-end flows

Results:

  • After initial POC, 10 test cases were successfully automated with Vision AI on physical devices located in the SeeTest Cloud
  • Enterprise single sign-on functionality verified over multiple target applications
  • Multiple teams now looking to deploy Vision AI in mobile testing and complex end-to-end flows across hard-to-access and difficult-to-automate applications
  • 5X faster delivery

The road to fully autonomous testing?

I’m often asked if and when fully autonomous testing could become a reality. That’s a topic I love to discuss. But, before delving into that, let’s take a closer look at the two words that make up that term.

Autonomous, meaning “without human intervention,” is pretty simple. Testing is more difficult because the investigative, inquisitive nature of testing does not lend itself to automation. What I am about to describe is best categorized as “autonomous checking”. With that in mind, let’s continue.

With advanced tooling like Vision AI (described earlier in this chapter) and other intelligent automation engines, the problems of automated checking have shifted from “How do I reliably automate this interface?” to higher-level problems. Humans are still overwhelmingly responsible for creating the automated checks: describing what inputs to fill in, what buttons to click, etc. This is the first horizon.

The shift to autonomy is best defined as “Describing becomes Deciding”. With tools such as smart impact analysis (e.g., Tricentis LiveCompare), this is already the case. You don’t need to describe which tests to run; you just need to decide if the tool’s recommendations suit your needs. This is great in closed systems such as SAP, Salesforce, and ServiceNow (where these offerings shine). With the help of AI, this trend will expand well beyond this—into the realm of bespoke/custom applications.
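The “Describing becomes Deciding” workflow can be sketched in a few lines. This is an illustration only and does not represent how Tricentis LiveCompare or any other product works: the test names, the component names, and the `recommend_tests` helper are all hypothetical.

```python
from typing import Dict, List, Set


def recommend_tests(
    test_components: Dict[str, Set[str]], changed: Set[str]
) -> List[str]:
    """Return the tests that touch at least one changed component, sorted by name."""
    return sorted(
        name
        for name, components in test_components.items()
        if components & changed
    )


# Hypothetical mapping from each test to the components it exercises.
test_components = {
    "create_order": {"orders", "pricing"},
    "update_customer": {"customers"},
    "monthly_invoice": {"pricing", "invoicing"},
}
changed_components = {"pricing"}

# The tool proposes; nobody had to describe which tests to run.
recommended = recommend_tests(test_components, changed_components)

# The human decides: here one recommendation is declined.
rejected = {"monthly_invoice"}
to_run = [name for name in recommended if name not in rejected]

print(recommended)
print(to_run)
```

The human effort moves from authoring the list to reviewing it, which is the shift the paragraph above describes.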

So great! The future will just be getting a printout of possible activities from the machine and giving it the green light! Well… not so fast. You see, these closed systems not only have defined processes; they also have defined outcomes (the oracle). Not so with bespoke applications. While determining the actions to take is possible generically (by examining people taking these actions), it’s not always possible to extract the “Why” component. When a user executes a transaction, their eyes flick to the top of the screen to double check that the “Amount” value is correct. This validation is not captured, and so the automated process misses the point of the check (which was to determine not only that the transaction was processed, but that it was processed correctly).
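The missing-oracle problem can be shown with a toy example. Everything here is hypothetical: the two steps stand in for a recorded user flow, and the defect in `submit` is invented so that the difference between the two sets of checks becomes visible.

```python
from typing import Callable, Dict, List

State = Dict[str, object]


def enter_amount(state: State) -> None:
    """Recorded action: the user types an amount."""
    state["entered_amount"] = 100.00


def submit(state: State) -> None:
    """Recorded action: the user submits the transaction."""
    # Invented defect: the application shows a different amount than was entered.
    state["displayed_amount"] = 1000.00
    state["processed"] = True


def run(
    steps: List[Callable[[State], None]],
    checks: List[Callable[[State], bool]],
) -> bool:
    """Replay the steps on a fresh state and report whether every check passes."""
    state: State = {}
    for step in steps:
        step(state)
    return all(check(state) for check in checks)


def was_processed(state: State) -> bool:
    """The only outcome a recording can observe: the transaction went through."""
    return state["processed"] is True


def amount_is_correct(state: State) -> bool:
    """The glance at the 'Amount' field, added by a human as an explicit check."""
    return state["displayed_amount"] == state["entered_amount"]


recorded_steps = [enter_amount, submit]

print(run(recorded_steps, [was_processed]))                     # True
print(run(recorded_steps, [was_processed, amount_is_correct]))  # False
```

The replayed actions are identical in both runs. Only the injected validation exposes the wrong amount, which is why a person still has to supply the “Why”.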

This is not a bleak outlook, however. While “Fully Autonomous” checking may still be quite a way off, the trend of “Describing becomes Deciding” will remove a ton of busywork that bogs down quality engineers today. Parsing through the generated scenarios, injecting validations, and deciding which to run is a much more pleasant job than worrying about why the Login button doesn’t have a stable ID field.

With that said, there are a few things to watch out for:

  1. Beware of test case spam
    If you embark on an autonomous testing endeavor, and your team comes back with a tool or process that “generates thousands of tests,” beware. You still need to parse through these tests, inject validations, and debug them if they “fail.” The motto of “fewer, targeted tests” has been a good guide for the past 20 years, and it remains so now.
  2. Investigate the how
    When you are told that your tests can be automatically generated, dig a bit into how this happens. AI is not magic. If something appears to be magical, it is most likely a fabrication. Your team should be able to tell you that the process examines usage patterns, parses existing (accurate) definitions, or has some other source of how it defines the test. “Shaking up the app and generating tests from it” is still firmly in the world of magical thinking.
  3. Ask about maintenance
    Having a thousand tests is like having a thousand smoke detectors. If you own an entire high-rise apartment building, that’s probably justified. If you own a house, then you will spend two hours switching them all off when you burn the toast. Tests that fail must be investigated, updated or discarded. Inquire about the nature of this method to ascertain whether autonomy will actually save you time in the long run.
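One way to make the “fewer, targeted tests” motto concrete is to select tests by the risk they cover instead of running everything that was generated. The sketch below uses a classic greedy heuristic and is not the author's or Tricentis' method. The risk weights, test names, and the `select_tests` helper are hypothetical, and a greedy pass is not guaranteed to find the smallest possible set.

```python
from typing import Dict, List, Set


def select_tests(
    coverage: Dict[str, Set[str]],
    risk_weight: Dict[str, float],
    target: float,
) -> List[str]:
    """Greedily pick tests until the covered share of total risk reaches target."""
    total = sum(risk_weight.values())
    covered: Set[str] = set()
    chosen: List[str] = []

    def covered_share() -> float:
        return sum(risk_weight.get(risk, 0.0) for risk in covered) / total

    def gain(test: str) -> float:
        # Risk weight this test would add on top of what is already covered.
        return sum(risk_weight.get(risk, 0.0) for risk in coverage[test] - covered)

    while total > 0 and covered_share() < target:
        candidates = sorted(test for test in coverage if test not in chosen)
        if not candidates:
            break
        best = max(candidates, key=gain)  # ties resolve to the first name
        if gain(best) == 0:
            break  # no remaining test adds coverage
        chosen.append(best)
        covered |= coverage[best]
    return chosen


# Hypothetical risk weights and the risks each generated test exercises.
risk_weight = {"payment": 5.0, "login": 3.0, "search": 1.0, "profile": 1.0}
coverage = {
    "t_checkout": {"login", "payment"},
    "t_login": {"login"},
    "t_search": {"search"},
    "t_profile": {"profile", "search"},
}

print(select_tests(coverage, risk_weight, target=0.9))
```

With these made-up numbers, two of the four tests already cover all of the weighted risk, so the other two would only add maintenance work.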

Despite this, the future of autonomous checking appears to be very bright. Our goal is to devise a method for generating the best—and fewest—tests necessary to achieve the desired level of assurance. We are looking forward to continuing on this journey.

About the author

Wolfgang Platz

Wolfgang is the force behind innovations such as model-based automation and the linear expansion test design methodology. The technology he developed drives Tricentis’ Continuous Testing Platform, which is recognized as the industry’s #1 solution by all top analysts. Today, he is responsible for advancing Tricentis’ vision to make enterprise continuous testing a reality across Global 2000 organizations. His most recent book is “Enterprise Continuous Testing: Transforming Testing for Agile and DevOps.”

Prior to Tricentis, Wolfgang was at Capgemini as a group head of IT development for one of the world’s largest IT insurance-development projects. There, he was responsible for architecture and implementation of life insurance policies and project management for several projects in banks.

Wolfgang holds a Master’s degree in Technical Physics as well as a Master’s degree in Business Administration from the Vienna University of Technology.

About Tricentis

Tricentis is the global leader in enterprise continuous testing, widely credited for reinventing software testing for DevOps, cloud, and enterprise applications. The Tricentis AI-powered, continuous testing platform provides a new and fundamentally different way to perform software testing: an approach that’s totally automated, fully codeless, and intelligently driven by AI. It addresses both agile development and complex enterprise apps, enabling enterprises to accelerate their digital transformation by dramatically increasing software release speed, reducing costs, and improving software quality. Tricentis has been widely recognized as the leader by all major industry analysts, including being named the leader in Gartner’s Magic Quadrant five years in a row. Tricentis has more than 1,800 customers, including the largest brands in the world, such as McKesson, Accenture, Nationwide Insurance, Allianz, Telstra, Moet-Hennessy-Louis Vuitton, and Vodafone.

Visit us at