State of AI applied to Quality Engineering 2021-22
Section 4.1: Automate & See

Chapter 2 by Applitools

Shorten release cycles with Visual AI

Business ●●●○○
Technical ●●○○○


In this chapter, we will discuss how to effectively leverage Visual AI to avoid waiting until the end of a coding cycle to validate UI changes, thereby reducing the gap between check-in and validation, minimizing engineering churn and disruption - and ultimately delivering innovation to market faster without jeopardizing your brand.

Visual AI sounds like something you might read about in a science fiction novel: machine learning and deep learning algorithms that mimic human cognitive functions to solve complex problems in seconds and with extreme accuracy. Well, it’s real, it’s here today, and leading businesses are already employing Visual AI to unlock a new generation of functional, visual, and cross-browser/device testing, removing testing bottlenecks and delivering on the promise of an uncompromised user experience.


In the grand scheme of applying artificial intelligence (AI) to quality engineering, image matching might not seem an impactful application, but it is revolutionizing how leading brands accelerate their delivery of innovation. Furthermore, visual validation might seem to offer only a way to improve the efficiency of automated end-to-end tests at the end of the development iteration, but the real value comes from validating the UI “in-sprint” during Agile development. Integrating visual validation into existing automated functional tests, alongside data validation, gives complete test coverage, capturing both expected UI changes and visual regressions at check-in. Traditional approaches only partially uncover these errors during late-cycle end-to-end testing, slowing down release velocity and increasing the number of bugs escaping into later stages (see figure below).


Figure: Automated visual testing increases release velocity and reduces defect escape rates



A scenario

Imagine this scenario: your team builds a responsive web application designed to run on any browser, with viewport sizes ranging from 4K down to mobile. As part of the process, your team develops functional tests to validate the application’s different behaviors. These begin with unit tests for individual code-level components, system-level tests for data structures and network behavior, and UI tests to validate user interaction and workflow. Next, they automate these tests for production validation and for regression testing during ongoing development.

From the initial release onward, your team has two tasks. First, as they modify or add behavior, each change or addition requires its own test. Second, they need to ensure that no unintended changes take place, so they maintain and update the regression tests (consisting of unit, system, and UI tests) for existing behavior.

The underlying code can be validated with unit tests and the functional scenarios with service-level or end-to-end UI tests, but what about the rendered UI? When do we know if the UI actually ‘looks’ right? How do we know that the UI is actually working and usable?

When to visually test the UI

Historically, manual testing has been the primary mechanism for validating the UI. What better to test the user interface than an actual user?

However, end-to-end UI testing is complicated to orchestrate, and most development teams leave UI and integration testing to the end of the development cycle. While the underlying behavior can be tested mid-cycle, the UI cannot be guaranteed to be in its correct state until all the functionality has been completed.

Furthermore, with increased application complexity (i.e. number of pages), numerous viewport sizes (including combinations of devices and browsers) and the increased release frequency that comes with Agile/DevOps, it becomes impossible for manual testing to achieve the desired coverage in the time available (see figure below).


Figure: Total number of screens makes it impossible for manual testing to achieve the desired coverage in the time available



One of the biggest challenges any Agile software development team faces comes from the delay between code check-in and defect detection. Teams can quickly repair any defect discovered at check-in, but any gap between check-in and defect discovery introduces friction for developers. Developers will likely already be on to the next task when the defect is discovered and must context-switch back to the prior task to resolve it, slowing down current development activities.

The need to wait for the completed UI introduces an obvious delay for testing and adds schedule risk from defects discovered late in the development cycle. But how do teams test a UI in a constant state of change? How do they validate the UI if all the changes have not been checked in yet?

Automating UI Validation with Visual AI

With artificial intelligence, a lot of smarts can be built into testing software to make testing applications easy. Applitools makes use of advanced machine learning algorithms to pinpoint visual defects in your products that otherwise would go unnoticed by a human eye.

-Head of Machine Learning Platform Engineering, Top 5 Multinational Investment Bank and Financial Services Company

Visual Testing tools make it possible to compare and highlight differences between two UI snapshots. To leverage Visual Testing effectively, there are three fundamental challenges that need to be addressed:

Challenge #1: Creating baselines

Baselines need to be created for each browser/mobile device and viewport combination to perform a true ‘apples-to-apples’ comparison. Given that the number of supported browser and mobile device combinations is typically large, creating unique baselines quickly becomes a significant effort when executing a Visual Testing strategy.

Challenge #2: Evolving / Updating the baselines

The applications we work on today are “living products”. They evolve based on business, product, and user requirements. This means the team will need to keep updating the baselines used for comparison at regular intervals, and if the baselines are manually created and managed, updating them regularly is not a trivial effort.

Furthermore, a minor rebranding of the application (e.g. changing the web application’s header and footer) would invalidate all the baselines, requiring either a complete review, which would be time intensive, or a blind acceptance of the new changes, risking missing real regressions. 

Challenge #3: Accuracy of comparison, as it matters to your users 

While most commercial Visual Testing tools look to address the above challenges (i.e. creating and updating the baselines), the biggest challenge, and the deal-breaker, is the accuracy of the comparison algorithm itself.

For years, visual validation was plagued with false positives, introducing too many problems into the testing workflow. Pixel difference, the legacy approach, compares screen images for pixel-level rendering differences. But rendering variations can cause pixel differences that are not true errors. A pixel rendered in 24-bit color in one screen capture can differ from the same pixel in another capture by a single bit. Does this difference constitute an error? Font smoothing is another example: it can change between browser releases (see figure below).

Ultimately, over time, engineers concluded that pixel-difference technology results in too many false-positive bugs requiring manual resolution - put simply, pixel matching just does not scale for modern applications, especially those with dynamic content.


Figure: Pixel based comparison shows false positive due to font smoothing of “Total Balance”


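To make the false-positive problem concrete, here is a minimal, self-contained sketch of naive pixel comparison. The image model, values, and function name are illustrative assumptions, not any vendor’s actual algorithm:

```python
# Toy pixel-diff comparison illustrating why tiny rendering differences
# produce false positives. Images are modeled as 2D lists of RGB tuples.

def pixel_diff(baseline, checkpoint):
    """Count pixels that differ by even a single bit in any channel."""
    diffs = 0
    for row_a, row_b in zip(baseline, checkpoint):
        for px_a, px_b in zip(row_a, row_b):
            if px_a != px_b:
                diffs += 1
    return diffs

# Two captures of the "same" screen: the second was anti-aliased slightly
# differently by a newer browser build (each channel off by at most 1).
baseline   = [[(200, 200, 200), (10, 10, 10)],
              [(10, 10, 10),    (200, 200, 200)]]
checkpoint = [[(201, 200, 200), (10, 11, 10)],
              [(10, 10, 9),     (200, 200, 200)]]

print(pixel_diff(baseline, checkpoint))  # 3 "differences" no human would notice
```

Every one of those flagged pixels is invisible to a user, yet a pixel-based tool would fail the test and demand manual triage.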

To solve this problem, Applitools invented Visual AI. By replicating how the human eye and brain work, Visual AI only highlights the differences a human would notice.


Figure: Trained on +1B images, Applitools’ Visual AI is 99.9999% accurate and not fooled by browser updates



Visual AI can analyze a screen, a UI component or even an entire web page (i.e. stitching together images captured by scrolling the UI). It then abstracts the UI snapshot into identifiable regions to be analyzed, comparing the size and relative dimensions, colors, and content of objects on a page. Trained on over 1 billion images, Visual AI delivers 99.9999% accuracy and makes automated visual validation possible, enabling teams to get complete coverage for the entire UI with a single snapshot, finding defects that could not be found any other way (see figure below). 


Figure: Visual AI finds functional bugs as well as bugs that no other technology can



Visual AI Comparison Algorithms

I discovered there really are no visual processing settings, percentages or configurations that need to be set up to create tests with Applitools. The algorithm is entirely adaptive, and I can only imagine where they’ll take the technology as AI and machine learning advances even further.

-Joe Colantonio, Test Guild and Guild Conferences

When looking at the UI, not every page can be treated the same. We need to be able to mix and match different algorithms, or comparison modes, to identify the correct type of visual difference between the checkpoint and the baseline.

To make them engaging and meaningful to their users, most applications are dynamic in nature: there is dynamic content, along with options for different layouts and user experiences (e.g. dark mode). We need ways to visually check “static” content, “dynamic” content and, more importantly, combinations of both on the same pages and mobile screens.

Some pages should not change from build to build. Using a “Strict” comparison mode, Visual AI captures all the elements on the page or screen and compares the checkpoint to the baseline. All differences that a human could detect get flagged (see figure below). 


Figure: “Strict” comparison mode highlights the visual differences that a human could detect.




If the page, or regions on a page, feature changing or dynamic content such as ‘featured products’ in retail, ‘top stories’ on news sites or ‘stock prices’ in a financial application (see image below), the “Layout” mode validates page structure to identify regressions even with dynamic content. 


Figure: “Layout” comparison mode ignores dynamic content and only highlights structural regressions




Color schemes may also change (e.g. dark mode or branding preferences), in which case the “Content” mode can be used to validate text and images on the page while ignoring color shifts.  
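The three comparison modes described above can be sketched with a toy page model. The element structure and comparator below are illustrative assumptions, not Applitools’ implementation:

```python
# Toy comparators sketching "Strict", "Layout" and "Content" modes.
# A "page" is a list of elements with a bounding box, text, and color.

def diff(baseline, checkpoint, mode):
    """Return descriptions of the differences a given mode would flag."""
    flagged = []
    for base, check in zip(baseline, checkpoint):
        if mode in ("strict", "layout") and base["box"] != check["box"]:
            flagged.append(f"{base['id']}: moved or resized")
        if mode in ("strict", "content") and base["text"] != check["text"]:
            flagged.append(f"{base['id']}: text changed")
        if mode == "strict" and base["color"] != check["color"]:
            flagged.append(f"{base['id']}: color changed")
    return flagged

baseline = [
    {"id": "headline", "box": (0, 0, 800, 60), "text": "Top story", "color": "black"},
    {"id": "ticker",   "box": (0, 60, 800, 20), "text": "ACME 101.2", "color": "black"},
]
checkpoint = [  # dark mode plus a fresh stock price, but the same structure
    {"id": "headline", "box": (0, 0, 800, 60), "text": "Top story", "color": "white"},
    {"id": "ticker",   "box": (0, 60, 800, 20), "text": "ACME 99.8", "color": "white"},
]

print(diff(baseline, checkpoint, "strict"))   # flags text and color changes
print(diff(baseline, checkpoint, "layout"))   # [] - structure unchanged
print(diff(baseline, checkpoint, "content"))  # flags only the ticker text
```

The point of mixing modes is visible here: a strict check would fail the dark-mode build three times, while layout mode correctly treats the dynamic ticker and color scheme as acceptable.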

Reducing Coding Effort

We save 4 working days per month with Applitools. In my book, this means another 2 months of available man-hours.

-Nir Pinchas, Senior Automation Engineer, WalkMe


Until now, our quality team struggled automating tests for pages with A/B tests - we'd encounter false positives and by the time we wrote complex conditional test logic, the A/B test would be over. Applitools implementation of A/B testing is incredibly easy to set up and accurate. It has allowed our quality team to align and rally behind the business needs and guarantee the best experience for our end users.

-Priyanka Halder, Sr. Manager, Quality Engineering, GoodRx


When we added Applitools Visual AI + Ultrafast Grid to our test generation framework, we were able to decrease test authoring time to under 5 minutes per test, while increasing test coverage, reducing build time, and achieving a 99% pass rate.

-Greg Sypolt, VP Quality Assurance, EVERFI


A core benefit of using Visual Testing is coding efficiency. 

All tests involve setting up test conditions, performing the test, and then measuring results. For behavior validated through the UI, the typical test interrogates the document object model (DOM) of the web page: using element locators, values are extracted and used for assertions. Each assertion involves its own unique identifier, and each must be coded, validated and, due to the brittle nature of element locators, maintained. Because of the number of elements that might exist on a given page, and the effort required to create and maintain these assertions, developers and testers are typically very selective about what to validate, focusing on the smallest number of assertions that confirm the functional behavior.

Using Visual AI, the developer or tester can perform a complete visual regression with a single snapshot - a ‘single line of code’ - without relying on unstable or brittle element locators. This reduces the effort to create and maintain tests while achieving complete coverage, removing the need to spot-check UI elements or perform manual reviews (see figure below).


Figure: Key Benefits of Visual AI for Test Automation


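The contrast between per-element assertions and a single snapshot check can be sketched as follows. The page model and helper names are hypothetical stand-ins for a real DOM and a real visual checkpoint:

```python
# Sketch contrasting per-element DOM assertions with one whole-page check.

# A page as seen through element locators: one entry per locator.
page = {"title": "Checkout", "total": "$42.00", "button": "Pay now"}

# Traditional approach: one locator plus one assertion per element.
# Each line must be written, validated, and maintained when locators change,
# and anything without an assertion is simply not covered.
assert page["title"] == "Checkout"
assert page["total"] == "$42.00"
assert page["button"] == "Pay now"

# Visual approach: a single snapshot-style check covering the whole page.
def check_window(current, baseline):
    """Compare the entire rendered state against the stored baseline."""
    return current == baseline

baseline = {"title": "Checkout", "total": "$42.00", "button": "Pay now"}
print(check_window(page, baseline))  # True - full coverage from one call
```

In the real SDKs the snapshot call plays the role of `check_window` here: one statement that covers every element on the page, including the ones no individual assertion was ever written for.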

End-To-End Validation with Visual AI

At first our developers didn’t believe in visual automation. Now they’ve been singing Applitools’ praises because it catches critical bugs.

- Greg Sypolt, Director, Quality Engineering - Gannett

Most users start out applying Applitools’ Visual AI to their end-to-end tests and quickly discover several things about Applitools. First, it is highly accurate, meaning it finds real differences - not pixel differences. Second, the compare modes give the flexibility needed to handle expected differences no matter what kind of page is being tested. And third, the application of AI goes beyond visual verification and includes capabilities such as auto-maintenance and root cause analysis.

Take our example from earlier, where a rebranding changes the web application’s header and footer. This one change will affect every page and therefore requires a review of every snapshot, making it easy for a reviewer to miss any unintended regressions introduced alongside the rebranding. With Applitools, Visual AI not only identifies the visual regressions but also groups and categorizes them to streamline review efforts and baseline management (see figure below).


Figure: Auto-maintenance categorizes 76 visual regressions and organizes into two groups (40 and 36)




Additionally, because Visual AI provides highly accurate information about discovered differences, Applitools can also reduce remediation time by showing developers the root-cause code in the application where the behavior originates (see figure below).


Figure: Applitools Root Cause Analysis correlates visual regressions with changes in the underlying HTML and CSS.




Applying Visual AI to end-to-end tests ensures the completed application is free of visual defects but bugs captured late still drive context switches for developers. Fortunately, Applitools’ Visual AI can also be applied at code check-in.

Validation at Check-in with Visual AI

The integration of Applitools with GitHub merges two essential products that enable our team to continuously deliver. This collaboration unlocked new efficiencies without any changes to our workflow. Having used this for a while now, I cannot imagine our team being without UI version control and auto-baseline updating on merge.

-Priyanka Halder, Senior Manager Quality Engineering, GoodRx


What is the real difference between validating end-to-end at the end of a development cycle and validating functional code at check-in during an active development cycle? The end-to-end code has a completed UI that can be tested by an operator or an automated test system. No one expects completed UI code at check-in, and everyone expects no regression errors in unchanged code.

Applitools helps teams validate proper behavior at check-in, even when the UI does not appear complete to an end user, by applying different compare modes to specific regions versus the rest of the page.

Imagine, for a moment, that the page in question shows upcoming opportunities to observe the International Space Station in the darkened sky. The developers have just added a feature that allows the user to filter for only evening or morning results. That feature, along with the filtered results, will require a new visual validation. A second team continues to develop a map image showing where the ISS will appear. That feature remains incomplete, and the team has coded a placeholder on the page. The rest of the page - including the layout, menu, footer, and color scheme - should not change from the last baseline.

So, at check-in, tests need to validate the behavior of the new feature, the behavior of the region with the placeholder, and the rest of the page as separate regions. By choosing different compare modes for each (i.e. “Layout” for the placeholder region and “Strict” for the rest of the page), the visual behavior of the new functionality can be fully validated. Once the tests are complete, and no changes have been noted in the rest of the page, the check-in can complete and the baseline is automatically updated on merge.
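A minimal sketch of this region-by-region check-in validation, using the ISS page from the scenario above. The region names, page model, and comparator are illustrative assumptions, not the Applitools API:

```python
# Sketch of per-region match settings at check-in: "layout" for the
# incomplete map placeholder, "strict" for everything else.

def check_regions(baseline, checkpoint, modes):
    """Flag a region only when its assigned compare mode detects a change."""
    failures = []
    for region, mode in modes.items():
        base, check = baseline[region], checkpoint[region]
        if mode == "strict" and base != check:
            failures.append(region)
        elif mode == "layout" and base["box"] != check["box"]:
            failures.append(region)
    return failures

baseline = {
    "filter_bar":      {"box": (0, 0, 800, 40),   "content": "All sightings"},
    "map_placeholder": {"box": (0, 40, 800, 400), "content": "Coming soon"},
    "footer":          {"box": (0, 440, 800, 60), "content": "ISS Watch"},
}
checkpoint = {
    "filter_bar":      {"box": (0, 0, 800, 40),   "content": "Evening only"},   # new feature
    "map_placeholder": {"box": (0, 40, 800, 400), "content": "Rendering map"},  # in progress
    "footer":          {"box": (0, 440, 800, 60), "content": "ISS Watch"},
}

modes = {"filter_bar": "strict", "map_placeholder": "layout", "footer": "strict"}
print(check_regions(baseline, checkpoint, modes))  # ['filter_bar']
```

Only the intentionally changed filter region surfaces for review and baseline update; the in-progress placeholder and the unchanged footer pass, so the check-in can proceed despite the incomplete UI.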

By using Applitools at check-in, developers get immediate feedback on the visual behavior of their code - even when the rest of the UI is incomplete. Developers can then quickly repair any defects at check-in time, eliminating defects discovered late in the software delivery cycle and reducing engineering churn and disruption.

However, while we can now get immediate feedback at check-in, validating one browser at one screen size is insufficient for today’s responsive web applications. Hence the need for cross-browser testing - but traditional approaches are just too slow to perform complete validation at check-in. Fortunately, due to its accuracy and speed, Visual AI opens the door to an alternative approach.

Revolutionizing Cross Browser Testing with Visual AI

Cross platform testing is hard, it's no wonder why so many companies skip this, the efforts to implement a comprehensive strategy using traditional approaches are astronomical. What took me days of work with other approaches only took minutes with Applitools Ultrafast Grid! Not only was it easier, it's smarter, faster, and provides more coverage than any other solution out there. 

-Oluseun Olugbenga Orebajo, Lead Test Practitioner, Fujitsu


Bugs that I missed (and there were lots!) on different browsers and viewports were easily caught without affecting the time it takes to run my tests. 

-Marie Drake, Principal Test Automation Engineer, News UK


While traditional cloud testing platforms are subject to false positives and slow execution, Applitools’ unique ability to run Visual AI in parallel containers can give your team the unfair advantage of stability, speed, and improved coverage. 

-Igor Draskovic - VP Developer Specialist, BNY Mellon

Because today’s applications run across multiple browsers, devices, and viewport sizes, engineers must consider the need to validate behavior across different test platforms. However, traditional cloud execution grids, or device clouds, are slow, expensive and often unreliable. 

But, what would happen if cross browser tests had no setup cost?


Figure: Applitools Ultrafast Grid - 18.2x Faster to Complete a Full Test Cycle




The Applitools Ultrafast Grid (part of the Applitools Ultrafast Test Cloud - see figure below) is a revolutionary approach to cross-browser and cross-device testing. During functional execution, a snapshot of the rendered UI is captured as a model of the page (i.e. HTML + CSS) and uploaded to the Ultrafast Grid, where it is rendered across browsers, devices and viewports without re-executing the functional scenario. This approach reduces the complexity of test data and test environment management, simplifies security requirements and accelerates visual validation across all screens far beyond traditional execution grids - with organizations seeing complete test cycles run 18.2x faster (see figure above).
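The capture-once, render-many idea behind this approach can be sketched as follows. All function names and the page model are illustrative; the real grid renders the captured HTML + CSS server-side:

```python
# Sketch of capture-once, render-many: the functional test executes once,
# and its captured page model is rendered per browser/viewport combination
# instead of re-running the whole scenario for each one.

def run_functional_test():
    """Runs once; returns a snapshot of the rendered page model."""
    return {"html": "<h1>Orders</h1>", "css": "h1 { color: navy }"}

def render(snapshot, browser, viewport):
    """Stand-in for server-side rendering of the captured model."""
    return f"{browser}@{viewport[0]}x{viewport[1]}: {snapshot['html']}"

browsers = ["chrome", "firefox", "safari", "edge"]
viewports = [(375, 667), (768, 1024), (1920, 1080)]

snapshot = run_functional_test()  # one execution of the functional scenario
renders = [render(snapshot, b, v) for b in browsers for v in viewports]

print(len(renders))  # 12 screens validated from a single test run
```

The test executes once, yet every browser and viewport combination gets its own rendering to validate; that decoupling is what makes cross-browser coverage fast enough to run at check-in.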

Additionally, because the Ultrafast Grid runs so quickly, engineers can leverage it to validate at check-in across all platforms in seconds, versus minutes or hours with traditional cloud execution grids.

The Ultrafast Grid takes the cost, complexity, time, and effort out of cross browser and cross device testing - resulting in greater coverage, higher quality, and faster time to market.


Figure: The Applitools Ultrafast Test Cloud



Visual AI Delivers Quality Code Faster

Our CI environment executes tens of thousands of Visual AI powered tests against the grid each month. Since implementing it, we've been able to remove frail functional tests from our ecosystem and achieve a 99.8% pass percentage.  We are faster, more stable, and ship with confidence with Applitools’ Visual AI running on the Ultrafast Grid.

-Mile Millgate, Technical Quality Architect, Gannett

As with all AI technologies, Visual AI helps humans become faster and more efficient. Trained on +1B images, and with 99.9999% accuracy, Applitools’ Visual AI is transforming how the world’s top brands (9 of the top 10 software companies, 7 of the top 10 banks in North America, 2 of the top 3 retailers in North America, etc.) accelerate the delivery of innovation to their clients, while protecting their brand and ensuring digital initiatives have the desired business outcomes.

Visual AI has progressed significantly over the last few years and is advancing the industry towards true Autonomous Testing, the next most important innovation for Quality Engineering. Today we are focused on having AI remove repetitive and mundane tasks, freeing the human to focus on the creative/complex tasks that require human intelligence. As we move towards Autonomous Testing, the role of developers and testers will change significantly - training the AI how to use the application, leaving it to perform the testing activities and then reviewing the results. This change will deliver a fundamental increase in team efficiency, reducing the overall cost of quality and enabling businesses to establish scalable Quality Engineering practices.

The future is exciting - and it’s probably closer than you think.

About the author

Mark Lambert


Mark Lambert is the Vice President of Product Marketing at Applitools. Mark is passionate about improving software quality through innovation and over the last 16 years has been invited to speak at industry events and in media such as SDTimes, DZone, QAFinancial, JavaOne, AgileDevDays, Software Test and Performance, TestGuild and StarEast/StarWest. Mark holds Bachelor’s and Master’s degrees in Computer Science from Manchester University, UK.

About Applitools

Applitools delivers a next-generation test automation platform for cross-browser and cross-device testing, powered by AI-assisted computer vision technology known as Visual AI. Visual AI helps developers, test automation engineers and QA professionals release high-quality web and mobile apps, enabling CI/CD.

Hundreds of companies across industries such as Technology, Banking, Insurance, Retail, Pharmaceuticals, and Media - including 50 of the Fortune 100 - use Applitools to deliver the best possible digital experiences to millions of customers across all screens.

Visit us at