State of AI applied to Quality Engineering 2021-22
Section 5: Manage data

Chapter 3 by Curiosity Software

Coverage requires a command of test data

Business ●●●●○
Technical ●○○○○


Is AI ready for test data? Is test data ready for AI?

When asked to rate their future options for AI and testing, the majority of respondents to the latest World Quality Report placed the generation of test environments and data highly in their plans. Meanwhile, 86% of respondents to the 2020-21 World Quality Report stated that AI is a key criterion when selecting new technologies.[1]

The promise of AI is accordingly clear to software organisations, many of which have identified Test Data Management (TDM) as an area ripe for optimisation and acceleration. Yet, this interest in the applications of AI raises several questions for TDM, which this chapter seeks to address. First, are the common technologies and techniques used in test data management in sufficient shape to fulfil the promise of introducing AI? Second, are AI- or ML-based technologies today the best solution for fulfilling this promise and, if so, with which broader approaches should they be combined?

The reality is that, for most organisations, test data “best practices” are lagging far behind where they should be. In fact, TDM at many organisations has remained static to the extent that it is not only lagging behind AI; it is also lagging behind developments in iterative delivery, test automation, and Enterprise DevOps. This disparity creates challenges for both testing speed and quality, challenges which have been compounded by the emergence of tighter data privacy legislation.


[1] Capgemini, Micro Focus, Sogeti (2020), World Quality Report 2020-21, 23. Retrieved 22/03/2021.

Chapter Scope: Looking beyond the hype machine

The promise of AI might then be immense, but no organisation today has the luxury of adopting AI for AI’s sake. Within TDM, the horizon of improvement must be far wider, and broader changes will be needed to reap the benefits of AI. These broader changes will accordingly be discussed in this chapter, insofar as they will typically be required to achieve the full promise of applying AI to TDM.

Rudimentary solutions do exist today for applying AI/ML to test data. Technologies can, for instance, suggest data based on fields in a UI, or suggest data generation rules based on training data sets. However, by themselves, these technologies are often far-removed from the machinery associated with enterprise TDM, as well as from the challenges that a test data solution today must solve.

Any AI-based technology must work alongside engrained TDM technologies and processes, while also supporting the speed, compliance and coverage requirements associated with testing complex systems. Typically, however, the data “managed” by test data solutions far exceeds the complexity of examples given for the application of AI to test data.

There is a further challenge for organisations seeking to apply “AI” to test data generation and provisioning. It lies in distinguishing truly AI-based technologies. The testing market has been flooded by mixed messages, with a range of technology today branded “intelligent”. Some of these technologies include specific techniques like neural networks and machine learning. Others are labelled AI because they simulate the results of human intelligence, regardless of the technology used. This might include learning and self-correction, enabling greater autonomy. Yet, other technologies today described in the language of AI have simply been relabelled.

This chapter will avoid relabelling conventional automation or techniques as “AI”. However, it will not limit itself to a narrow subset of technology when proposing a solution to the current test data predicament. Instead, it will focus on an approach for making test data provisioning more reactive, self-sufficient and autonomous, including behaviours that mimic human learning. Taken as a whole, this solution delivers the desirable results of AI, but does so by combining technologies today used by organisations with newly available techniques.

Illustrations in this chapter are supplied by Curiosity Software.

Test data must adapt at a rate of knots

The following discussion will accordingly focus on the concrete question of how organisations can begin their journey towards AI in TDM, including areas where AI's promise can be achieved today.

The current TDM predicament must first be understood, as it defines the promise of AI at organisations today. The first section will therefore set out reasons why test data “provisioning” must be replaced by a reactive, adaptive and autonomous approach. It will discuss factors that have rendered a “Request/Receive” model obsolete, arguing that a dependency on production data and an overworked provisioning team is antithetical to the promise of AI in testing.

An automated and self-sufficient solution will then be discussed, in which tests self-provision data even as the tests and data requests change. Test data “provisioning” is not only then automated but becomes capable of handling an array of evolving requests over time. This approach thereby mimics human learning and the capacity to process a changing environment.

This “intelligent” solution will be set out in an order in which an organisation today could implement it:

  1. First, a set of test data utilities will be discussed for moving beyond a “subset-mask-and-copy” approach. These will be chosen to unlock greater test coverage and parallelisation.
  2. Next, the expanded set of utilities will be parameterised and made reusable on demand. This will focus on data provisioning that responds to a range of different requests, breaking the dependency on an overworked provisioning team.
  3. The automated data creation and provisioning will then be exposed to the full range of data requesters, including testers and automated technologies. This enables automated tests and CI/CD pipelines to self-provision data on demand, triggering the reusable test data utilities.
  4. Feedback loops into production will then be added to update the data provisioning routines as systems change, providing truly up-to-date test data in short sprints.
  5. This “lights out” approach will then be expanded to update test data provisioning as system requirements, business priorities and user behaviours change. This will harness data from a “Baseline”, enabling test data to self-adapt and evolve. Combined with automated test generation, this self-sufficient approach aims to expose the impact of changes as they occur, generating and executing the tests required to de-risk those changes.

The “smart” thing to do with TDM is modernise.

Before discussing this solution, the nature of the test data challenge today must be understood. This is needed to identify where and how the promise of “AI” can be most fully realised in TDM.

The interest in applying “AI” to test data challenges is understandable, given that the average test team spends 44% of their time waiting for, searching for, or making test data[2]. However, any “AI”-based solution must reckon with an evolving test data requirement that is only becoming more complex. In fact, a range of related trends mean that data “provisioning” will need to scale over time, fulfilling more requests, faster, for data that is of greater volume, variety, and complexity:

  1. Changing data requesters. The number and nature of data requesters has multiplied, increasing the volume of requests and the need for parallel provisioning. A TDM solution cannot only provision data to humans capable of adjusting data that does not match their tests; data must also be available to automated tests and CI/CD, and data provisioning must also therefore be automated and flexible, responding to rapid automated requests.
  2. Automation in testing. Test automation and CI/CD have also vastly increased the volume and pace of requests, while isolated data is now needed for parallelised tests. Whereas testers could execute a given number of tests at a fairly consistent pace, data-hungry frameworks tear through data overnight and on weekends, dramatically increasing the demand for data.
  3. Enterprise DevOps. Developers today impact code faster than ever, making changes that are increasingly complex. Containerisation, source control and code repositories allow developers to rip-and-replace reusable components rapidly. Systems have in turn become fast-shifting webs of intricately related technologies. Consistent test data “journeys” must reflect these interrelated components at the speed with which developers chop-and-change containerised code, often providing data that reflects recently introduced technologies.
  4. Iterative Delivery. Moves to agile or hybrid delivery methods have increased release speed, exacerbating test data bottlenecks. Data refreshes must be made faster than ever, testing new release candidates rigorously in days or weeks.
  5. New Technologies. Developers today can leverage a wide range of technologies when building systems, slotting new and open source technologies seamlessly into existing stacks. Many new technologies allow systems to process more data than ever, faster, and with a greater range of functionality. A test data solution must be capable of working with new data sources and targets, including data streaming and search technologies like Apache Solr and Kafka, as well as big data platforms and databases like Hadoop and MariaDB.
  6. New Compliance Requirements. Data privacy legislation is further impacting TDM and inviting greater business scrutiny regarding the use of data in test environments. Regulation and consumer concerns regarding personal data carry greater financial and reputational risk. For many organisations, this has effectively ruled out the use of raw production data in testing. New legislation often also presents logistical nightmares for TDM. For instance, an organisation might need to know where an individual’s data is, how it’s being used, by whom, and for how long. They might then need to find, copy, and delete that data “without delay”.
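The last of these logistical requirements can be made concrete. The sketch below, which assumes a deliberately simplified relational schema in which every table holding personal data has a `customer_id` column (the table names are invented for the example), shows the kind of locate-and-erase routine such legislation demands:

```python
import sqlite3

# Hypothetical schema: each of these tables may hold personal data,
# keyed by a "customer_id" column.
TABLES_WITH_PII = ["orders", "support_tickets", "marketing_consent"]

def locate_subject_data(conn, customer_id):
    """Return every row referencing the data subject, keyed by table."""
    found = {}
    for table in TABLES_WITH_PII:
        rows = conn.execute(
            f"SELECT * FROM {table} WHERE customer_id = ?", (customer_id,)
        ).fetchall()
        if rows:
            found[table] = rows
    return found

def erase_subject_data(conn, customer_id):
    """Delete the subject's rows 'without delay', table by table."""
    for table in TABLES_WITH_PII:
        conn.execute(f"DELETE FROM {table} WHERE customer_id = ?", (customer_id,))
    conn.commit()
```

In practice, the hard part is discovering where personal data lives across dozens of interrelated systems; this toy routine assumes that discovery has already been done.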

[2] Capgemini, Sogeti (2020), Continuous Testing Report 2020, 21. Retrieved 22/03/2021.

Testing speed: A Request/Receive model is antithetical to the promise of “AI”

TDM today must accordingly make data of greater volume, variety and complexity available to more requesters, faster than ever before. This is not a challenge that’s going anywhere. In fact, each of these six trends is ongoing, and the complexity, volume and variety of data requests will only increase. Yet, test data “best practices” are already lagging behind.

In fact, the fundamental model of test data provisioning is now outdated and is itself antithetical to the promise of “AI” in testing. Far from instilling self-sufficiency and automation in testing, it creates a dependency on an overworked team. The last World Quality Report, for instance, found that 55% of organisations still rely on a central “test data support team” to provide test data[3], and the last Continuous Testing Report similarly found that 52% of testing teams depend on database administrators to receive data[4].

This “Request/Receive” model contrasts with the promise of AI in testing, with its emphasis on self-sufficiency, automation and autonomy. Ostensibly parallelised testers and automation frameworks must instead rely on a central team to fulfil rapid and complex data requests.

This has a stark impact on testing speed, as the central team is always playing catch-up. They are usually equipped with disparate and manual processes, while a lack of reusability forces them to repeat past work when fulfilling each request. Test data provisioning in turn cannot react to the volume and variety of data requests made today, creating substantial bottlenecks across testing:

  1. Waiting times. Relying on an overworked team leads to substantial waiting times for data, while a lack of reusability means that wait times grow as more requests pile up. The central team must run a series of rigid and linear processes for every request, all while trying to retain data consistency. Data provisioning can in turn take longer than an iteration, and only 36% of organisations today deliver data on demand[5]. This situation will be made worse if AI delivers on its promise of rapid and automated test generation.
  2. Lack of parallelisation. With long wait times, there is never enough data to work in parallel. This challenge has been significantly exacerbated by automated testing, in which parallelised tests demand their own data combinations. Meanwhile, parallel test teams compete in test environments for a limited set of out-of-date copies of data. They edit, use up or delete one another’s data, causing frustrating rework and additional delays.
  3. Hunting for data. Testers and automated tests rarely require large copies of repetitive production data. They require targeted combinations for each test. Provisioning unwieldy copies of production data in turn wastes time as testers look for the data they need. This might be accelerated by database queries or scripts, but this only pushes the problem back as the queries must be updated as tests change.
  4. Creating complex data. If testers cannot find data to fulfil their tests, they are forced to make it by hand. Given the complex interrelations within and across components, this is vastly time-consuming and error-prone, contributing to delays and test failures.

Relying on an overworked, centralised team is neither viable nor fair. To fulfil the frequency and variety of data requests made today, a truly on-demand and self-service model is needed. Automated tests and CI/CD pipelines must be capable of triggering test data requests, and test data routines must be capable of fulfilling new requests. This is the real appeal of AI for test data.


[3] World Quality Report 2021-22, 32.
[4] Continuous Testing Report 2020, 25.
[5] Continuous Testing Report 2020, 14.

Testing quality: The need for new tools and techniques

In addition to speed, the Request/Receive model undermines testing quality. This is usually due to a reliance on production data, as well as the limited tools available to data provisioning teams.

Broadly speaking, the tools and techniques used to provision data have remained static throughout the developments in test automation, DevOps, agile, and more. Often, test teams bypass data provisioning by maintaining test data in unwieldy Excel spreadsheets. Commercial test data tooling is now common but remains focused on the logistics of anonymising production data and copying it to test environments[6]. This practice is in place at 87% of organisations[7]. Though 58% do synthesize data, 79% create data manually for each test run[8]. Only 17% use automation for generating data[9].

Testing today accordingly remains dependent on production data. Where data generation exists, it’s often manual and unsystematic. These approaches undermine both the integrity and variety of data:

  1. Data coverage: Rigorous testing requires data combinations not found in production. This includes data for testing unreleased functionality, which is missing in historical data. Repetitive production data further lacks the majority of negative scenarios, edge cases and unexpected results, as production users largely behave as expected.
  2. Data consistency: Disparate and unsophisticated test data routines are furthermore no match for complex system dependencies, and DBAs struggle to create data that links consistently across rapidly shifting, interrelated system components. Test data provisioning in turn breaks data, leading to time-consuming test failures. Crude data subsetting, for instance, might take the first 1000 rows from each table in a database, without considering whether other tables or components depend on the excluded data. Data masking must likewise reflect vastly complex trends across data, anonymising interdependent data consistently. This might include complex temporal trends and numerical relationships.

To realise the value of AI in testing, TDM does not only need to catch up with “AI”. It must also move beyond the “Request/Receive” model and a reliance on production data. TDM must match the speed and complexity of modern development, as well as current compliance requirements. It must catch up with a number of related trends from across testing and development:

Figure: TDM “best practice” must catch up with broader developments in software delivery.




[6] World Quality Report 2020-21, 10.
[7] World Quality Report 2020-21, 36.
[8] World Quality Report 2020-21, 35.
[9] World Quality Report, 29.

Autonomous and adaptive data allocation unlocks test coverage

Today, a constantly growing number of data requests are made by both human and automated requesters, who request data of ever-greater complexity and variety. To match this demand, test data “provisioning” must become automated and capable of handling diverse requests. As the tests and associated requests change over time, this automated provisioning must itself adapt. This is the promise of “AI” for TDM.

Test data utilities that can provision truly “gold copy” data

The first step for any organisation in fulfilling this promise is to ensure that they have test data utilities capable of creating “gold copy” data for testing. As indicated, “gold copy” test data does not simply mean production or even production-like data. It means rich and compliant data for every test scenario, available on demand and in parallel to both manual testers and automated tests.

The utilities most commonly deployed today, masking and subsetting, are valuable components of this expanded toolset. However, they must be capable of handling the range of data types found in an organisation and must be capable of producing data that reflects intricate relationships within and across systems. Commercial tools of varying maturity have evolved over the last 30 years. Many can mask and subset data while retaining referential integrity, and can work with a range of different data sources and targets. These provide an alternative to relying on in-house processes and scripts.

However, these utilities must be supplemented to move beyond a reliance on production. Several test data technologies can help deliver rich and compliant data to parallel data requesters:

  1. Test data coverage analysis: Data coverage analysis identifies gaps in existing test data, helping to ensure that test data can fulfil every test scenario needed before the next release. Analysis and data visualisation can be performed, for instance, to identify the values and combinations that exist within a data set, and the missing combinations of those values. Test-driven analysis, meanwhile, identifies missing data based on test definitions. This can be performed by linking test cases to data, performing data lookups based on the test steps.
  2. Synthetic test data generation: Data generation can then fill the gaps in existing data, but must be capable of reflecting the relationships within and across system components. Data generation must typically therefore be capable of going direct to databases, via the front-end, via APIs, or via files. The generated data must also reflect business rules and the flow by which data passes through complex systems. Generation should accordingly be capable of generating data sequentially to mirror events. This might require generating data in one place, and then using that generated data in a subsequent generation function.
  3. Data cloning: Data cloning, as distinguished from database cloning, supports parallelised test execution. It produces isolated data sets in one or multiple environments, reading data combinations from a source and copying them to a target. Cloning creates multiple sets of data combinations with the same characteristics, but with unique identifiers. This allows parallel testers and tests to work side-by-side, without using up or editing one another’s data.
  4. Database cloning and virtualisation: Database cloning copies complete or subsetted databases to test environments. Performed automatically with a tool, it supports parallelised testing by refreshing data in multiple environments side-by-side. Combining cloning with database virtualisation avoids the prohibitive infrastructure costs associated with maintaining several large database copies for parallel testing and development.
  5. Test data containerisation: Database cloning and virtualisation can today be combined with containerisation, automatically producing test data sets into containers. This is a crucial step in bringing TDM up-to-date with Enterprise DevOps, as it allows testers to rip-and-replace containerised databases as developers rip-and-replace containerised code. Testers can in turn spin-up flexible environments at the pace with which complex systems change.
  6. Test data allocation: Test data allocation provisions data to test environments, but differs from “provisioning” in that it is driven by test definitions. Rather than dumping large copies of data in test environments, it looks for data based on the test case requirements, performing automated data finds. This will be set out more fully below.
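To make the first of these utilities concrete, here is a minimal, illustrative sketch of combination-based data coverage analysis: it compares the value combinations present in a data set against the full cartesian product of observed values, surfacing the combinations that no existing data exercises (the column names and rows are fabricated for the example):

```python
from itertools import product

def coverage_gaps(rows, dimensions):
    """Return value combinations required for full combinatorial
    coverage of the given columns but absent from the data set."""
    # Values observed per dimension (column).
    observed = {d: sorted({row[d] for row in rows}) for d in dimensions}
    # Every combination that full coverage would require.
    required = set(product(*(observed[d] for d in dimensions)))
    present = {tuple(row[d] for d in dimensions) for row in rows}
    return required - present

# Fabricated account data: one status/type pairing is never exercised.
rows = [
    {"status": "active",  "type": "savings"},
    {"status": "active",  "type": "current"},
    {"status": "dormant", "type": "savings"},
]
coverage_gaps(rows, ["status", "type"])   # {("dormant", "current")}
```

Real tooling works over far larger dimensions and applies combinatorial reduction (e.g. pairwise coverage) rather than the full cartesian product, but the principle of surfacing missing combinations is the same.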

Parameterizable utilities can automatically handle different requests

Figure: A reusable workflow resolves decisions to create data of different types.

 


Assuming an organisation has all the test data utilities they need to create “gold copy” data, they must still move beyond the “Request/Receive” model of data provisioning. The utilities must be made re-usable on demand, avoiding time lost waiting for an overworked team. The utilities must further be combinable and capable of responding to different requests.

This requires a method for defining test data utilities such that they are parameterizable on demand, and in which combined utilities can seamlessly pass parameters to one another.

Defining test data utilities in visual workflows offers one method for achieving this re-usability and flexibility. A workflow does not require maintenance or rework whenever it is run, as the logic gates in the flow handle variation. Instead of relying on an overworked team to handle subtly different requests, that team can then focus on defining flexible workflows for handling differences in data requests. The flow in the figure, for example, nests several processes within subflows to handle decisions when generating data.

However, finding, making and preparing consistent data journeys for complex systems is a multi-step process that must resolve several processes in intricate orders. Rather than relying on an overworked Ops team to run linear and disparate processes, the individual test data utilities must be combinable in a way that respects the referential integrity of data.

This can again be handled by model-based approaches, as shown in the figure below. If individual processes are automated and capable of handling different parameters, one process can pick up the parameters from another. They can thereby be ordered to pass parameters from the end of one process to the start of another. Meanwhile, overlaying rules and constraints ensures that the utilities produce complex data accurately.

Figure: Test data processes are executed by blue automation waypoints, while embedded subprocesses contain reusable subflows. This intuitive approach combines test data processes, passing parameters seamlessly from one to another.


 

Combining utilities in this way removes the need for a human to analyse the results of one process, identifying the next needed to fulfil a data request. Instead, automated test data utilities can handle variation and move seamlessly from one process to another. This minimises human intervention, while the ability to respond to differing requests reflects “intelligent” human decision-making.
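The principle of combinable, parameter-passing utilities can be sketched in a few lines. Assuming each utility is modelled as a function that accepts and returns a parameter dictionary (the subset, mask and provision steps here are simplified stand-ins, not a real tool's API), a workflow simply resolves the steps in order, feeding each one the parameters produced by the last:

```python
def subset(params):
    """Select only the source rows matching the requested country."""
    params["rows"] = [r for r in params["source"]
                      if r["country"] == params["country"]]
    return params

def mask(params):
    """Anonymise names with deterministic pseudonyms."""
    params["rows"] = [dict(row, name=f"USER_{i:04d}")
                      for i, row in enumerate(params["rows"])]
    return params

def provision(params):
    """Copy the prepared rows into the target environment."""
    params["target"].extend(params["rows"])
    return params

def run_workflow(steps, params):
    # Pass parameters from the end of one process to the start of the next.
    for step in steps:
        params = step(params)
    return params
```

Calling `run_workflow([subset, mask, provision], params)` then subsets, masks and provisions data in one resolution, with no human hand-offs between steps.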

One application of this approach lies in automated “Find and Makes”, which can substantially accelerate the fulfilment of data requests. A “Find and Make” searches for data among existing sources based on a set of criteria provided. If that data is not found, the criteria are passed as parameters into a data generation job, in turn creating the missing data. That data is then added to the test database, where it will be available for a future data “find”.

Find and Makes can be performed, for instance, using intuitive forms that are accessible to all users. Alternatively, SQL queries might look for data. If sufficient data cannot be found, the query is parsed to create new data. This constructs new values that will satisfy the query. For example, the “greater than” and “between” values in the function are used to construct new data.

Figure: An automated data “Find and Make” looks for message data based on parameters provided for a SQL Query. If no data is found, the parameters or the parsed query is passed into an automated data generation job.


 

This approach to performing data “Find and Makes” is well suited when a tester needs data for a set of scenarios. For instance, if they need data for 20 test scenarios, but only 18 can be found in a database, the missing two scenarios are generated by parsing the SQL query.
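The "Find and Make" pattern itself can be sketched generically. In the toy version below, the data store is a list of dictionaries and the generator is any callable that turns unmet criteria into new rows; both are placeholders for a real database and data generation job:

```python
def find_and_make(store, criteria, generate):
    """Find rows matching the criteria; if none exist, make them,
    add them to the store, and return them."""
    matches = [row for row in store
               if all(row.get(k) == v for k, v in criteria.items())]
    if matches:
        return matches
    made = generate(criteria)   # the criteria become generation parameters
    store.extend(made)          # future data "finds" can now succeed
    return made

store = [{"account": "savings", "status": "open"}]
make = lambda criteria: [dict(criteria, synthetic=True)]

find_and_make(store, {"status": "open"}, make)     # found in existing data
find_and_make(store, {"status": "frozen"}, make)   # missing, so generated
```

A production implementation would parse SQL predicates rather than match dictionary keys, but the control flow is the same: search, fall through to generation, and persist the result for the next request.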

Alternatively, testers might want to generate a full spread of values into a test environment, creating “gold copy” data that is ready for an extensive range of test scenarios. In this instance, model-based approaches to test data generation might be used to construct the data and generate a complete spread of values.

Empower data requesters to self-provision data on-the-fly

In order to break the time-consuming dependence on an overworked data provisioning team, the automated data processes must further be exposed to the full range of data requesters. This includes both human testers and automated technologies, enabling automated tests and CI/CD pipelines to trigger combinable test data utilities and self-provision data.

For human requesters, a solution has already been shown in Figure 5. Fully parameterised and automated TDM processes have become reusable and can be exposed to intuitive forms. However, not every parameter in a TDM process will be relevant to a tester; instead, select parameters should be exposed as fields like drop-downs and text boxes. Simple names can further be specified for each parameter to reflect business and test logic, “de-skilling” the process of finding and making data.

These simple web forms enable testers to request data at a granular level, rather than relying on unwieldy production data dumps. The example below shows a range of different data masking processes that can be parametrised and run simply by hitting “play” and filling out a form:

Figure: A set of reusable and combinable generation and masking jobs are available from a web portal.

 

As shown in Figure 3, human data requesters can also combine these reusable processes in visual flows. This can be linked directly to model-based test case and script generation. As tests are generated or run, the combined test data utilities resolve, ensuring that each test comes equipped with matching test data. This fulfils the previously discussed requirement for test data “allocation”.

Automated tests and CI/CD pipelines can furthermore trigger the reusable test data processes directly, self-provisioning data even as the test requirements subtly change. This might be done via APIs or batch processes, while data generation functions can also be embedded directly in scripts.
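As a sketch of how a pipeline step might trigger such a process over an API (the endpoint URL and payload shape are invented for illustration; a real portal would define its own contract), a request can be serialised and posted in a few lines:

```python
import json
from urllib import request

# Hypothetical TDM portal endpoint; a real pipeline would read this
# from configuration.
TDM_API = "https://tdm.example.internal/api/jobs/run"

def build_request(job_name, parameters):
    """Serialise a data request exactly as a CI/CD step would submit it."""
    payload = {"job": job_name, "parameters": parameters}
    return json.dumps(payload).encode("utf-8")

def trigger_job(job_name, parameters):
    """POST the request to the assumed TDM portal, returning the response."""
    req = request.Request(
        TDM_API,
        data=build_request(job_name, parameters),
        headers={"Content-Type": "application/json"},
    )
    return request.urlopen(req)   # in a pipeline: poll the returned job handle
```

The same payload could equally be submitted by a batch process, or the generation function could be embedded directly in a test script.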

At this point, test data provisioning already delivers results similar to those promised by AI. It can act autonomously, fulfilling different requests on-the-fly. Meanwhile, requests can be made by a range of data requesters, human and non-human, who self-provision data. Yet, the approach can be expanded further to incorporate learning as systems and environments change.

Feedback loops to production keep test data up-to-date

Incorporating this learning answers a key question that has not yet been addressed in the proposed approach: How can teams today identify the parameters that they need to pass into the automated test data processes? This is a tricky problem, given the complexity of interrelated data today, as well as the lack of sufficient visibility and documentation regarding back-end systems. Meanwhile, the pace of system change today calls for an automated approach to identifying key data characteristics.

Commercial test data tooling typically provides at least some data modelling capabilities, identifying relationships required to retain referential integrity. More sophisticated data analysis is furthermore possible today. Automated analysis can, for instance, perform counts and aggregates, measure skewness, and compute averages. Maximum and minimum values can also be identified, while “kurtosis” identifies rare data values. Automated data comparisons can further compare the density of data in production and development environments, identifying missing data:


Figure: Automated data analysis identifies missing values.
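Several of these measures are straightforward to compute without specialist tooling. The sketch below profiles a numeric column (count, range, mean, skewness, excess kurtosis, using the standard population formulas) and flags values present in production but missing from a test copy:

```python
from statistics import mean, pstdev

def profile(values):
    """Profile a numeric column: count, range, mean, skewness, kurtosis."""
    m, sd, n = mean(values), pstdev(values), len(values)
    z = [(v - m) / sd for v in values]   # standardised values
    return {
        "count": n,
        "min": min(values),
        "max": max(values),
        "mean": m,
        "skewness": sum(x ** 3 for x in z) / n,     # 0 for symmetric data
        "kurtosis": sum(x ** 4 for x in z) / n - 3, # excess kurtosis flags rare, extreme values
    }

def missing_in_test(production, test_copy):
    """Values observed in production but absent from the test environment."""
    return sorted(set(production) - set(test_copy))
```

Comparing the two profiles, or the output of `missing_in_test`, gives the parameters that a data generation job needs to close the gap.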

 

Another form of automated analysis parses production data, converting it into template files for generating accurate test data. This approach becomes more “intelligent” when it applies algorithms to reverse-engineer data from known parameters. A range of algorithms can be borrowed from scientific and mathematical fields. For instance, linear algebra can be used in “message solving”, working from expected results to auto-populate the body of complete XML messages.
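The template-derivation idea can be illustrated simply. The sketch below reduces sample production values to character-class patterns (9 for a digit, A for a letter, a common convention in data generation tools); the sample account codes are fabricated:

```python
import re

def derive_templates(sample_values):
    """Infer character-class templates from production samples,
    usable as patterns for synthetic data generation."""
    def to_pattern(value):
        value = re.sub(r"\d", "9", value)        # digits  -> 9
        return re.sub(r"[A-Za-z]", "A", value)   # letters -> A
    return {to_pattern(v) for v in sample_values}

# Two fabricated account codes collapse to a single generation template.
derive_templates(["GB29NWBK", "FR14BDFE"])   # {"AA99AAAA"}
```

Generating fresh values that match the inferred pattern then yields structurally accurate data without copying any production value verbatim.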

Automated analysis today can therefore help identify the parameters needed to produce accurate test data. Production analysis can feed automated workflows, performing multi-step test data preparation. In principle, any reusable test data process can ingest the results of automated analysis in this way.

This approach becomes yet more “intelligent” when run continuously, moving from production analysis to production monitoring. The automated data analysis might be run in batch mode, updating the parameters used to find, make, and prepare test data. When an automated test or tester requests data, they thereby receive data that is up-to-date relative to production.

This combination of automated data analysis with reusable test data processes in turn mimics the “learning” performed by data provisioning teams as systems and user behaviours change. However, testers and automated tests are no longer reliant on an overworked team who analyse complex data, updating and slowly running a set of rigid test data processes. Instead, the test data processes are fully parameterised, and the parameters passed into them are kept up-to-date. When a test(er) requests data, they thereby receive up-to-date data relative to the latest production systems.

Feedback loops across the SDLC deliver exactly the right data in-sprint

This “lights out” approach can be expanded beyond mirroring production data characteristics and evolving test cases. Automated test data processes can, in principle, be linked to data outputted by changing user stories, Application Production Monitoring, bug reports, and more. Test data provisioning then adapts to changes across the whole SDLC, producing data to target emergent risks.
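As a sketch of that linkage, the following routes incoming SDLC events (for example, webhook payloads from a bug tracker, a requirements tool, or production monitoring) to test data actions. The event names and actions are hypothetical, chosen only to illustrate the dispatch pattern.

```python
# Illustrative sketch: routing SDLC events to test data provisioning actions.
# Event types and actions are hypothetical examples, not a real tool's API.

def handle_event(payload):
    """Map an incoming SDLC event (e.g. a webhook payload) to a data action."""
    actions = {
        "bug_created": ("generate", "data reproducing the reported failure"),
        "story_updated": ("prepare", "data covering the changed acceptance criteria"),
        "production_alert": ("refresh", "parameters from the latest production profile"),
    }
    event_type = payload.get("type")
    if event_type not in actions:
        return ("ignore", event_type)
    return actions[event_type]

print(handle_event({"source": "bug_tracker", "type": "bug_created"}))
# ('generate', 'data reproducing the reported failure')
```

In a real pipeline, each action would trigger one of the parameterised test data processes discussed earlier, so provisioning reacts to change wherever it originates.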

The steps required to start building this real-time approach are beyond the scope of this chapter, but are indicated in Curiosity Software’s earlier chapter, “AI Shifts the Center of Gravity for Quality”. This chapter has focused on the steps required to make data provisioning reactive, flexible, and automated. The inputs and outputs of the “digital twin” discussed in the earlier chapter indicate the possible inputs for fully automated test data allocation.

Autonomous, adaptive, and fully reactive to change

The approach proposed in this chapter intends to bring test data into 2021. Instead of retaining a dependency on an overworked team and production data, it aims to unlock self-provisioning of data by testers and tests. This automated data allocation must be flexible to evolving requests and must be capable of providing compliant data for an array of different data requests. As such, it must also be capable of fulfilling high volumes of parallelised requests, made by humans and automated tests.

Organisations should first seek a set of test data utilities capable of providing rich, compliant, and parallel data for every test. The utilities must be automated and combinable, while tests and human requesters must be capable of firing inputs to trigger them on demand. This allows self-provisioning of data, with flexible automation that mimics the decisions made by data provisioning teams.

The approach further mimics human intelligence when feedback loops are created to update the data provisioning. A method has been discussed for keeping data up-to-date relative to production. Exposing automated test data processes to a baseline of data from across the whole application delivery lifecycle could in future generate test data that targets subtle changes in code, user stories, user behaviour, and more. The key to this approach often lies in webhooks and in the API connectivity of tools across the SDLC, which provide a rich source of readily available data to harvest.

About the authors

Huw Price


Huw Price is a test data management veteran and a serial entrepreneur, now the founder of his fifth software start-up. Huw’s 30 years of experience in software delivery have brought collaborations with a wide range of organizations, large and small. He has crafted strategies and innovative technologies for test data success, on projects ranging from large-scale migrations from mainframe to open systems, to building best-of-breed test automation frameworks for microservices.

 


Tom Pryce

Tom Pryce has been working with test data since 2014, when he joined Grid-Tools Limited. He is now Communication Manager at Curiosity, where he enjoys collaborating closely with organisations on projects focused on test data, model-based testing, test generation, and requirements digitalisation. He enjoys learning about cutting-edge techniques in each of these fields, producing written content and media to share these insights with the broader testing community.

 

About Curiosity Software

Curiosity was the first company to launch an Open Testing Platform. Our goal is simple – don’t test when you don’t have to. With data-driven insights into the impact of change, we expose the fastest path to validate your changes, complete with the required test data. Our Open Testing Platform is test-tool agnostic: we can optimize what your team should be testing across any tool or framework.

Visit us at opentestingplatform.curiositysoftware.ie