Section 3.2: Inform & Interpret

Introduction

As the volume of textual data generated across the application lifecycle grows, processing and accessing that data becomes increasingly critical to making informed decisions. We need assistance in analyzing, categorizing, and prioritizing the data at our disposal. The obstacle is the ambiguous and varied nature of human language, which differs in syntax, semantics, and regional usage. Interpreting that language also requires an understanding of our QE domain. This section discusses how artificial intelligence, and more precisely Natural Language Processing (NLP), enables us to make sense of unstructured information and derive meaningful insights. Before introducing NLP in the next chapter, we look at the context of two case studies.

Case Study #1: the need for automated coverage analysis

Test case design and production are mostly manual activities that can take up to 70% of the time in the software test lifecycle. Test cases prepared by inexperienced testers frequently fail to cover the functional requirements adequately. In addition, manually written test cases are hard to reuse when requirements change, which leads to schedule slippage and higher costs. Many software projects use some form of behavior-driven development to define requirements as they pass from business stakeholders into plain-language user stories.

User requirements must eventually be expressed in natural language so that they are easy for humans to comprehend and implement. Natural language test cases, however, are limited by the risk of misinterpretation, which results in verification mistakes.

Traceability between user requirements and test cases can be established and used for requirement-to-test coverage analysis. End-user technical documentation, which often covers features and functions, is never examined, and there is no mapping between test procedures and technical documentation. To understand high-end, sensitive systems that are operated by professionals, it is advisable to consult the end-user technical documentation. It is therefore critical to ensure that the areas and workflows described in the technical specification are covered by the test procedures, and to identify any gaps that may exist.

For any company developing software, quality control should stand out as a differentiating factor in ensuring the product's quality, stability, and resilience. To ensure optimal coverage of software features, suitable test procedures must be written and executed against the feature set. Establishing traceability between product features and test procedures is a demanding and complex task.

This was the situation faced by a medical device OEM based in North America. End-to-end traceability between product features and test procedures was established using Natural Language Processing and advanced deep learning techniques, so that product features received the maximum amount of test coverage. The solution proved extremely beneficial for product quality assurance and reduced cost by simplifying the difficult process of developing test strategy and design. It identified features with sparse or no test coverage, helping test managers improve their test strategy, planning, and design. The underlying technology stack was IBM Watson, Java, Cloudant DB, and IBM Bluemix.
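The idea of tracing features to test procedures and flagging coverage gaps can be illustrated with a minimal sketch. It is not the solution described above, which used IBM Watson and deep learning; it swaps in a plain bag-of-words cosine similarity so that it runs with the Python standard library alone. The feature and test texts, the identifiers, the function names, and the 0.3 threshold are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def coverage_map(features, test_procedures, threshold=0.3):
    """Map each feature id to the test ids whose text is similar enough.

    The threshold is an illustrative assumption, not a tuned value.
    """
    test_vectors = {tid: Counter(tokenize(t)) for tid, t in test_procedures.items()}
    mapping = {}
    for fid, ftext in features.items():
        fvec = Counter(tokenize(ftext))
        mapping[fid] = sorted(
            tid for tid, tvec in test_vectors.items()
            if cosine_similarity(fvec, tvec) >= threshold
        )
    return mapping


def uncovered(mapping):
    """Return the feature ids that no test procedure was traced to."""
    return sorted(fid for fid, tests in mapping.items() if not tests)


# Hypothetical feature descriptions and test procedures.
features = {
    "F1": "Operator can set the infusion rate",
    "F2": "Device raises an alarm when battery is low",
    "F3": "Export treatment log to USB",
}
test_procedures = {
    "T1": "Verify operator can set infusion rate within limits",
    "T2": "Verify alarm is raised when the battery is low",
}

mapping = coverage_map(features, test_procedures)
print(mapping)
print("Features without coverage:", uncovered(mapping))
```

Word overlap is a crude stand-in for semantic similarity: it misses paraphrases and synonyms, which is one reason a production solution would likely rely on learned language representations instead.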

Case Study #2: the need for commit classification

When a software project grows in size, managing the workflow and development process gets more difficult. As a result, it is critical for the management team and lead developers to comprehend the nature of the work performed by software developers.

In simpler terms, any developer, at any point during the project's various phases, can write code that seeks to accomplish one of the following:

Adding new features
Design improvements
Bug fixing
Improving non-functional requirements

The goal here is to define some broad categories of work that span the vast majority of development tasks.

Understanding how the development team's work is distributed across the four categories above helps management make more informed decisions about how to manage the software's growth while continuously enhancing its functionality. If the majority of the development team's time is spent fixing bugs, for example, management should direct lead developers to place a higher premium on quality.
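As a rough illustration of this kind of analysis, the sketch below computes the share of commits in each category and reports a category only when it accounts for more than half of the work. The category keys, the sample labels, and the 50% cut-off are assumptions made for the example. Commit counts are used as a simple proxy for time spent.

```python
from collections import Counter

# Hypothetical short names for the four categories listed above.
CATEGORIES = ("feature", "design", "bugfix", "non_functional")


def effort_shares(labels):
    """Return the fraction of commits that falls into each category."""
    counts = Counter(labels)
    total = sum(counts[c] for c in CATEGORIES)
    if total == 0:
        return {c: 0.0 for c in CATEGORIES}
    return {c: counts[c] / total for c in CATEGORIES}


def dominant_category(shares, majority=0.5):
    """Return the category holding more than `majority` of the work, if any."""
    top = max(shares, key=shares.get)
    return top if shares[top] > majority else None


# Hypothetical labels for eight recent commits.
labels = [
    "bugfix", "bugfix", "feature", "bugfix",
    "design", "bugfix", "bugfix", "non_functional",
]

shares = effort_shares(labels)
print(shares)
print("Dominant category:", dominant_category(shares))
```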

To accomplish this, we can make use of the commit comments, which are provided in natural language by developers. These comments can be labeled and used to train classifiers that will subsequently classify future changes automatically.

This is the approach taken by Capgemini Engineering in the healthcare, automotive, and independent software vendor (ISV) sectors. The underlying technology stack is Python, scikit-learn (sklearn), NLTK, NumPy, pandas, and MySQL.

The NLP-based commit classifier achieved greater than 90% accuracy when assigning incoming commits to one of the categories listed above. With this solution, it became easier to assess the software development process and workflow of a very large team, and to track and optimize the allocation of effort and the project's momentum.
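To make the train-then-classify idea concrete, here is a minimal sketch of a multinomial Naive Bayes text classifier written with the standard library only. It is not the pipeline used in the case study, whose model and features are not described here; the class name, the category keys, and the eight training messages are hypothetical, and a data set this small says nothing about real-world accuracy.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(message):
    """Lowercase a commit message and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", message.lower())


class NaiveBayesCommitClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, messages, labels):
        self.class_counts = Counter(labels)       # commits per category
        self.word_counts = defaultdict(Counter)   # word counts per category
        self.vocabulary = set()
        for message, label in zip(messages, labels):
            tokens = tokenize(message)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, message):
        tokens = tokenize(message)
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labeled commit messages, two per category.
messages = [
    "Add export to CSV option",
    "Add user profile page",
    "Refactor payment module into smaller classes",
    "Rename helper classes and clean up interfaces",
    "Fix crash when file is missing",
    "Fix null pointer error in login",
    "Improve query performance with caching",
    "Reduce memory usage and improve startup time",
]
labels = [
    "feature", "feature",
    "design", "design",
    "bugfix", "bugfix",
    "non_functional", "non_functional",
]

classifier = NaiveBayesCommitClassifier().fit(messages, labels)
print(classifier.predict("Fix crash in export"))
print(classifier.predict("Improve startup performance"))
```

In practice the labeled set would contain many more commits per category, and the classifier would be evaluated on held-out messages before its predictions are used for reporting.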

In this section

Chapter 1: NLP & NLU Fundamentals

Chapter 2: NLP for duplicate and orphan assets

Chapter 3: Bringing NLG into Software Development

Chapter 4: NLP for downstream recommendations

Chapter 5: NLP and sentiment analysis
