Empirical Principles and an Industrial Case Study in Retrieving Equivalent Requirements via Natural Language Processing Techniques

Canfora G.
2013-01-01

Abstract

Though very important in software engineering, linking artifacts of the same type (clone detection) or of different types (traceability recovery) is extremely tedious, error-prone, and effort-intensive. Past research has focused on supporting analysts with techniques based on Natural Language Processing (NLP) that identify candidate links. Because many NLP techniques exist and their performance varies with context, it is crucial to define and use reliable evaluation procedures. The aim of this paper is to propose a set of seven principles for evaluating the performance of NLP techniques in identifying equivalent requirements. We conjecture, and verify, that the performance of an NLP technique on a given dataset depends both on its ability and on the odds of identifying equivalent requirements correctly by chance in that dataset. For instance, when those odds are very high, it is reasonable to expect that NLP techniques will achieve good performance. Our key idea is to measure this random factor of the specific dataset(s) in use and then adjust the observed performance accordingly. To support the adoption of the principles, we report their practical application in a case study that evaluates the performance of a large number of NLP techniques for identifying equivalent requirements in the context of an Italian company in the defense and aerospace domain. The current application context is the evaluation of NLP techniques for identifying equivalent requirements. However, most of the proposed principles seem applicable to evaluating any estimation technique that supports a binary decision (e.g., equivalent/non-equivalent) with an estimate in the range [0,1] (e.g., the similarity computed by an NLP technique), when the dataset(s) is used as a benchmark (i.e., testbed), independently of the type of estimator (i.e., requirements text) and of the estimation method (e.g., NLP).
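
A note on the key idea above: purely as an illustrative sketch (not the procedure defined in the paper), the Python snippet below shows one way to "measure the random factor of a dataset and adjust the observed performance accordingly", assuming a simple threshold on the NLP similarity and a Cohen's-kappa-style chance correction. The function name chance_adjusted_score, the 0.5 threshold, and the kappa-style formula are illustrative assumptions, not the authors' method.

    import numpy as np

    def chance_adjusted_score(similarities, labels, threshold=0.5):
        # Turn NLP similarities in [0, 1] into binary decisions
        # (equivalent / non-equivalent) via a fixed threshold.
        preds = (np.asarray(similarities, dtype=float) >= threshold).astype(int)
        labels = np.asarray(labels, dtype=int)

        # Observed performance: plain agreement with the ground truth.
        observed = float((preds == labels).mean())

        # Random factor of the dataset: agreement expected by chance,
        # computed from the marginal rates of "equivalent" decisions.
        p_pred, p_true = preds.mean(), labels.mean()
        expected = p_pred * p_true + (1.0 - p_pred) * (1.0 - p_true)

        if expected == 1.0:  # degenerate dataset: no room above chance
            return 0.0
        # Kappa-style correction: how far observed performance lies above chance.
        return (observed - expected) / (1.0 - expected)

    # Hypothetical example: five requirement pairs, three truly equivalent.
    score = chance_adjusted_score(
        similarities=[0.91, 0.40, 0.35, 0.62, 0.10],
        labels=[1, 1, 0, 1, 0],
    )
    print(round(score, 2))  # 0.62: raw agreement is 0.8, chance agreement 0.48

The point of the sketch is only that the same observed agreement is worth less on a dataset where chance alone already yields high agreement, which is what the adjustment accounts for.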
Files in this product:
File: canfora-tse.pdf (not available)
License: Not specified
Size: 3.88 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12070/3385
Citations
  • PMC: not available
  • Scopus: 88
  • Web of Science (ISI): 63