1 Introduction

In Software Engineering (SE), one kind of requirement is the non-functional requirement (NFR). NFRs are difficult to capture, organize, reuse and test; therefore, they are usually evaluated subjectively. NFRs are known as constraints or quality requirements [1, 2] and are treated as softgoals [3]; they are targets that need not be satisfied absolutely, only in a good-enough sense [6]. The systematic treatment of NFRs in the early stages of software development may contribute positively to software quality. The conceptual modeling of quality that considers provenance as an NFR is still underexplored in both the SE and Database domains. This matters because the quality achieved through data provenance is closely related to software traceability. Both subjects are active research topics, offering potential benefits to data management and software development, respectively.

Traceability and provenance handling consist of storing metadata that makes it possible to reconstruct chains of operations at different levels of abstraction. Due to the similarities between traceability and provenance [12], we advocate that provenance can also be considered an NFR in software development. There are several representations of provenance focused on data [4, 7–9] and very few works on provenance focused on the software process [12, 13]. Data provenance authors use taxonomies, recommendations or ontologies to describe the elements involved in the conceptualization, classification and hierarchical structure of distinct kinds of provenance metadata. Unlike related work, our research proposes a new approach based on reusable catalogs (conceptual models), not only to represent provenance as a quality factor but also to help reduce the gap between a software specification, its operationalizations and the diversity of data provenance descriptors generated by its execution.

The aim of this work is to present the steps to map provenance as NFR catalogs, using a systematic approach based on the NFR framework [5], NFR patterns [6] and NFR catalogs [5, 10]. The NFR framework and the NFR patterns provide a solid theoretical foundation for treating NFRs, with appropriate representation schemas and rules. In particular, NFR patterns focus on the reuse of NFR knowledge [3, 5]. NFR patterns may be refined into more precise and unambiguous patterns, composed to build larger ones, or instantiated to create occurrence patterns using existing ones as templates.

2 Modeling Provenance as an NFR Catalog

Our proposal is one of the first to represent provenance as a quality factor within a catalog based on the NFR framework and NFR patterns. We stress that the modeling effort is not a simple representation based on hierarchies of provenance or data provenance standards. On the contrary, the NFR catalog was modeled by decomposing the softgoals to be addressed or achieved by (business or scientific) systems that require different kinds of provenance. In addition, our contribution exposes the links and impacts between the software softgoals. We introduce a novel view of provenance, describing it as a quality that must be satisfied to enhance software traceability, enabling the construction of verifiable chains of operations in software systems that produce data of higher quality with embedded data provenance descriptors.

The development of the Provenance NFR Catalog used several patterns defined by Supakkul et al. [6]: (i) Objective Patterns, used to capture the definition of NFRs in terms of specific (soft)goals to be achieved; (ii) Problem Patterns, used to capture knowledge of problems or obstacles to achieving goals; (iii) Alternatives Patterns (operationalizations), used to capture different means, solutions and requirements mappings; (iv) Selection Patterns, used to choose the best alternative considering its side effects. To elaborate the Provenance NFR Catalog, we defined a set of three modeling steps.

First Step - The conceptual model was conceived to follow the Objective Pattern. The result is a Provenance SIG (Softgoal Interdependency Graph; not depicted here due to space restrictions). An SIG is a graph that shows two elements of Objective Patterns. The first element is the Identification Pattern, where Provenance is modeled as the root of the graph. The second element is the Decomposition Pattern, in which relations such as ‘Capturable’ and ‘Classifiable’ are presented. Such relations were based on the provenance taxonomy proposed by Cruz et al. [7]. The Provenance SIG focuses on the positive or negative contributions of these relations, represented by links of the types HELP, HURT, BREAK and MAKE, and on the decompositions, operationalizations and argumentations, represented by OR/AND links [5].
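Since the Provenance SIG is not depicted, the following minimal sketch suggests how such a graph could be held in memory. It is an illustration under stated assumptions, not the authors' model: the class names (`Softgoal`, `ContributionLink`), the default AND decomposition and the example link from Provenance to Traceability are hypothetical. Only the root softgoal, the two relations ‘Capturable’ and ‘Classifiable’, and the link types come from the text above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Iterator, List, Tuple


class Contribution(Enum):
    """Contribution link types named in the text."""
    MAKE = "MAKE"
    HELP = "HELP"
    HURT = "HURT"
    BREAK = "BREAK"


class Decomposition(Enum):
    """How the children of a softgoal combine."""
    AND = "AND"
    OR = "OR"


@dataclass
class Softgoal:
    name: str
    # Assumption: AND is used as the default decomposition in this sketch.
    decomposition: Decomposition = Decomposition.AND
    children: List["Softgoal"] = field(default_factory=list)

    def refine(self, *names: str) -> List["Softgoal"]:
        """Attach new child softgoals (Decomposition Pattern) and return them."""
        added = [Softgoal(n) for n in names]
        self.children.extend(added)
        return added

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "Softgoal"]]:
        """Yield (depth, softgoal) pairs in depth-first order."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


@dataclass(frozen=True)
class ContributionLink:
    source: str
    target: str
    kind: Contribution


# Identification Pattern: Provenance is the root of the graph.
provenance = Softgoal("Provenance")
# Decomposition Pattern: two of the relations mentioned in the text.
capturable, classifiable = provenance.refine("Capturable", "Classifiable")

# Hypothetical contribution link, included only to show the link types in use.
links = [ContributionLink("Provenance", "Traceability", Contribution.HELP)]

outline = ["  " * depth + goal.name for depth, goal in provenance.walk()]
print("\n".join(outline))
```

Keeping decompositions (the tree) separate from contribution links (a flat list of edges) reflects the distinction drawn in the text between AND/OR links and HELP, HURT, BREAK and MAKE links.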

Second Step - In this step we defined three patterns: GroupIdentification, Questions and Alternatives. Defining these categories is important because they help designers to formulate the questions and then select the operationalizations during the software development process. After these definitions, it was possible to specify the QuestionIdentification patterns [10] and combine them with the GroupIdentification. The questions were answered according to the list of operationalizations for the softgoals (Alternatives Patterns). The impact of these operationalizations on other NFR softgoals (previously defined in the SIG) was evaluated, and they were then linked to the questions as alternative responses. The operationalizations were represented at the lowest level of the SIG as leaves associated with NFR softgoals by contribution links of the type ANSWER.
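The relationship between groups, questions and alternatives described in this step can be sketched as a small data structure. This is only an illustrative sketch: the class and field names are assumptions, and the group, question, alternatives and impacts in the example are hypothetical placeholders rather than entries from the actual catalog.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Alternative:
    """An operationalization offered as a possible response to a question."""
    name: str
    # Impact on other softgoals, e.g. {"Performance": "HURT"}.
    impacts: Dict[str, str] = field(default_factory=dict)


@dataclass
class Question:
    text: str
    softgoal: str  # the NFR softgoal this question refers to
    alternatives: List[Alternative] = field(default_factory=list)

    def answer_links(self) -> List[Tuple[str, str, str]]:
        """Return (operationalization, 'ANSWER', softgoal) leaf links."""
        return [(alt.name, "ANSWER", self.softgoal) for alt in self.alternatives]

    def impact_report(self) -> List[Tuple[str, str, str]]:
        """Return (operationalization, link type, affected softgoal) triples."""
        return [
            (alt.name, kind, goal)
            for alt in self.alternatives
            for goal, kind in alt.impacts.items()
        ]


@dataclass
class Group:
    name: str
    questions: List[Question] = field(default_factory=list)


# Hypothetical example content, not taken from the catalog itself.
question = Question(
    text="How is provenance captured?",
    softgoal="Capturable",
    alternatives=[
        Alternative("Automatic logging", {"Performance": "HURT"}),
        Alternative("Manual annotation"),
    ],
)
group = Group("Capture", [question])

for link in question.answer_links():
    print(link)
print(question.impact_report())
```

Recording the impacts next to each alternative keeps the side effects visible when a designer later chooses among the alternative responses to a question.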

Third Step - After defining the above-mentioned patterns, it was possible to use a standardized SE document such as GQO [10] to organize and represent the knowledge obtained in the previous steps. The result of this effort was a conceptual model that gathers the knowledge about provenance in a framework that can be used in (business or scientific) systems, or shared, reused and evolved by third parties.

3 Conclusion

In this work, we introduce an original proposal for treating provenance in software development as a quality factor of (business or scientific) systems. Our research provides a systematic approach based on conceptual modeling to represent provenance as an NFR. We stress that our study is supported by consolidated SE methods that do not replace, but may complement, traditional data provenance standards and specifications. We also agree with [11, 14] on the need for further empirical research on the use of NFRs and SIGs during requirements engineering. As future work, we will expand the catalog with a larger number of softgoals and operationalizations and evaluate it in different domains.