
1 Introduction

Developing products and services that respect a user’s privacy is a growing field of interest, driven by advances in big data techniques, the internet of things, and information and communication technology in general. Cavoukian [1] is often credited as the first to summarize the privacy by design principles, emphasizing user-centered design and pointing out the benefits of increased privacy and security awareness when systems are developed transparently and privacy is enforced proactively. Privacy by design (PbD) has already been proposed as a guideline to ensure privacy-friendly systems, but the question of how these guidelines can be put into practice has become an even more pressing issue [2], especially since the General Data Protection Regulation, which makes PbD mandatory for new products, was adopted by the European Parliament [3]. PbD emphasizes that privacy considerations have to be a part of every step of the software design process to be effective.

Despite a growing amount of work on privacy enhancing technologies, privacy strategies and privacy patterns, PbD still lacks adoption in practice, especially in software development processes [4]. Gürses and Alamo [2], in line with a recent ENISA report [5], state that engineering privacy by design requires a multidisciplinary approach in which “Data protection authorities should play an important role providing independent guidance and assessing modules and tools for privacy engineering” [2]. While there are notable advances with respect to engineering privacy requirements, we see a lack of adoption of PbD ideas with respect to process-driven approaches and socio-technical design. One way to foster PbD not only on a technical but also on an organizational process level is to support collaborative approaches to socio-technical systems design.

In this paper, we further elaborate on an approach that extends existing methods of socio-technical design by including privacy related aspects. We published first ideas of this approach in [6]. The approach combines collaborative process design workshops with a web-based system that fosters critical reflection and discussion on such designs.

2 Related Work

The fuzziness of the concept of privacy is one of the main challenges of PbD and privacy engineering [4, 7]. There are legal, regional and cultural differences with respect to what is to be achieved by protecting privacy. With respect to the question of how to apply PbD, one can find solutions in IT security, software and requirements engineering, business process management and legal compliance [5]. This emphasizes the need for collaboration when systems are developed with privacy in mind, so that the different perspectives are incorporated. Despite the fuzziness of the concept of privacy, legal requirements for handling personally identifiable information (PII) in particular have led to the discussion of data protection goals [8] that are meant to be workable constructs when designing processes that involve PII. The data protection goals extend the widely known computer security goals (confidentiality, integrity and availability) with privacy-related goals such as transparency, unlinkability and the ability to intervene [8], which were recently chosen as the standard model for data protection audits by the German conference of data protection officials. While unlinkability refers to mechanisms to enforce purpose binding, the ability to intervene requires data processors to prove that they can actually control and disrupt specific PII data flows, e.g. if required by the data subject. Unlinkability, for example, can be achieved by minimizing the amount of data collected. The data protection goals are in line with other, less process-oriented but more technology-oriented approaches like the one proposed by Gürses et al. [9] and especially the privacy strategies and tactics developed by Hoepman et al. [10, 11].
They argue that engineering privacy by design should always be based on minimizing data, since the amount and risk of PII collected within a product or process predetermines the subsequent iterative steps of development such as requirements analysis, threat modeling, security analysis and implementation. This leaves room for methods that support these iterative steps. Notario et al. [12] suggest applying use cases as a methodology to elicit requirements. The value of use cases within that methodology is to bring together all stakeholders that have an interest in processing PII, such as legal staff, business consultants, business analysts, data analysts and software architects. Vicini et al. [13] describe how methods of co-creation can be used to integrate a variety of stakeholders in a requirements engineering process, but make no use of process models, which Notario et al. [12] emphasize as an important factor in achieving organizational impact. There is thus a need for methods that bring relevant stakeholders together and make use of process models as a shared artifact. The socio-technical design approach we propose in the following can close this gap.

Socio-technical design first became a field of interest in the early 1950s in the face of the ongoing industrialization [14]. During that time researchers realized that it is necessary to consider the social context of people in order for technology to have the desired effect. They also found that the introduction of technology inevitably has an effect on the working environment, which in turn influences how technology is used. This led to the development of a number of approaches which were subsumed under the umbrella term socio-technical design (STD). These approaches aim at giving “equal weight to social and technical issues when new work systems are being designed” [15]. The goal of these approaches is to bring together users and designers, since they are experts and laypersons at the same time: practitioners are experts in the domain but usually know little about privacy enhancing technologies, and the reverse holds for privacy experts. This creates a knowledge gap for both groups. Consequently, discursive processes that create a discussion around a proposed design are necessary to bridge this gap.

Most STD approaches consequently focus on workshops in which current and future users of a system alongside domain experts and software developers create a conceptualization of a future system [16,17,18]. It is common to start conceptualization by analyzing the current state of a system or process by visualizing it in graphical models. These models are then subsequently used as a basis to identify problems and discuss future designs. Arriving at a suitable design usually requires multiple workshops as well as phases in between in which designs are reflected and tested [19]. Results from these tests then serve as an input for future workshops and future design iterations. STD can thus be perceived as a mutual adaptation process between design and its implementation in the work place.

Privacy is a multifaceted problem that can be addressed using organizational as well as technical means. When used in the context of privacy by design, socio-technical design can serve as a means to consider both aspects and come up with solutions that all stakeholders agree upon. Through socio-technical design it is possible to integrate multiple stakeholders into the design process and to identify problems within processes that might otherwise be overlooked because they are often considered less important [20]. Legal and privacy/security experts can therefore also help to make decisions on the tradeoffs between the use of privacy enhancing technologies and usability, efficiency or implementation costs.

3 The Methodical Background: SeeMe and the Socio-Technical Walkthrough

To design socio-technical systems, modeling is at the core of our methodology. Socio-technical modeling was designed to integrate the modeling of technical and work processes and, in consequence, makes more topics of the envisioned practice available for design and development. It has proved helpful for contextualizing processes and situations so that topics become available for discourse. Methodologies like the well-established socio-technical walkthrough (STWT) [19, 21,22,23,24] consist of two parts: notation and method. The modeling notation we used in the project is called SeeMe. It supports the description of various socio-technical aspects such as coordination between different process participants and the behavior of human actors performing the process. SeeMe is applied during STWTs to represent and discuss the work processes. Our experience with both SeeMe and the STWT method stems from about eighteen years of development, driven by practical application in various contexts (for a list of projects see [16]). As the initial rationale of this still ongoing action research effort, we intended to appropriately describe the phenomena of socio-technical systems, which comprise technical, organizational and personal views, as this is an important background for privacy engineering in particular. We found technical and social phenomena to be equally relevant, including technically enforced behavior as well as (emergency) behavior with inevitable human decision-making and human actors creating workarounds for impractically designed technical solutions. Therefore, our basic assumption is that it is highly relevant to describe aspects of work processes and coordination issues as part of designing socio-technical systems and making privacy by design proposals. The SeeMe notation we use is based on technically oriented modeling notations and was enriched with ways to express vagueness, including incompleteness and uncertainty. The notation is designed to support:

  • the visualization of complex interdependencies between the activities of users, between human work and the technical systems, and if needed it can also depict the technical components

  • the creation of an integrated view on technical and social aspects

  • the flexible adaptation of levels of detail in every section of processes

  • the creation of a shared understanding of the socio-technical design

Using the SeeMe notation, we developed methods to create models discursively. The core method is called the socio-technical walkthrough (cf. [16] for a detailed description). The term walkthrough points to a step-by-step approach which takes place in collaborative workshops. Questions play an important role in guiding the modeling and the attention of participants. Additionally, workshops are repeated to further elaborate the results. Between workshops, changes to models are primarily made for aesthetic reasons in order to make models easier to perceive and understand. We will discuss this approach later as the mixed collaboration approach.

With the STWT the goal is to foster collaborative reflection and negotiation. The models are used as visible explications of knowledge, which has various facets:

  • Models are used as boundary objects [21] between different perspectives.

  • The models are a shared resource for reference.

  • Participants can see their own perspective in the context of the environment.

  • They can also see, understand and discuss consequences of personal behavior for others.

In previous applications the STWT has helped to foster an integrated discussion of technical and organizational aspects that leads to well-thought-through decisions. Decisions and changes, technical as well as organizational, then lead to changes in the respective area, so that collaborative reflection on the changes improves the design. In addition, the diversity of the participants’ experience resulted in enriched design decisions. Using the STWT and extending it with privacy by design aspects can therefore enrich discussions about design decisions and will allow system designers to relate to the future practice that can be integrated in the design process.

The next section gives a simple modeling example to create an impression of the models used. We have already proposed specific changes to the methodology [6] to adapt it to the needs of privacy by design, which we describe in the section after that.

4 Modeling an Example Process in SeeMe

We will use the design of a survey-based study by a university where participants are contacted by email and asked to use a web-based system to answer a short questionnaire as a practical example for our approach. Study designs like this have to take into account local privacy regulations and – depending on local practices – have to be approved by institutional review boards or data protection officers. A process model that reflects the necessary steps is shown in Fig. 1.

Fig. 1. SeeMe model of the survey process with added comments regarding privacy (Color figure online)

In order for a design artefact for a future system to be useful, it has to cover social and technical aspects at the same time and has to be easily understood by those involved in the design. It also has to be useful for those who later use it to develop software based on it and to carry out the organizational changes needed for the software to be used effectively. The SeeMe modeling notation is thus well suited for a task like this. It is capable of covering social and technical aspects of a process within the same visualization. SeeMe consists of only three basic elements and has proven to be easily understandable for stakeholders. Furthermore, SeeMe also allows for explicitly displaying vagueness. As mentioned earlier, this is crucial for depicting real life processes, since real life phenomena sometimes cannot and should not be expressed formally. At the same time SeeMe offers all constructs necessary to depict complex decisions and can thus be used as a basis for software development.

The example process model (Fig. 1) uses SeeMe. The process involves roles (depicted as red ellipses) like participant, researcher and research assistant who execute activities such as invite participants and remind participants (depicted as yellow rectangles with round corners). The assistants send out links to the survey with unique tokens, e.g. encoded within the URL, to a list of participants (an entity, depicted as a blue rectangle) created by the researchers. They also remind participants if codes were not used. When the time is up, the survey is closed and the assistants export the answers from the survey system as a CSV file and send it to the research group via email. This rather simple process of conducting a survey can raise various privacy-related issues such as protecting the identity of the participants or general questions about data handling within research groups. This model in particular depicts multiple occasions in which issues with respect to privacy and secure handling of PII can arise.
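To make the structure of such a model more tangible, the following minimal sketch represents the example process as plain data. It is not the SeeMe file format or the API of any SeeMe tool. The class names `Element` and `Model`, the `kind` labels and the flat `relations` list are assumptions made for illustration, and the relations shown are read from the textual description above rather than from Fig. 1 itself.

```python
from dataclasses import dataclass, field

# Element kinds used in the example: roles, activities and entities.
KINDS = ("role", "activity", "entity")


@dataclass(frozen=True)
class Element:
    name: str
    kind: str  # one of KINDS

    def __post_init__(self):
        if self.kind not in KINDS:
            raise ValueError(f"unknown element kind: {self.kind!r}")


@dataclass
class Model:
    elements: dict = field(default_factory=dict)   # name -> Element
    relations: list = field(default_factory=list)  # (source name, target name)

    def add(self, name, kind):
        self.elements[name] = Element(name, kind)
        return self.elements[name]

    def relate(self, source, target):
        # Only allow relations between elements that exist in the model.
        for name in (source, target):
            if name not in self.elements:
                raise KeyError(name)
        self.relations.append((source, target))

    def of_kind(self, kind):
        # Names of all elements of one kind, in insertion order.
        return [e.name for e in self.elements.values() if e.kind == kind]


def build_survey_model():
    """Hypothetical, simplified version of the survey process."""
    m = Model()
    for role in ("participant", "researcher", "research assistant"):
        m.add(role, "role")
    for activity in ("invite participants", "remind participants"):
        m.add(activity, "activity")
    m.add("list of participants", "entity")
    # Assumed relations: assistants execute both activities,
    # and both activities use the list of participants.
    m.relate("research assistant", "invite participants")
    m.relate("research assistant", "remind participants")
    m.relate("invite participants", "list of participants")
    m.relate("remind participants", "list of participants")
    return m
```

Calling `build_survey_model().of_kind("entity")` lists the elements a privacy expert would probably inspect first, since entities are the elements most likely to contain PII.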

Additional stakeholders that we omit in this example are third parties like the company providing the survey system or researchers from other institutions that would like to work with the raw data.

5 Adapting the Methodology to Privacy by Socio-Technical Design

As described above, models can play a central role in privacy by socio-technical design. In order to arrive at a privacy friendly system and corresponding organizational process those models have to cover both aspects. Our proposed approach especially intertwines phases of collaborative work in workshops with phases of asynchronous collaboration and reflection. We propose to involve privacy experts to review these models and add privacy related questions later on. The adapted models are distributed among workshop participants and other interested stakeholders who are asked to answer those questions by adding annotations. Those annotations subsequently serve as a basis for the next workshop to elaborate on the raised topics.

The STWT approach is based on the creation of process models in workshops in order to reflect multiple perspectives and aspects of the real work environment and the existing experience and practice. It is important to note that envisioned practice which is already documented (e.g. in Information Security Documentation) often differs from the real practice. It is crucial to understand the needs that lead to such differences. Facilitators can help to reflect the actual process in a process model. These facilitators guide workshops by asking how the participants conduct their work and what they do at a certain point in time. The contributions are integrated into the graphical process model right away. This model subsequently serves as a basis for discussion on potential improvements as well as on how the future system has to be designed to suit the work environment of current and future users. In order to arrive at a suitable design, the facilitator usually asks the users a set of predefined questions such as: “Where do you see issues with the current process?” or “What support do you need in order to fulfill your tasks?”.

Altering this approach in order to fit the context of privacy by design requires some changes to the STWT. While it is considered useful for privacy experts to participate in workshop sessions, other changes should integrate a privacy perspective, too. It is necessary to focus on potentially privacy relevant aspects of work processes, to include questions regarding privacy into the design phase and to respect modeling guidelines to achieve the required level of detail. In the following section we will describe an analysis of existing models to identify requirements for guidelines towards the goals of privacy by design.

Designing a suitable socio-technical system cannot happen solely within modeling workshops. Because social and technical aspects mutually influence each other, it is not possible to analyze in advance all potential effects of technology on a social system and vice versa. It is thus crucial to apply an evolutionary approach in which designs are created, tested and refined. In addition to the repeated workshops, we use a web-based editor as a means to access process models that have been created during workshops and to further discuss these models using annotations. The web editor also supports a question-based re-evaluation of a process that can be used to ask questions related to the data protection goals or common privacy patterns [25]. This enables non-privacy experts to evaluate common privacy practices and optimize a process before details are discussed with privacy experts in a subsequent workshop. Proposals of privacy experts might be complex, so the models need to be adapted prior to the following workshops.
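The question-based re-evaluation can be pictured as attaching goal-related questions to model elements and collecting those that nobody has answered yet. The sketch below is a hypothetical illustration and not the web editor's implementation: the catalogue, most of the question wording, the element names and both function names are assumptions.

```python
# Hypothetical question catalogue keyed by data protection goal.
# The wording is illustrative and not taken from the web editor.
QUESTIONS = {
    "unlinkability": "Is this information really necessary?",
    "transparency": "Can the data subject find out what is stored here?",
    "ability to intervene": "Can the data be corrected or deleted on request?",
    "confidentiality": "Who has access to this element?",
}


def questions_for(element_names, catalogue=QUESTIONS):
    """Return one (element, goal, question) triple per element and goal."""
    return [
        (name, goal, text)
        for name in element_names
        for goal, text in catalogue.items()
    ]


def open_questions(triples, annotations):
    """Keep the triples that no annotation has answered yet.

    `annotations` maps (element, goal) to the free-text answer that a
    participant added during the reflection phase.
    """
    return [
        t for t in triples
        if not annotations.get((t[0], t[1]), "").strip()
    ]


# Example with assumed element names from the survey process.
entities = ["list of participants", "CSV export"]
triples = questions_for(entities)
answers = {
    ("list of participants", "confidentiality"): "Research assistants only.",
}
remaining = open_questions(triples, answers)
```

The list `remaining` could then serve as the agenda for the next workshop, where the unanswered questions are discussed with the privacy experts.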

Such a participative process has positive effects on understanding and motivation when the processes are executed [26] and, in particular, increases the motivation for changes that would otherwise be perceived only as obstacles [20].

6 An Analysis of Existing Models of Work Practice

To get a better understanding of the extent to which our process-driven approach already covers privacy-relevant issues and which aspects of privacy are not dealt with, we analyzed the outcome of 10 previously held STWT workshop series. Over the years a changing team of process modelling experts has conducted workshops in a variety of domains ranging from logistics to insurance to health care and welfare. Each of the analyzed workshop series consists of 2 to 11 individual workshops, and both the domains and the developed models vary greatly in complexity. The analyzed workshops were conducted within one large organization. We asked two privacy experts to review the final SeeMe models and analyze them with respect to the data protection goals. The experts added comments to the process models addressing privacy problems that could emerge if the process was implemented as described. The comments mainly asked about access rights, retention and deletion of PII, and aspects missing from the models that are needed to determine whether privacy is affected.

6.1 Results

The privacy experts added 19 annotations in total. We categorized the questions and clustered them to get an overview of common issues. The 19 questions were added to 18 distinct elements. An overview of the corresponding element types is given in Table 1. We expected the high count of annotated entities, since entities in the SeeMe notation are used to model artifacts like systems or documents that are likely to contain PII.

Table 1. Type of element

The number of affected sub-elements ranged from 0 to 12, while a large majority of the annotations concerned elements without sub-elements. The comments indicate that this results from a lack of detail, as those elements often referred to subsystems that might handle PII but are usually omitted during the workshop phase and handled as black boxes (Table 2).

Table 2. Categorization of the added questions

The questions added most frequently referred to the minimization of data collection (e.g. “is this information really necessary?” or “are PII included?”) and were connected to elements where it was unclear what types of data were actually stored or processed. A number of organizational processes required employees to make notes about what happened, for example after experts assessed safety issues of working conditions and rooms. If not regulated, more information than necessary could be stored, describing potentially unsafe personal habits of specific employees and revealing them to everyone who could access the report. Thus the category data minimization is the most frequent one. An overview of how often a specific category was assigned to the questions is given in Table 3.

Table 3. Meta comments assigned to the added questions

As already outlined above the experts did not find obvious threats to privacy but addressed potential issues that would arise if PII were involved. To determine if this is the case the experts asked for missing detail.

Apart from their annotations, the privacy experts also stated that they did not find any aspects explicitly supporting transparency with regard to the data subject or the ability to intervene, e.g. if a data subject requests a copy of records or demands deletion.

7 Guiding Privacy by Design Modeling

As mentioned earlier, the walkthrough to create models is guided by questions that are oriented towards the work practice. From our analysis of previously created process models we know that an evaluation by privacy experts requires more detailed information than is currently captured in the modeling workshops.

To provide the necessary depth of detail a two-tiered approach is necessary. On the one hand, workshop organizers have to adhere to additional modeling guidelines and questions to reduce the vagueness of the models with respect to what types of data are collected. On the other hand, the participants can add more detail to specific model parts during the reflection phase.

For example, restricting the access to a distinct group is a common measure to handle PII securely [5, 10, 27]. To enable the privacy experts to evaluate the appropriateness of such restrictions, the system access rights must be modeled with little ambiguity. To achieve this the roles with access to the system should be captured in one super role that is connected to the system itself or directly to the activity using the system and not its parent activity.
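The modeling guideline above can be read as a simple structural check: a system's access rights are unambiguous only if a role is connected to the system itself or to the activity that directly uses it. The following sketch is one possible reading of that guideline under stated assumptions. The three dictionaries, the function name and the element names `conduct survey` and `mail client` are hypothetical and not part of SeeMe or its editor.

```python
def check_access_modeling(uses, parent, role_links):
    """Report systems whose access rights are modeled too far up.

    uses:       activity -> set of systems the activity uses
    parent:     activity -> parent activity (absent for top-level ones)
    role_links: element (system or activity) -> set of connected roles

    Returns (system, activity, roles found on ancestors) triples for every
    use of a system where neither the system nor the using activity has a
    role connected to it.
    """
    findings = []
    for activity in sorted(uses):
        for system in sorted(uses[activity]):
            if role_links.get(system) or role_links.get(activity):
                continue  # modeled directly, nothing to report
            # Walk up the hierarchy to see whether a role is attached
            # to a parent activity instead. `seen` guards against cycles.
            inherited, seen = set(), set()
            ancestor = parent.get(activity)
            while ancestor is not None and ancestor not in seen:
                seen.add(ancestor)
                inherited |= role_links.get(ancestor, set())
                ancestor = parent.get(ancestor)
            findings.append((system, activity, sorted(inherited)))
    return findings


# Assumed fragment of the survey process.
uses = {
    "export answers": {"survey system"},
    "invite participants": {"mail client"},
}
parent = {
    "export answers": "conduct survey",
    "invite participants": "conduct survey",
}
role_links = {
    "conduct survey": {"research assistant"},
    "mail client": {"research assistant"},
}
findings = check_access_modeling(uses, parent, role_links)
```

In this assumed fragment the survey system is reported, because the only role with access is attached to the parent activity, which leaves open who exactly may export the answers.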

As stated above, experts were unable to assess some of the process steps and systems included in the process model because details about what type of data is actually involved were not specified. The practitioners involved in the process know best which information is handled in each step of the process and know what is actually necessary to execute it. However, a discussion of the details of each part of the process during the workshop can lead to overly detailed models and overly long workshops. The more detailed a model gets, the harder it gets to grasp [28]. Also, focusing on tiny details can lead to major discussions that block the whole workshop. Therefore, participants should be asked to add additional details during the reflection phase, in which they individually work on the model and annotate it. The participants need to be guided through corresponding questions while annotating the model. When using our software, other stakeholders are able to view the annotations of others, so that dissent can be expressed and then resolved in the following workshop. The questions for detailing specific parts of the model can be assigned to individual participants so that all parts in question are covered.
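Assigning the open parts of a model to individual participants can be sketched as a small allocation step. The example below is an assumption about how such an assignment might work and not a description of our software. It presumes that each participant has stated which parts they can describe, and the function name, the balancing rule and the sample data are all hypothetical.

```python
def assign_questions(parts, knows):
    """Assign each model part to one participant who knows it.

    parts: list of model parts that need more detail
    knows: participant -> set of parts that participant can describe

    Returns (assignment, uncovered), where `assignment` maps each
    participant to the parts they should annotate and `uncovered` lists
    the parts nobody can describe.
    """
    assignment = {person: [] for person in knows}
    uncovered = []
    for part in parts:
        candidates = [p for p in knows if part in knows[p]]
        if not candidates:
            uncovered.append(part)
            continue
        # Prefer the candidate with the fewest parts so far; ties go to
        # whoever is listed first in `knows`.
        chosen = min(candidates, key=lambda p: len(assignment[p]))
        assignment[chosen].append(part)
    return assignment, uncovered


# Assumed parts and participants from the survey process.
parts = ["list of participants", "survey system", "CSV export", "reminder email"]
knows = {
    "researcher": {"list of participants", "CSV export"},
    "research assistant": {"list of participants", "survey system", "CSV export"},
}
assignment, uncovered = assign_questions(parts, knows)
```

Parts that end up in `uncovered` are candidates for the following workshop or for a dedicated sub-model, since no participant can detail them alone.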

This approach is only feasible if the participants know the systems or parts in question well and can explain them easily, e.g. a standard form. While supporting the introduction of an information security management system (ISMS) [29], we learned that dedicated models for systems and their corresponding interfaces are needed to keep an overview and to facilitate the creation of models that include these systems. These models can easily be provided through hyperlinks in SeeMe. Providing the sub-models on demand on the one hand meets the need to know the specific details of technical systems or other artifacts in order to review them, and on the other hand allows them to be omitted if they are currently not relevant for the discussion.

To support the guidance described above, we provide the following heuristics (Table 4):

Table 4. Overview of additional guidelines

8 Conclusion and Future Work

In this paper we described how privacy by design can be incorporated in established, collaborative methods for designing socio-technical systems. The methods need to be adapted to the goals of privacy by design. We extended the focus of modeling to specific topics needed for privacy by design. We suggest that privacy experts should take part in workshops where processes are modelled, and we propose a question-based evaluation of processes to enable non-privacy experts to avoid common privacy and security issues. The methods should bridge the gap between practitioners, who are experts in the work practice, and privacy experts, who know privacy by design patterns but have difficulty evaluating the (unintended) effects of their proposals on practice. Our early experience is promising with respect to this goal. The methods already prove to be useful for bridging the gap between practitioners and experts.

In our future work in this action research project, we aim to practically improve work practice in order to collect more experience. After including common privacy patterns in PbD plugins of the SeeMe web editor, we also aim to evaluate our approach in workshops with the data protection office of a university that handles cases like those described above. We will observe the hopefully converging market of privacy patterns so as to incorporate better design support with these patterns in mind.