1 Introduction

Nowadays, software-based systems (“systems”) are expected to be easy to learn and easy and enjoyable to use. A well-known development approach to achieve such qualities is user-centered design, and the involvement of users is one of its key elements [1,2,3].

The principles of the user-centered design process are an early focus on users and their tasks, empirical measurement, and iterative design [1]. There are different approaches to user involvement [3]: users are either involved exclusively in distinct phases (i.e., analysis, design, evaluation) or across the entire process [3].

User involvement has the following benefits [4]: improved system quality, avoidance of costly features the users do not need, improved system acceptance, and a greater understanding of the system.

However, in industrial environments, involving users in system development requires extra effort and budget; someone must pay for it. In such an environment, the question arises sooner rather than later: Does it work? Or, reworded: What are the effectiveness and the cost-benefit ratio of user involvement?

The goal of this UX case study was to design a configuration tool for interlocking hardware used in the railway domain. Since this is a rather complex domain, the UX design of such a configuration tool is not trivial. Therefore, a participatory design approach was chosen: users informed the UX team about the domain background and were asked to express their preferences among presented design options. The user participation was organized as weekly user group meetings. The participating users could either select one of the presented design options or create an alternative design option, usually based on one of the presented ones.

This case study looks for answers to three research questions, two regarding the effectiveness and one regarding the cost-benefit ratio of participatory design:

RQ1: How much did the participating users effectively influence the UX design? In other words, this question addresses the different view the participating users brought to the table. Consider two hypothetical extremes: if the participants always confirmed the design choice the UX team initially preferred, the influence would be very low; if the participants always rejected the initially preferred design choice, the influence would be very high. Note that the UX team's initially preferred design choice was not communicated to the participants.

RQ2: What was the impact of the participating users on efficiency improvements? One goal of the project was to make the new tool more efficient than the benchmark tool. The question is what impact the user group had on design decisions with low, medium, or high efficiency improvements.

RQ3: What was the return on investment (ROI) of the weekly group sessions? This question addresses the mentioned cost-benefit ratio for the effort of letting users participate in the design.

The questions were answered in a post-mortem analysis of the industrial project.

The paper is structured as follows. Section 2 summarizes related work and positions the research area of this case study. Section 3 introduces the application domain (an engineering tool for railway interlocking hardware) and the applied UX process. Section 4 describes the weekly user group sessions which facilitated the participatory design activities. Section 5 describes the methodology used to determine the effectiveness and the cost-benefit ratio of the weekly user group sessions. Section 6 summarizes the answers to the three research questions together with lessons learned. Section 7 concludes and outlines potential future research.

2 Related Work

There is a significant body of research regarding participatory design (PD) [6,7,8]. This paper focuses on user participation as the “behaviors or activities that the target users or their representatives perform in the system development process” [5, p. 59].

There are different levels of user participation [4]:

  • Informative (exchange of information with users)

  • Consultative (users comment on predefined services/options)

  • Participative (users influence decisions)

The way the weekly user group meetings were organized and facilitated addressed all three levels of user participation: participating users listened to domain-related questions and answered them (informative); users expressed which of the presented design options they preferred and why (consultative); and the users' preferred choices directly influenced the design direction (participative).

Another way to categorize PD research is to consider contingency variables, which are grouped into technical, managerial, and user behavioral attributes [7]. The user behavioral attribute has several contingency variables applicable to this case study: perceived ease of use and ease of use (addressing RQ1 and RQ2), and system impact (addressing RQ3).

Considering the UX process (see Fig. 1), the participating users are involved in the “Ideate and Select” phase to help find the most appropriate interaction design concept.

Fig. 1. User experience process

Several studies involve users with the intent to improve the quality of requirements [11, 12]. Those participation activities belong to the “Discover & Define” phase of the UX process. The requirements are an input to the interaction concept design.

In one study, users were involved to evaluate usage scenarios of a smart card system; the evaluation took place with a questionnaire [9]. Scenarios are a result of the “Discover & Define” phase. They are not interaction design concepts, but they are one prerequisite for the design.

In another publication, seven PD studies were analyzed [10]. These studies focused on how the elicitation of user needs can be improved. User needs are a result of the “Discover & Define” phase.

A main difference between the study described in this paper and many other published studies seems to be the “black-box” vs. “white-box” perspective. Many of the above studies look at the effectiveness of PD from a “black-box” perspective: they measure the impact of PD, applied to some project artifact or activity, on the project outcome, e.g., on a subjective rating or on project success. This study, in contrast, takes a “white-box” approach to PD: it looks into the internal mechanisms of a specific participatory design activity and measures the direct impact on key interim results, which in turn have a major impact on the project outcome. However, this study does not consider the final outcome of the project.

So far, the authors could not find a study which tracks the effectiveness of participatory design at the level of single design decisions, their impact, and the return on investment.

3 Project Background

3.1 Application Domain: Railway Interlocking Hardware Configuration Tool

Railway signals, switches, and level crossings control the direction and speed of trains to ensure safe train rides. Such railroad safety equipment is controlled by interlocking hardware which is installed in the field along the railroad tracks. The interlocking hardware needs to be configured for specific track layouts. An application engineer (the user) configures the interlocking hardware with an application engineering tool (the “tool”). An industrial project had the objective of developing such a tool.

The configuration tool for interlocking hardware supports the following tasks:

  • Task 1: Setup project

  • Task 2: Configure system

  • Task 3: Define logic

  • Task 4: Compile logic

The intent of the first task is to enter the project parameters. For instance, the railroad which is going to use the interlocking hardware is identified. In addition, the location where the interlocking hardware will be used and the name of the application engineer are defined, and the number and types of chassis are selected. The result of the first task is the basic project infrastructure.

The intent of the second task is the configuration of the hardware (chassis and cards) and the software (e.g., network settings). The application engineer assigns hardware cards (e.g., lamp input card, lamp output card) to the chassis slots. When all cards are assigned, the system parameters are configured. Afterwards, the parameters for each chassis card are configured.

When the system and the cards are configured, the application engineer defines the application logic (task 3). This logic consists of complex Boolean equations, expressed in ladder logic [13] or relay logic [14], which determine under which conditions which devices (e.g., switches, signals, level crossings) are set to which state. For instance, one equation determines under which conditions a red signal light should be turned on or off. In a typical project, hundreds or even thousands of such equations are defined.
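To give a flavor of what such an equation expresses, the following is a purely illustrative sketch written in plain Python rather than in ladder or relay logic; the signal and track conditions are invented for illustration and are not taken from the project:

```python
# Purely illustrative: a hypothetical interlocking equation written in plain
# Python instead of ladder or relay logic. All names are invented.

def red_signal_on(track_occupied: bool,
                  switch_locked: bool,
                  route_cleared: bool) -> bool:
    """Determine whether the red signal light is turned on."""
    # Show red if the track ahead is occupied, the switch is not locked
    # in position, or no route has been cleared for the train.
    return track_occupied or not switch_locked or not route_cleared

# An occupied track ahead forces the signal to red:
assert red_signal_on(track_occupied=True, switch_locked=True, route_cleared=True)
```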

After all equations are specified, the application engineer compiles the logic (task 4). Assuming that all errors are fixed, the result is a report and an executable file which can be uploaded to the interlocking hardware in the field.

The four tasks require different efforts for an average project. Most of the effort (about 89%) is needed for task 3, task 2 requires about 10%, and the remaining 1% is distributed across tasks 1 and 4.

Depending on the complexity, an application engineering project can last from several days to several months.

The application engineering tool most widely adopted in the industry served as our benchmark (the “benchmark tool”) for the project. We articulated the goal that the UX quality of our new tool should make it even more efficient and easier to use for the users.

3.2 Applied User Experience Process

To achieve high user experience quality, a user experience team was tasked with the UX design of the new tool. It applied a user-centered design approach (see Fig. 1).

Discover & Define:

In an initial “Value and Scope” phase, the UX team understands the business case of the product under design and how UX can support it, and outlines the initial UX scope. Afterwards, the UX team gains an understanding of the involved user roles and their user profiles (e.g., user tasks, user needs, user characteristics), the use environments (e.g., spatial, workflow, social), and relevant good and bad practices (gains and pains). Known constraints (i.e., business, technology, design, and regulatory) are identified. The UX insights are consolidated in a step called “UX Essence”, which includes UX goals, optimization use cases, and the UX quality and quantity criteria used to assess explored interaction design concepts.

Ideate and Select:

The UX team creates several interaction design concepts and assesses them against the established UX quality and quantity criteria. If the UX criteria are not met, further interaction design concept options are created. The UX team, together with the UX stakeholders, finally settles on an interaction design concept which ideally meets all the defined UX criteria. The interaction design concept is presented as wireframes and screen flows to demonstrate how it supports the optimization use cases and how it meets the qualitative and quantitative UX criteria.

Design and Refine:

The UX team refines the selected interaction design concept, adds missing details, and creates the visual design. Users are involved to evaluate the designs, and the UX designers refine the design according to the users' feedback.

Develop and Deploy:

The UX team optionally creates a specification or another kind of document as input for the front-end development, and implements the front-end. The implementation can be a prototype or a product-quality front-end. The UX team may evaluate the implemented prototype/front-end (e.g., with usability tests) and refine the design afterwards to address the findings.

Some additional explanations about the outlined UX process:

  1. Representatives from identified user groups and project stakeholders (e.g., product manager, project manager, front- and back-end developers) are involved in all phases, so close feedback loops happen along the way. If users or project stakeholders express concerns, the process may go a step back, e.g., from “Ideate and Select” to “Discover & Define”, or from “Design and Refine” to “Ideate and Select”. These loops are not displayed in Fig. 1.

  2. The outlined UX process can be applied to an entire UX framework (e.g., an engineering tool) as well as to a single UX element (e.g., a find-and-replace widget). For an entire UX framework, more time is necessary for each phase than when the process is applied to a single UX element.

  3. The outlined UX process can be applied to agile, waterfall, or hybrid (“wagile”) product development approaches. It is critical that the first phases (“Discover & Define”, “Ideate and Select”, and “Design and Refine”) are performed before the UX results are implemented.

  4. The workshops discussed in this paper took place during the “Ideate and Select” phase. For a given use case, the participants looked at different interaction design concepts and expressed their preference, and the reasons for it, for one of the presented concepts, or they expressed their preference for a given concept with additional change requests.

4 Weekly User Group Sessions

Since the industrial domain was new to the UX team, it asked the product manager to select a small group of users for weekly user group sessions. The overall intent was to front-load the UX design process as much as possible to avoid late and costly changes, e.g., due to late findings from prototype-based or product-based usability tests. Given the complexity of the domain, the UX team wanted to establish a communication channel to people who are both domain experts and users, so any kind of question could be answered in a timely manner.

The group was set up for two reasons:

  1. to elicit missing domain knowledge, and

  2. to gather feedback on proposed design options.

The UX team articulated the following selection criteria for the members of this user group:

  • Domain knowledge and current user of an interlocking configuration tool

  • With different application engineering backgrounds (e.g. freight, commuter)

  • Open to new ideas, also from other people

  • Team player (no big egos)

  • Availability for one hour per week

  • Interest in contributing to the development of a new tool

The product manager selected six individuals who met the selection criteria. The weekly user group sessions had a duration of one hour; on average, five users attended. In addition to the five users, the head of the front-end development team attended almost every week, and the product manager attended some of the meetings. The UX team lead facilitated the weekly user group sessions, and each session had a specific UX topic.

There were two types of questions. Only in a few cases did the UX team ask for additional domain information. In most cases, the UX team asked the users to express their preferences among several interaction design options, presented as wireframes (an example is displayed in Fig. 2).

Fig. 2. Example of prepared material for weekly user group meetings

The UX team presented the material to the user group first. The design material was presented on PowerPoint slides. Most concepts were presented with several options; an option was presented as a single wireframe or a sequence of wireframes. The UX team introduced the options and explained the reasons for each option and what distinguishes them. The group was asked to express which option they preferred and why. The group was briefed that they could also adjust presented options (“I like option 2, and would add this or that”) or combine options (“I like option 2 with part x of option 3”). Afterwards, each group member explained which option they preferred and why. The UX team took notes directly on the slide, so the user group members could see what the UX team understood and respond to it when needed. After every user had shared their preferences and reasons, the group selected one of the presented options; it also happened that the group selected a combination of several options with additional changes. The decision was noted on the slide.

For this paper, it is important to note that the UX team itself had selected a preferred option out of the set of options. The UX team did not share its own preference with the user group. In the remainder of this paper, the option preferred by the UX team is called the “initially preferred option”. In some cases, the user group selected the “initially preferred option” (called the “same” option); in other cases, the user group selected another option (called a “different” option).

5 Methods

To answer the three research questions, we need methods which allow us to measure the effectiveness of the weekly user group sessions (answering RQ1), to measure the efficiency improvements of the decisions made (answering RQ2), and to determine the return on investment (answering RQ3). All methods are described below.

5.1 Measure Effectiveness of Weekly User Group Sessions

To measure the effectiveness of the weekly user group meetings, all 64 design decisions were analyzed. For each design decision, it was determined whether the user group selected a different option than the one initially preferred by the UX team (“different”) or the same option (“same”).

5.2 Measure Impact on Efficiency Improvements of Weekly User Group Sessions

Since efficiency improvement was one of the UX goals, the efficiency of the benchmark tool was compared with the efficiency of the new tool on a use case basis. For this purpose, an average interlocking hardware configuration project was defined (including the number and types of chassis and cards, the number of equations, etc.).

To perform this average interlocking hardware project, 26 use cases were identified, and it was determined how often each use case needs to be performed (“frequency”) with the benchmark tool and with the new tool under design. In addition, it was determined how many interaction steps it takes to perform each use case once (“interaction steps for a single use case”) with the benchmark tool and with the new tool under design. For the benchmark tool, we counted the actual number of mouse clicks; for the new tool under design, we counted the expected number of mouse clicks based on the interaction design concepts (wireframes). By multiplying the frequency of a use case with its interaction steps, the total number of interaction steps per use case could be calculated (see Table 1).

Table 1. Determination of efficiency improvements per use case

It was now possible to compare the total number of interaction steps per use case of the benchmark tool with the total number of interaction steps of the new tool under design. The efficiency increase was categorized with the following schema:

  • “Low”: If the efficiency increase of the new tool was equal to or less than 33%, compared to the benchmark tool.

  • “Medium”: If the efficiency increase of the new tool was more than 33% and equal to or less than 66%.

  • “High”: If the efficiency increase of the new tool was more than 66%, compared to the benchmark tool.

Examples of such calculations are shown in Table 1.

Example calculation for use case 1: With the benchmark tool, use case 1 is performed 10 times and requires 5 interaction steps each time, i.e., 50 interaction steps in total (10 * 5 = 50). With the new design, use case 1 is also performed 10 times but takes only 4 interaction steps each time, i.e., 40 interaction steps in total (10 * 4 = 40). This means the new design is 10 steps (50 - 40) more efficient than the benchmark tool, an efficiency increase of 20% ((50 - 40)/50).
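The calculation and the categorization schema above can be summarized in a short sketch; this is an illustration of the described method, not project code, and it reproduces the worked example for use case 1:

```python
# A minimal sketch of the per-use-case efficiency calculation; the numbers
# reproduce the worked example for use case 1 above.

def efficiency_increase(freq_old: int, steps_old: int,
                        freq_new: int, steps_new: int) -> float:
    """Relative reduction in total interaction steps vs. the benchmark tool."""
    total_old = freq_old * steps_old  # benchmark tool
    total_new = freq_new * steps_new  # new tool under design
    return (total_old - total_new) / total_old

def category(increase: float) -> str:
    """Map an efficiency increase to the low/medium/high schema above."""
    if increase <= 0.33:
        return "low"
    if increase <= 0.66:
        return "medium"
    return "high"

inc = efficiency_increase(freq_old=10, steps_old=5, freq_new=10, steps_new=4)
print(f"{inc:.0%} -> {category(inc)}")  # prints: 20% -> low
```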

Each design decision made in the weekly user group was mapped to one of the 26 use cases. The mapping was determined by checking which design decision supports which of the 26 use cases.

In addition, for each design decision, it was checked whether the user group had selected the “same” or a “different” design option.

By mapping a design decision to a use case, each design decision inherited the efficiency category “high”, “medium”, or “low” from the mapped use case (Table 2).

Table 2. Mapping of efficiency category and decision category (“same”/“different”) to each design decision
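As an illustration of this mapping step, the following sketch shows how a decision inherits its category; all use case and decision IDs are hypothetical, not the project's real data:

```python
# Hedged sketch of the mapping step: each design decision inherits the
# efficiency category of the use case it supports. All IDs are invented.
from collections import Counter

use_case_category = {"UC01": "high", "UC02": "medium", "UC03": "low"}

# design decision -> (supported use case, "same" or "different" choice)
decisions = {
    "D01": ("UC01", "same"),
    "D02": ("UC02", "different"),
    "D03": ("UC03", "same"),
}

# Tally decisions per (efficiency category, choice), as in Tables 2 and 3.
tally = Counter(
    (use_case_category[uc], choice) for uc, choice in decisions.values()
)
print(tally)  # e.g. Counter({('high', 'same'): 1, ...})
```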

5.3 Measure Return on Investment of Weekly User Group Sessions

To calculate the return on investment (ROI), the following formula was used:

$$ ROI = \frac{Gain\;from\;investment - Cost\;of\;investment}{Cost\;of\;investment} $$

To calculate the cost of investment, the effort for planning and holding the weekly user group meetings was determined. We counted the number of hours needed for planning the weekly user group sessions; this consisted mostly of the UX team preparing the workshop material. For holding the workshops, we counted the number of people participating in each workshop and multiplied it by one hour.

To determine the gain from investment, only the effort for potentially changing the design late in the project was considered. Therefore, only “different” design decisions entered the calculation, meaning decisions where the weekly user group selected a design option different from the one initially preferred by the UX team. The estimate covered the cost of a design change, performed by the UX team, and of an implementation change, performed by the front-end development team. The lead of the UX team estimated the effort for designing a different concept (wireframe, visual design, style guide); the lead of the front-end development team estimated the cost of implementing that concept.

6 Results

When we apply the three methods to the weekly user group sessions and their outcomes, we get the following answers to the research questions.

6.1 RQ1: How Much Did the Participating Users Effectively Influence the UX Design?

In 18 weekly user group sessions, the user group made 64 design decisions. Out of these 64 decisions, 40 (62%) were identical with the option initially preferred by the UX team (“same”), and 24 (38%) differed from the initially preferred design option (“different”).

6.2 RQ2: What Was the Impact of the Participating Users on Efficiency Improvements?

Out of the 64 design decisions, 7 were assigned to “high” efficiency improvements, 23 to “medium” and 34 to “low”.

All 7 “high” efficiency improvements were “same” decisions. In other words, the user group did not select a different option for any design decision supporting “high” efficiency improvements.

Out of the 23 “medium” design decisions, 10 were “different” and 13 were the “same”.

Out of 34 “low” design decisions, 14 were “different” and 20 were the “same”. Table 3 summarizes the results.

Table 3. Distribution of design decisions across efficiency categories

6.3 RQ3: What Was the Return-On-Investment (ROI) for the Weekly Group Sessions?

To calculate the ROI, we need to determine the cost of investment and the gain from investment.

Calculating the Cost of Investment.

The investment for preparing and conducting the weekly user group sessions was:

Effort for preparing the 18 weekly user group sessions:

  • 3 person-hours per one-hour session

  • Total: 18 sessions * 3 person-hours per session = 54 person-hours

Effort for conducting the weekly user group sessions:

  • 15 one-hour user group meetings with 5 people each: 75 person-hours

  • 1 three-hour user group meeting with 5 people: 15 person-hours

  • Total: 75 person-hours + 15 person-hours = 90 person-hours

Grand total: 54 person-hours + 90 person-hours = 144 person-hours

Calculating the Gain from Investment.

The gains comprise the avoided late and costly changes. This includes the effort for the design and implementation of all “different” design decisions (where the user group selected a design option which was not the one initially preferred by the UX team).

Effort for designing the “different” UX designs: 98 person-hours

Effort for implementing the “different” UX designs: 63.5 person-hours

Total: 161.5 person-hours

Calculating the ROI.

We can now calculate the return on investment:

$$ ROI = \frac{161.5\;\text{person-hours} - 144\;\text{person-hours}}{144\;\text{person-hours}} = 0.12 = 12\% $$
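As a cross-check, the ROI can be recomputed from the person-hour figures reported above; this is a simple illustration of the arithmetic, not project tooling:

```python
# Re-computing the reported ROI from the person-hour figures above.

cost_preparation = 18 * 3                # 54 person-hours
cost_sessions = 15 * 5 * 1 + 1 * 5 * 3   # 75 + 15 = 90 person-hours
cost = cost_preparation + cost_sessions  # 144 person-hours

gain = 98 + 63.5  # avoided design + implementation rework, person-hours

roi = (gain - cost) / cost
print(f"ROI = {roi:.2f} = {roi:.0%}")    # ROI = 0.12 = 12%
```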

6.4 Lessons Learned

Beside the quantitative results, there are some other lessons learned from the weekly user group meetings.

The user representatives reported that it was exciting for them to be part of the design process. Most of them had never experienced such a process before. They enjoyed seeing how the design evolved (from “zero” to “hero”) and how much thought goes into the design process. By contributing their ideas, they developed a strong sense of ownership of the tool under development and the new design.

The product manager appreciated the approach of “fail fast, correct fast”: because of the weekly user group meetings, the project did not work with untested design assumptions for long but evaluated each design option in a timely manner. It is a trust-building activity which saves money down the road. The product manager also appreciated that the members of the user group started to talk positively about the user group and the new tool with their peers; they advertised not only the tool but also the project and the process.

The front-end development lead learned the users' needs and how they prefer to perform certain use cases first hand. The weekly user group sessions equipped him with the knowledge to make the right development decisions down the road. In addition to learning about the needs, the front-end development lead could identify missing requirements early in the project. This avoided change requests late in the project and reduced project costs and delays.

Because of the weekly user groups, the UX team was always certain that it had a strong foundation for the design. Everything was evaluated quickly and with a rationale, which made the design explainable and defensible to other project stakeholders. For that reason, the UX team gained trust from the user groups and the project sponsor; this trust should not be underestimated in a technology-focused organizational environment. Another benefit of the weekly user group sessions was the established communication channel: the UX team could ask almost any question at almost any time. As mentioned earlier, the UX team mostly asked the user group to provide feedback on explored design options, but it also used some of the meetings for domain-related questions (“Could you please explain this to us?”). In some cases, the UX team could approach individual members of the user group who specialize in certain topics for further explanation and background information. Finally, the UX team became fully integrated into the project and development team.

One challenge in such group meetings is that everyone should be heard. Due to the selection criteria, strongly opinionated individuals were not present. However, depending on the topic and due to human nature, some individuals are more vocal than others. To guarantee that every single user was heard, we established the “go-around” method: we went around and asked every single user individually for his/her preference and the reason behind it (“Darren, which option do you prefer?”). All other users could hear what everyone preferred. We changed the sequence of the go-around frequently. The method ensured that we collected the preferences of all users systematically for each design decision, and all users knew we wanted to hear their preference and the reason behind it.

7 Summary and Conclusion

Due to demands from the business to clarify the effectiveness, the impact of design decisions on efficiency improvements, and the cost-benefit ratio of weekly user groups, we analyzed the effectiveness and cost-benefit ratio of the weekly user group sessions in a post-mortem of the industrial project. The paper articulated the following research questions and provides answers to them:

  • RQ1: How much did the participating users effectively influence the UX design?

In 24 (38%) out of 64 design decisions, the user group selected a different design option than the one the UX team initially preferred. How should this number be interpreted? We have not found a comparable number in published research. The number justifies user group sessions and indicates how many of the UX team's first design choices are sustainable. It is most likely influenced by the amount of domain knowledge of the UX team and the amount of experience of the users themselves.

  • RQ2: What was the impact of the participating users on efficiency improvements?

The users had different preferences about design options, which helped to make the design more efficient in the “low” and “medium” range. The users confirmed all design options with a “high” efficiency improvement. One could conclude that users do not need to be involved when it comes to high efficiency improvements. That is probably not a good idea: an increase in efficiency means that some functions are automated and performed by the machine, no longer triggered by the user. This means the user loses control. In some cases this is acceptable; in other cases it is not. Following the principle of “control over efficiency”, users should be involved in providing feedback on design concepts which significantly increase efficiency. Otherwise, a new design may achieve higher efficiency but not be accepted by the users. The weekly user group sessions helped to address this challenge.

  • RQ3: What was the return-on-investment (ROI) for the weekly group sessions?

The return on investment is slightly positive (12%), which indicates that weekly user groups are cost-effective. One of the big benefits is the early detection of deviations, which avoids unplanned rework or, due to time constraints, insights never making it into the development process. The ROI provides an initial cost justification for weekly user group meetings.

Overall, weekly user group meetings can be recommended as a tool for early user participation. They are particularly useful for complex domains and when the UX team is not very familiar with the domain.

Future work can check the “same”/“different” ratio in other domains. It might also be interesting to consider the UX team's knowledge of a given domain; it is expected that the “same” ratio goes up the more the UX team knows about the domain. A study with a larger number of user group meetings would be beneficial. Another extension would be to compare the measured UX quality of the final product with the decisions made in the weekly user group meetings, to determine how well those decisions predict the final UX quality.