1 Introduction

Throughout the last decades, numerous information visualization techniques and applications have appeared. These are generally highly interactive visual exploratory tools or methods aimed at allowing users to formulate better hypotheses and develop a deeper understanding of the underlying phenomena. Yet, these techniques tend to be complex, and thus adequate development methods are pivotal for them to effectively support users. In this scope, a proper evaluation of the tools, including the visualization techniques they provide, is crucial. However, how to evaluate visualization applications or techniques has been (and still is) a challenge in several ways [16], and numerous publications as well as several workshops have been devoted to discussing this topic (e.g. the BELIV workshop series).

A natural approach to this problem, although not without risks, was to adapt evaluation methods developed and applied in other fields. Indeed, this was the case with several usability evaluation methods widely used in Human-Computer Interaction (HCI), each fostering the detection of different types of problems and having different limitations, implying that evaluators should select and use the appropriate techniques that best fit the situation [7].

Using the taxonomy of usability evaluation methods by Dix et al. [8], we may divide them into analytical and empirical; while the latter involve users and tend to be more complex and onerous, there are widely used low-cost analytical evaluation methods capable of producing useful results with a low investment. Heuristic evaluation is such a method, possibly the most popular discount usability evaluation method [9], and has been adapted to evaluate Information Visualization tools and techniques by several authors [10–14]. We have previously used the method and believe that it may provide useful results with an interesting cost-benefit ratio [15]; still, some heuristics may be difficult to understand, hindering their application by less experienced evaluators, which suggests the need for a study on the comprehensibility and applicability of visualization-specific heuristics. This paper describes a first step towards this goal: a study involving 25 evaluators aimed at assessing how easy Nielsen's heuristics are to understand and apply in Information Visualization evaluation, as well as two sets of visualization-specific heuristics, the ones proposed by Zuk and Carpendale [10] and by Forsell and Johansson [11].

The remainder of the paper is organized as follows: Sect. 2 presents the method of heuristic evaluation and the three sets of heuristics used, Sect. 3 presents the example selected to be evaluated and the methodology used in the study, Sect. 4 presents and discusses the results, and some conclusions are drawn in Sect. 5.

2 Heuristic Evaluation

Heuristic evaluation is a widely used discount usability evaluation method that allows finding potential problems in a user interface [9]. As it is subjective, it should involve several evaluators, who inspect the interface concerning its compliance with a set of established usability principles (the "heuristics"). Non-compliant aspects should be compiled into a list of usability problems rated according to their severity, including possible suggestions on how to fix them. This list is supposed to help the development team prioritize the problems to tackle.
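As a purely illustrative sketch (the field names are our own; the 0-4 severity scale is the one commonly suggested by Nielsen), each entry of such a problem list might be structured as follows:

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class UsabilityProblem:
        """One entry in the problem list compiled during a heuristic evaluation."""
        description: str        # what was observed in the interface
        heuristics: List[str]   # heuristic(s) not complied with
        severity: int           # 0 (not a problem) .. 4 (usability catastrophe)
        suggested_fix: str = "" # optional suggestion for the development team

    # Example of an entry an evaluator might record:
    problem = UsabilityProblem(
        description="No feedback is shown while the map view is loading",
        heuristics=["Visibility of system status"],
        severity=3,
        suggested_fix="Display a progress indicator during loading",
    )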

According to Munzner [16], heuristic evaluation is an "immediate validation approach" that can be used at the visual encoding and interaction design level, the third level of the nested model for visualization design and validation proposed by this author. At this level, the threat is that the design does not convey the desired abstraction to the user. We believe heuristic evaluation can be most useful in the scope of an (iterative) user-centered development process of visualization applications and techniques since, if adequately employed, it is a pragmatic way to obtain valuable formative information quickly, inexpensively, and effectively. However, several issues must be carefully considered when applying this method [17], namely:

  • what heuristic set to use;

  • how well it represents the relevant aspects of the type of user interface under evaluation;

  • how to train evaluators to use the set of heuristics correctly;

  • whether they will be able to use it effectively to find problems;

  • and how many evaluators should be involved.

It is possible to use specific heuristics to evaluate specific types of products (e.g. groupware [18] or mobile applications [19]), or to consider a particular class of target users (such as seniors or children). Nonetheless, selecting a set of heuristics suited to a concrete situation is not easy, and a poor choice will influence the problems found, and consequently how many evaluators are needed and the quality of the resulting evaluation. Hence, careful consideration of the heuristic set to use, concerning the aspects mentioned above, is essential before applying heuristic evaluation. Moreover, the evaluators' experience in using the method and their understanding of the set of heuristics used are also relevant factors.
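Concerning the number of evaluators, a common point of reference is Nielsen and Landauer's model of problem discovery, which we include here for orientation only (the single-evaluator detection rate of roughly 0.31 they report was obtained for conventional user interfaces and need not hold when visualization-specific heuristics are used):

    Found(i) = N (1 − (1 − λ)^i)

where N is the total number of problems in the interface, λ is the proportion of problems found by a single evaluator, and i is the number of evaluators; the usual reading is that, under their assumptions, a small group of around five evaluators already uncovers most problems.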

Tory and Möller [20, 21] considered heuristic evaluation a useful expert review method to evaluate visualization systems, outlined how to conduct a heuristic evaluation, and advised the use of visualization-specific heuristics.

However, while a number of rules have been used for that purpose (e.g. in the works by Shneiderman; Ware; Amar and Stasko; Zuk and Carpendale; Forsell et al. [10, 11, 22–24]), their understandability and scope have not yet been fully assessed, and selecting a set of heuristics may not be a trivial task for a development team.

In this work we used the well-known Nielsen's ten usability heuristics [9] and two other sets developed specifically for Information Visualization, namely the ones proposed by Zuk and Carpendale [10] and by Forsell and Johansson [11], and tried to assess how easy they are to understand and use by evaluators having some, but not extensive, experience in evaluating visualization applications, a scenario we deem rather realistic, for instance in a company.

Nielsen's heuristics are general enough to be applicable to any kind of interactive product, and they may also have value in finding problems in Information Visualization, as usability issues are often associated with visualization problems. Nevertheless, developing heuristic sets that cover the most common problems in this type of application (namely encompassing issues related to visual representation, presentation, and the interaction with and manipulation of parameters) is important in order to fine-tune the method and reduce the risk of assuming too much when reusing the process of heuristic evaluation from usability [25]. This goal has been pursued by several authors, and the sets of heuristics selected for this study seem two interesting candidates for practical use.

2.1 Nielsen's Heuristics

As mentioned, this set of heuristics is very general, which makes it interesting for the developer's evaluation toolkit; nevertheless, that might also be a disadvantage, as it may not be completely adjusted to a specific situation. We decided to use it as a baseline against which to compare the understandability of, and the number of problems found with, the other heuristic sets, as our evaluators were familiar with this set and had previous experience in using it to evaluate interactive systems. Even though it is widely known, we include the list of 10 heuristics [9] for the sake of clarity and completeness:

  1. Visibility of system status

  2. Match between system and the real world

  3. User control and freedom

  4. Consistency and standards

  5. Error prevention

  6. Recognition rather than recall

  7. Flexibility and efficiency of use

  8. Aesthetic and minimalist design

  9. Help users recognize, diagnose, and recover from errors

  10. Help and documentation

As mentioned, these heuristics are general enough to be useful in evaluating any kind of interactive product; however, we expect them to help find problems mainly related to the interaction mechanisms provided, and not so much those related to the visual representation and presentation aspects that should also be assessed in any Information Visualization technique or application [26].

2.2 Zuk and Carpendale’s Heuristics

This set was compiled from the works of Bertin, Tufte [27], and Ware [23], specifically to evaluate the visual and cognitive aspects of visualization solutions:

  1. Ensure visual variable has sufficient length

  2. Don't expect reading order from color

  3. Color perception varies with size of items

  4. Local contrast affects color and gray perception

  5. Consider people with color blindness

  6. Pre-attentive benefits increase with field of view

  7. Quantitative assessment requires position or size variation

  8. Preserve data to graphic dimensionality

  9. Put the most data in the least space

  10. Remove the extraneous (ink)

  11. Consider Gestalt Laws

  12. Provide multiple levels of detail

  13. Integrate text wherever relevant

Detailed descriptions of all heuristics are available in the original paper [10], which also provides an analysis of eight examples of uncertainty visualization using this set. In a later work, Zuk et al. [25] performed a meta-analysis, based on a case study, aimed at understanding the issues involved in the selection and organization of heuristics.

2.3 Forsell and Johansson's Heuristics

This set was compiled empirically by Forsell and Johansson [11] to find common and important problems in Information Visualization techniques through heuristic evaluation. The method used by the authors was based on Nielsen's approach to developing the widely known ten usability heuristics [9]. The heuristics of six previously published sets, ranging from very specific low-level heuristics to very high-level ones (including Nielsen [9]; Shneiderman [22]; Freitas et al. [26]; Amar and Stasko [24]; and Zuk and Carpendale [10]), were used to analyze a number of problems derived from earlier evaluations, and the 10 heuristics that provided the highest explanatory coverage were selected to form the following new set:

  1. Information coding

  2. Minimal actions

  3. Flexibility

  4. Orientation and help

  5. Spatial organization

  6. Consistency

  7. Recognition rather than recall

  8. Prompting

  9. Remove the extraneous

  10. Data set reduction

According to the authors, the six heuristic sets considered cover important aspects, yet none seemed general enough to be used on its own for the evaluation of any Information Visualization technique. In contrast, the authors considered this new, empirically determined set, comprising the highest-ranked heuristics (according to the method used) from the considered sets, to have a significantly wider coverage than any of the previous ones.

Detailed descriptions of all heuristics can be found in the original paper [11], as well as the method used, and suggestions on how to improve and validate the reliability, usefulness and applicability of the derived set.

3 Experimental Data Set and Method

The exploratory study described in this paper aimed at better understanding how to use heuristic evaluation in the context of Information Visualization and encompassed two main phases, both performed with the collaboration of Information Visualization students of the MSc in Information Systems (University of Aveiro) during two academic years (2012-14).

In the first phase we asked 15 students to analyze a simple InfoVis example of their choice in a non-structured way (which we dubbed "naïve critique"), using only their judgment based on common sense and the experience acquired in their previous use of applications, websites, etc., and to list the potential problems they found. We provided an example and explained it in a lecture. Later in the semester, after having practiced the heuristic evaluation method with other visualization applications, the students evaluated the first example using heuristic evaluation with two of the three selected sets. Results of this exploratory phase suggested that, on the one hand, heuristic evaluation (irrespective of the heuristic set used) does help evaluators consider issues that they would otherwise have missed, as it fosters a more systematic inspection of the relevant aspects of the user interface. On the other hand, evaluators generally found that Nielsen's heuristics are less finely tuned to Information Visualization examples, as expected. Concerning the heuristics specifically developed for Information Visualization evaluation, participants had some difficulty interpreting and applying a few of them (e.g. numbers 1, 2, 6, and 10 of Zuk and Carpendale's set).

In the second phase, we selected a simple example (http://spotfire.tibco.com/en/demos/spotfire-soccer-2014) from the Spotfire gallery that includes interactive and coordinated visualizations of data from the soccer World Cups going back to Uruguay 1930. This example was chosen due to the concrete and easy-to-understand data set visualized; moreover, the experiment was performed in 2014, at a time when the FIFA World Cup Brazil had high media coverage. Thus, we anticipated this example would motivate our evaluators, fostering the discovery of a higher number of problems.

Figures 1 and 2 show the main aspects of the selected example, which allows access to:

1- Data corresponding to the selected country on a map, concerning a specific metric (goals for, goals against, etc.) (Fig. 1 – geographical overview);

2- Data corresponding to average goals scored, filtered by team, city, etc. (Fig. 2 – historical view).

Fig. 1. World cup soccer analysis (Spotfire demo gallery) - aspect of the geographical overview of the application.

Fig. 2. World cup soccer analysis (Spotfire demo gallery) - aspect of the historical view of the goals data.

Ten students of the Information Visualization course participated in the experiment as evaluators. They all had some previous experience with heuristic evaluation, having used Nielsen's heuristics to evaluate interactive systems; they had attended the majority of the course classes, performed and presented to the class a naïve critique of an example of their choice, attended a session on the two other heuristic sets, and performed a heuristic evaluation using Nielsen's heuristics and one of the visualization-specific sets. Therefore, we deem that, while not being experienced evaluators, the students already had enough experience to obtain useful results using the method and to provide valuable insight regarding the understandability of the heuristics.

The experiment consisted of evaluating the soccer example using heuristic evaluation with Nielsen's heuristics and one of the two other sets of heuristics (at their discretion), and answering two simple questionnaires.

The protocol involved the following steps:

1- Answer a questionnaire to collect data concerning the participants' experience in using heuristic evaluation, as well as their background in Information Visualization and familiarity with heuristics and guidelines used in Information Visualization (e.g. Bertin's principles, or Shneiderman's Information-Seeking Mantra);

2- Carefully analyze the three heuristic sets and rate the understandability of each heuristic on a Likert-like scale (1 = not at all understandable … 5 = very much understandable);

3- Perform a partial heuristic evaluation of the example using Nielsen's heuristics: find six interaction problems and classify each problem, recording the heuristic (or heuristics) not complied with;

4- Select one of the two other heuristic sets and perform a partial heuristic evaluation: find six problems related to visual aspects, and record the heuristic (or heuristics) not complied with.

The complete session had a maximum duration of 90 min, and the completion time of each step was recorded.

Throughout the experiment, participants had access to the Internet and were allowed to search for any information they needed. At the end of the experiment, there was an informal discussion with the participants concerning, among other issues, what was more difficult or easier in applying the method and the various heuristics to the example.

4 Results and Discussion of the Experiment

This section presents the main results regarding heuristics understandability, obtained through the questionnaire, and the number of problems found by the 10 evaluators using each list of heuristics, as well as a discussion of the most relevant findings.

4.1 Understandability of Heuristics

Figure 3 depicts the median understandability values of Nielsen's heuristics as rated by the 10 students. All heuristics were considered highly understandable (at least 4/5). This is probably due to the fact that all students were familiar with these heuristics (as confirmed by the answers to the first questionnaire); yet, heuristics 9 (Help users recognize, diagnose, and recover from errors) and 10 (Help and documentation) obtained the maximum value (5), which suggests that participants consider them particularly clear and easy to understand and apply. We took these results as a baseline for the understandability of the other heuristic sets, for this group of participants.

Fig. 3. Nielsen's heuristics - median values of understandability (1 = not at all understandable; 5 = very much understandable).
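For clarity, the medians shown in Figs. 3 and 4 are simply the per-heuristic medians of the evaluators' ratings; a minimal sketch of the computation (the ratings below are hypothetical placeholders, not the collected data):

    from statistics import median

    # ratings[h] holds the ten evaluators' 1-5 understandability ratings
    # for heuristic h; the values below are made up for illustration.
    ratings = {
        "Visibility of system status": [5, 4, 5, 4, 5, 5, 4, 5, 4, 5],
        "Help and documentation":      [5, 5, 5, 5, 5, 4, 5, 5, 5, 5],
    }

    for heuristic, values in ratings.items():
        print(f"{heuristic}: median = {median(values)}")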

The median understandability values of Zuk and Carpendale's heuristics are shown in Fig. 4. Most heuristics were considered very understandable; however, two were rated 3: 1- Ensure visual variable has sufficient length, and 7- Quantitative assessment requires position or size variation. Moreover, the understandability of heuristic number 1 was rated 1 by one evaluator, meaning it was completely incomprehensible to him. Confronted with this, the evaluator explained that the heuristic should be more specific about what a sufficient length is, since it is too vague and gives no clue on how to assess compliance with the rule. The same evaluator also rated another heuristic 1, and did not rate any heuristic 5, which suggests he might have been less familiar with this set of heuristics.

Fig. 4. Zuk and Carpendale's heuristics - median values of understandability (1 = not at all understandable; 5 = very much understandable).

4.2 Problems Found

Tables 1 and 2 show the results concerning the potential problems found by each evaluator using Nielsen's heuristics and the visualization-specific heuristics. They also show the number of problems considered correctly classified, and the time each evaluator took to find the problems.

Table 1. Problems found with Nielsen's heuristics by 10 evaluators: time spent, number of problems found, and number of problems correctly classified.
Table 2. Problems found with visualization-specific heuristics by 10 evaluators: heuristics used, time spent, number of problems found, and number of problems correctly classified.

Analyzing Table 1, we observe that the evaluators altogether found 50 potential problems (not all different; some problems were identified by several evaluators), and that all the problems were considered correctly classified regarding the heuristic not complied with. This is most probably due to the fact that all evaluators had previous experience in using heuristic evaluation with Nielsen's heuristics to evaluate interactive systems. We also notice that seven evaluators were able to detect 5 or 6 problems in a relatively short time (19 to 27 min), and that the evaluators taking more time were the ones reporting fewer potential problems, suggesting that these were the less experienced evaluators (a fact confirmed by analyzing their background and performance in the course).

Analyzing Table 2, we observe that eight evaluators chose to use the heuristics by Zuk and Carpendale and only two used the heuristics by Forsell and Johansson. Moreover, one evaluator (#6) was not able to apply the heuristics he had selected. Inspecting his answers to the questionnaire, we noticed that, unlike all other evaluators, he decided to use the heuristic set he had not previously practiced with. The other nine evaluators found altogether 35 potential problems (not all different; some problems were found by several evaluators). In contrast to what happened with Nielsen's heuristics, some of the problems (20 %) were considered incorrectly classified regarding the heuristic not complied with. The heuristics misused to classify these problems were Zuk and Carpendale's numbers 2 (Don't expect reading order from color), 3 (Color perception varies with size of items), 6 (Pre-attentive benefits increase with field of view), and 10 (Remove the extraneous). Analyzing the rates given by the evaluators who misused these heuristics, we noticed that most had rated the misused heuristic below 4 on the 1-to-5 understandability scale.

This suggests that, while all but one evaluator were capable of finding relevant potential problems using visualization-specific heuristics, they would need more practice to attain the same performance they had with Nielsen's heuristics.

As in the first phase of the study, evaluators generally agreed that Nielsen's heuristics are adequate to find interaction problems, even in InfoVis; however, the other two sets are preferable for evaluating visual aspects.

5 Conclusions and Future Work

We reviewed relevant issues involved in using heuristic evaluation in Information Visualization and performed an exploratory study to assess the understandability of the heuristics of three sets that have been used to evaluate interaction and visual aspects in InfoVis, and how difficult it is to find potential problems using heuristic evaluation with those sets. The first phase of this study, involving 15 evaluators with some experience in using this method, suggested that, irrespective of the heuristic set used, heuristic evaluation is useful as it fosters a more systematic inspection of the relevant aspects of the user interface.

In the second phase of the study, 10 evaluators with some experience in using heuristic evaluation in InfoVis rated the understandability of each heuristic and applied the method to an example. The results obtained suggest that heuristic evaluation is indeed suitable and produces useful results with a low investment, even when performed by not very experienced analysts, and hence it should be included in the developer's evaluation toolkit. We also found that some of the heuristics are easier to understand than others, and confirmed that Nielsen's heuristics seem adequate to evaluate interaction aspects even in InfoVis applications, and that Zuk and Carpendale's heuristics are useful to detect potential problems related to visual aspects.

Even though this study involved 25 evaluators and provided insights concerning the applicability of heuristic evaluation in Information Visualization, further research is needed to compare and validate the use of these heuristics, namely involving more InfoVis examples and evaluators with different degrees of experience.