1 Introduction

Nowadays, eye tracking methods are widely used in many disciplines such as human-computer interaction, UX or marketing. Eye tracking is a tool aimed at capturing eye movements on mobile or static systems. During our research, we used a static eye tracker mounted under a screen. According to Just and Carpenter’s eye-mind hypothesis [1], what an individual is looking at indicates what he is thinking about, with no gap between the fixation point and the cognitive task. From this point of view, eye tracking offers considerable potential. However, as Hyökki says, the metrics that can be studied through eye tracking make it possible to know what the individual is looking at, as well as other factual elements, without answering the question we most often ask: “the question why” [2]. To obtain answers to this question, qualitative methodologies have been developed. The eye tracking retrospective think aloud is one of the best-known qualitative methods in the field. While the method has proven its worth in usability studies of digital objects, it has weaknesses that lead to participants’ distraction or silence, which is not conducive to exploring personal characteristics, experience or habitus. Yet, as far as our study is concerned, this “why question” focuses on the individual and on his habitus, personal characteristics and emotions. It is also close to a “how question”. Thus, the purpose of this paper is to present the methodology used in our research on data visualization, with a particular focus on the qualitative step developed for the eye tracking tests conducted exclusively for this study.

Our research consists in studying how people make sense of data visualizations, while evaluating the influence of data visualization embellishment on this sense making. This study serves as the backdrop to this paper: we will focus more on the method used than on the results. Thus, we will explain how the individual self-reported appreciations collected during the eye tracking test can help the researcher in a semi-directive interview. We propose a method that provides more structure to this qualitative phase, which is initiated by the participant and controlled by the researcher. It applies exclusively to eye tracking studies that address participants’ habitus regarding the appropriation of the studied object.

First, we will briefly present the research and its context. We will then discuss the methodology used by describing the different steps of the experimental protocol followed. Finally, we will discuss the added value of including a qualitative method in the experimental protocol before concluding.

2 Research Context

Our research concerns the sense making performed by users of data visualizations. Beyond that, we examine the influence of data visualization embellishment on sense making. After finding that different studies on embellished data visualizations [3,4,5,6] did not give a clear definition of “embellishment”, we chose to define the term, based on these studies and our own observations. Embellishment appears in a context of media and technological evolution: the use of infographics in the media, and of visualized information more generally, is growing rapidly. Embellishment is an aesthetic contribution to a standard form of data visualization (which some call a “raw chart” [5]). This contribution can consist of a pictorial, metaphorical or metamorphic treatment of the pixels (or “ink”) relating to the data on the graph. It is undoubtedly part of the design process, without necessarily being a separate step. Thus, data visualization embellishment can be regarded as a visualization technique, without our taking a position on its utility, advantages or defects. It would seem, then, that embellishment covers aesthetic criteria that can evoke emotion.

The concept of embellishment obviously does not correspond to the design principles of data visualizations advocated by the great theorists of information visualization, Bertin and Tufte. The concept of effectiveness, described in the same way by the two authors (see footnote 1) albeit in somewhat different words, is quite functional: it requires the fastest possible understanding, in a small space and using the least “ink” (pixels) possible [7, 8]. Their recommendations are minimalist in style: a chart, to be as efficient and understandable as possible, must be very simple from an aesthetic point of view. Many of their principles converge on this simplicity of style, which according to them carries meaning [7,8,9,10]. Several authors question this. For example, Inbar, Hill and their colleagues question the principle of “data-ink ratio maximization”, according to which the ink dedicated to the data on a graph must be maximized in a non-redundant way [11, 12]. The arguments of Bertin and Tufte make sense, but they are mainly based on the human perceptual system. Yet sense making does not happen only through the perceptual system: we could extend the notion of efficiency to the users’ habitus.

Thus, a person’s sense making at the time of reading a data visualization can cover multiple aspects. We study two in particular.

  1. The person can make sense of and understand a visual representation through his cognitive activity and through his perceptual system. Moreover, the whole point of data visualization is to lighten cognitive work [11]. Does the arrangement of visual elements thus promote visual and perceptual appropriation of information?

  2. From a constructivist and cognitivist perspective, information is understood and interpreted from the point of view of the individual, where each person depicts his or her own reality [12]. Sense making is then a process resulting from the knowledge and other subjective characteristics of people, such as their experiences, environment, opinions, etc. [13]. We can call this “the habitus”. Looking at this, how does a person make sense of information?

These two aspects are reminiscent of Dervin’s definition of sense making, according to which it is both an internal behavior, namely cognitive, and an external one, where the person acts in space and time according to his/her experience, which makes sense making a communicative process. In addition, in relation to the second aspect, Kennedy and her colleagues distinguished two groups of factors that affect a person’s engagement with a visualization: (1) human, social, and visualization-specific factors; (2) the emotions generated by the visualizations in the person’s mind [14, 15].

In relation to all these elements, our research question is: “How do we make sense of data visualization and what influence does data visualization embellishment have on it?”. To answer this question, it is crucial to develop a methodology that takes into account the different aspects of sense making. Regarding visual and cognitive reception, an eye tracking experiment is relevant; in our case, it went hand-in-hand with the study of the influence of participants’ habitus on the appropriation of data visualizations. We thus integrated a qualitative analysis step into the experiments.

3 Methodology Development and Experimental Protocol Design

3.1 Preliminary Choices in Experimental Design

Laboratory Conditions

Above all, it should be noted that we conducted the experiments under laboratory conditions. We invited each participant to the multi-room usability lab (see footnote 2). The experimentation computer was controlled from the management room, so the researcher and the participant were not in the same room at the time of the eye tracking test. Likewise, a third room reserved for brainstorming or debriefing activities was ready for the interviews. This is an advantage, as it allowed participants to move clearly to the second stage of the experiment. Our study, which consists of evaluating the influence of data visualization embellishment on the users’ sense making, did not need to be conducted in context. Indeed, we wanted to present to the participants different data visualizations whose constitutive variables were under control (more on this later). This setting also allowed us to present a significant number of data visualizations to the participants.

The Choice of Participants Sample

The study was conducted on a sample of 40 people working in digital communication or in a related field. Our approach being qualitative, we carried out purposive sampling [15]. We did not take age and other demographics into consideration, believing that, by virtue of their profession, the people in our sample at least occasionally see data visualizations during their free time or professional activity. The results of this study therefore apply only to digital communication professionals.

3.2 The Eye Tracking Test

Technology Used

We used the Tobii Pro X3-120. This eye tracker is a wide bar that easily attaches to the bottom edge of the screen, making it non-intrusive. The tool allows tests to be launched in a controlled environment, with a computer entirely dedicated to the collection of gaze data. The sampling frequency is 120 Hz, i.e. one gaze sample roughly every 8.3 ms, which supports reliable data collection.

The Choice of Test Corpus

For a few weeks, we looked on the web, in the media, on blogs and so on for static data visualizations that did not respect design principles. It was not difficult to find such images. Indeed, rather than creating prototypes of embellished visualizations ourselves, we chose to work from real cases. We subsequently created data visualizations that are “correct” with regard to design principles, based on the same data and the same subject. In short, for each original, embellished image, we created the standard visualization, tending towards what Bertin and Tufte advocate. In addition, we chose to stick to the bar chart because it is the most easily perceived type of data visualization: it requires only a basic perceptual task [17]. Choosing it allowed us to focus on one particular element, but also to create standard visuals that are accessible to our entire sample. The most important variable of the corpus is chart embellishment. Thus, the standard visualizations that we made ourselves are all bar charts.

We did not present dynamic or interactive visualizations to participants but only static data visualizations. The human and emotional factors identified by Kennedy and her colleagues influenced both the choice of visualizations and some minor transformations we made (deleting the source, translating into the mother tongue of the participants, etc.). The entire corpus is composed of 40 images: 20 embellished visualizations from the media and 20 corresponding standard visualizations, created by us.

Conduct of the Experiments

Once the participant had completed the calibration phase of the device, necessary in any eye tracking study [18], we presented him with 20 data visualizations. We asked him to read each one before moving on to the next. After each visualization, the participant had to give five self-reported appreciations on Likert scales, from “strongly disagree” to “strongly agree”: beauty, clarity, interest, understanding, and appreciation. No other directive was given: we considered that a visualization must in itself provide sufficient reading keys for a certain understanding, following the recommendation of Bertin and Tufte to provide clear and complete labeling [7, 8]. Moreover, “the task given to users affects their gaze paths even without any need task” [1]. We therefore did not wish to add other, more complicated tasks. Thus, apart from the intention to evaluate the differences in gaze patterns between standard and embellished static data visualizations, we did not wish to formulate hypotheses at this stage. As explained later, the semi-directive interviews allow us to develop them further. We are here rather in a diagnostic situation, aiming to evaluate the “objective and quantitative evidence on user’s visual and attentional processes” [2].

Not all participants saw the same static data visualizations, but each observed as many standard visualizations as embellished ones. People who saw an embellished visualization did not see the standard version built on the same data, in order to avoid learning effects specific to the data and the topic addressed by the visualization. Since time is an important factor in receiving a visualization [14, 15], they had as much time as they wanted to look at each image. They then moved on to the next image through a command. The eye tracking tests lasted between 12 and 15 min on average. It was equally important to control threats to the internal validity of our experiment [16] and to ensure the reliability of the experimental protocol [19]. To control sequence effects and other undesirable effects [19], we chose randomization by Latin square, which makes it possible to control at least two threats to the internal validity of the experiment simultaneously [20]. With 20 visualizations per person, this method yields a presentation order such that (1) each participant views the images in a different order and (2) each participant views 10 standard images, created by us, and 10 embellished images, from real cases. Once the eye tracking test was completed, the participant moved to the brainstorming room for the second phase: the semi-directive interview.
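
The counterbalancing described above can be illustrated with a minimal Python sketch. It is not the procedure actually used in the study, whose construction details are not given here: the cyclic Latin square, the function names and the rule assigning versions (alternating by topic index and swapped for the second block of 20 participants) are all assumptions chosen for illustration. The sketch only shows that the stated constraints can be met together.

```python
from typing import List, Tuple

N_TOPICS = 20  # 20 data sets, each available as a standard and an embellished image


def cyclic_latin_square(n: int) -> List[List[int]]:
    """Build an n x n Latin square: row r is 0..n-1 rotated by r positions."""
    return [[(r + c) % n for c in range(n)] for r in range(n)]


def presentation_plan(participant: int, n_topics: int = N_TOPICS) -> List[Tuple[int, str]]:
    """Return the ordered list of (topic, version) shown to one participant.

    Hypothetical rule: the topic order is one row of the Latin square; even and
    odd topics get opposite versions, and the assignment is swapped for
    participants in the second block (participant index >= n_topics).
    """
    row = cyclic_latin_square(n_topics)[participant % n_topics]
    flip = (participant // n_topics) % 2
    plan = []
    for topic in row:
        embellished = (topic % 2) ^ flip
        plan.append((topic, "embellished" if embellished else "standard"))
    return plan


# One plan per participant for a sample of 40 people.
plans = [presentation_plan(p) for p in range(40)]
```

Under these assumptions, participant 0 starts with topic 0 in its standard version, and participant 20 follows the same topic order with every version swapped, so no one sees both versions of the same data.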

3.3 Post-experimentation Interviews

Theoretical and Methodological Considerations

The purpose of the semi-directive interviews that directly followed the eye tracking test was to elicit the participants’ own words about how they read and understood the visualizations, taking into account their characteristics and listening to their own interpretation. We were not just trying to find out why a person had produced a given fixation point or saccade during the experiment. The interest was rather to bring out personal sense making, thus giving importance to the expression of the person’s emotions and habitus. What type of interview could we choose, then, and what process could we implement?

Different theoretical sources inspired the design of our own interview model. First, we could have opted for a think aloud method, where the participant is asked to say as much as possible of what he thinks during the experiment. Most of our thoughts do not take verbal form. Putting them into words does not reveal all the substance of their meaning, but makes it possible to obtain very complete and interesting data [21]. The benefit is to capture thoughts that would otherwise have disappeared almost instantly. Nonetheless, tasks that require high cognitive load may interfere with verbalization [21]. Asking someone to think aloud during a complicated task can disturb both his thoughts and the gaze data captured by the eye tracker. To overcome this, many researchers practice the retrospective think aloud (RTA), especially in eye tracking studies. The principle remains the same but, as its name suggests, the verbal production takes place after the experiment. The gaze replay is shown to participants, who must say what they were thinking along their gaze path. While this method is very effective in some cases, different studies have revealed weaknesses. Although in some UX studies it increases the detection of usability problems, seeing the gaze replay is actually a distraction for the participant who sees it for the first time [22]. This makes the interview longer. In other cases, adding the gaze replay to the exercise does not change the verbalization and weighs it down [23]. The interaction between the researcher and the participant is more difficult. This method did not fit our study: we needed the participants to be able to express their thoughts and emotions as freely as possible. We also wanted their words not to be subject to distractions.

Finally, in order to avoid these distracting effects, and also because RTAs are more suitable for detecting usability problems than for exploring the role of the habitus in the appropriation of a digital object, we chose a semi-directive interview while using an unconventional interview guide. The self-reported appreciations selected by the participants during the experiment formed the interview guide.

The Self-reported Appreciations as an Interview Guide

In relation to all this, we chose to show participants the data visualizations they had viewed during the experiment. The gaze replay was available: after explaining the principle of this video to the participant, we gave him the opportunity to ask us to show it if he felt it necessary. To conduct this interview, we had prepared questions in advance. They consisted of making the interviewee talk about himself, about the visualizations he preferred, disliked, and so on. Thus, open questions left flexibility to the interviewee who, far from feeling “questioned”, could speak freely. Questions about the emotions and factors of Kennedy and her colleagues were also planned [14, 15]. Nevertheless, the real interview guide was the set of self-reported appreciations. As a reminder, participants rated each visualization on Likert scales after viewing it: how much they liked it, understood it, found it clear, beautiful, etc. After spending a few minutes in the management room reading these ratings, we were ready to use them during the interview, where we kept them in front of us on a tablet. This allowed us to go back to a data visualization with the participant and take stock with him:

T: This one: you found it beautiful (4/5), but not super clear (2/5) and well understood (3/5). Explain to me.

E: It is pretty too, but less clear … I understood right away but you have to read everywhere, look at lots of spheres to understand … Not understand but … get the information.

T: Is it less intuitive?

E: Yes it is. The comparison is not straightforward even though for the bubbles, the size corresponds to the number of days.

T: And about aesthetics?

E: I do not know … if I look at it in detail … If the message goes. The aspect does not bother me. But would I have an interest to read bubble by bubble? And the little drawing helps.
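
How the ratings can steer such an exchange may be sketched in a few lines of Python. This is an illustrative assumption, not a tool used in the study, where the researcher read the ratings and chose the prompts himself. The function name, the `min_gap` threshold, the identifiers `vis_07` and `vis_12` and the example scores are hypothetical; only the five criteria come from the Likert scales described earlier.

```python
from typing import Dict, List

CRITERIA = ("beauty", "clarity", "interest", "understanding", "appreciation")


def interview_prompts(ratings: Dict[str, Dict[str, int]], min_gap: int = 2) -> List[str]:
    """Flag visualizations whose ratings diverge strongly between two criteria.

    A large gap (for instance beautiful but unclear) is treated here as a
    hypothetical cue that the visualization is worth discussing.
    """
    prompts = []
    for vis_id, scores in ratings.items():
        high = max(CRITERIA, key=lambda c: scores[c])  # first best-rated criterion
        low = min(CRITERIA, key=lambda c: scores[c])   # first worst-rated criterion
        if scores[high] - scores[low] >= min_gap:
            prompts.append(
                f"{vis_id}: {high} {scores[high]}/5 but {low} {scores[low]}/5. Explain to me."
            )
    return prompts


# Invented example ratings on a 1-5 scale, for illustration only.
example_ratings = {
    "vis_07": {"beauty": 4, "clarity": 2, "interest": 3, "understanding": 3, "appreciation": 3},
    "vis_12": {"beauty": 4, "clarity": 4, "interest": 4, "understanding": 4, "appreciation": 4},
}
example_prompts = interview_prompts(example_ratings)
```

With these invented scores, only `vis_07` is flagged, because its beauty and clarity ratings differ by two points, which mirrors the opening question of the excerpt above.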

Conduct of Interview

Thus, after participating in the eye tracking test, participants went to the brainstorming room, which was equipped with a large screen. The interview always opened with the same question: “What visualization did you notice the most? Why?”. Following his/her answer, the interviewee could choose a second, then a third visualization that had caught his/her attention. This question covered positive as well as negative feelings mentioned by the participants. Then, taking control thanks to the self-reported appreciations, we guided the participants towards the reactions most relevant to our research question. Silences were infrequent, and the interviewees always had something to say. In fact, they justified their ratings, which had been collected very quickly after reading each visualization. Returning to their actions and thoughts allowed them to express how they felt about the visual elements placed on the visualizations.

N: It’s not nice. I have a phobia of mathematics and numbers and there are plenty of them, so it did not help. I did not take anything away from this information. And it takes time to understand it … it bothers me and it would have been clearer with a small image.

Depending on the participants, these interviews lasted from 40 min to one hour. Early in our research, it seemed to us that simply analyzing gaze paths without having participants return to the visualizations could mislead us. For example, if the participant gives up reading, eye tracking cannot explain the reason for this abandonment, especially if the reason is related to personal and emotional elements rather than to bad design. It was essential to hear the participant speak about how he felt or how he thought he had acted, beyond a simple explanation of the gaze replay.

4 The Wealth of Data: What We Learned About It

Through this method, we were able to collect data relating to two equally interesting and essential aspects of data visualization: (1) visual and spontaneous reading, “the first impression” [2], and (2) each person’s reception and interpretation of the data visualization. These are the two aspects of sense making as we see it. We thus hold data that indicate how the reading of a data visualization unfolds behaviorally and physiologically. These are the gaze data, which will undergo statistical treatment. Through different hypotheses, we will be able to identify the influence of embellishment compared with the reading of a data visualization tending towards minimalism. This corresponds to spontaneous reading. Moreover, as we indicated, studying only visual perception is not enough, since human and emotional factors play a part in a person’s engagement with a visualization. The qualitative data analysis makes it possible to understand the extent to which data visualization embellishment matters for individual sense making. Thus, the first conclusions drawn from the qualitative analysis can greatly improve the hypotheses to be developed for the statistical analysis.

Indeed, the qualitative analysis carried out on the transcripts of 30 h of recorded video will be a real asset in raising the level of the hypotheses to be tested. The textual data resulting from the semi-directive interviews underwent a content analysis by thematic coding. This type of qualitative analysis makes it possible to interpret the content and to examine it in depth in order to comment on its meaning [24]. To carry out this work of interpretation, it is necessary to classify the opinions present in the recorded dialogues [25]. After identifying the main ideas in these data, it is therefore necessary to proceed to a precise thematic coding. “The thematic analysis aims at identifying fundamental semantic elements by grouping them within categories. Themes are basic semantic units” [25]. The thematic analysis allowed us to evaluate the convergences and differences in participants’ opinions. There were many comments on the visuals themselves, that is, on the visualization as implied by Kennedy and her colleagues in their emotional factors [14, 15]. The conclusions are substantial. They also provide guidance for the hypotheses that will be tested in the statistical analysis of the quantitative data. The purpose of our paper is to show the contribution of qualitative data to quantitative analysis, so we will not go through all of the conclusions. However, it is relevant to show one type of conclusion that highlights the possible extension of qualitative exploration after an eye tracking test. We are referring here to an extension of the research, not to a complementary contribution. We will also show a second example that can enrich the production of hypotheses to be tested in the quantitative analysis. As a reminder, the qualitative analysis is almost complete, while we are about to start the statistical analysis of the gaze data.

As a first example, the majority of participants were not bothered by having to make an effort when reading a data visualization, as long as the reading was accompanied by a pleasant feeling. Participants said they would be willing to focus on an embellished visualization if it seemed attractive and enjoyable to them. This pleasant feeling motivated them to read and understand the data visualization. Other communication professionals, who were not trained in the field of communication (IT, management, etc.), proved to be less sensitive to this aspect of visualization. This type of conclusion does not really complement the analysis of gaze data, but for our study such an observation is valuable and revealing with regard to personal sense making linked to the habitus.

As a second example, we can highlight the allusive aspect of data visualization embellishments. Many participants felt that embellishment could help keep in mind the universe addressed by the subject of the visualization. Some believe that it then becomes unnecessary to reread the title to remember the subject of the data visualization. For others, this can only work if the embellishment is a very simple addition to the data visualization: a pictogram with simple colours, a drawing without too many details and so on. An aesthetic effort with too many details would bring confusion and undermine the allusive capacity that embellishment can offer. This conclusion may lead us to test new hypotheses as part of the quantitative analysis, or to pay attention to new areas of interest when analyzing gaze data. For example, we could compare the number of saccades between the chart and the title for a standard visualization and an embellished visualization. We could also check whether the more aesthetically charged visualizations are indeed subject to more saccades, without returns to the title. This is the stage our research has reached: the analysis of gaze paths and the search for patterns are of course planned for the gaze data. However, beyond this analysis, we will be able to formulate hypotheses that are in line with the conclusions resulting from the thematic analysis, and therefore with the thoughts and feelings expressed by the sample. Indeed, the data collected are very numerous: we have 800 gaze plots (40 participants × 20 visualizations), i.e. 800 diagrams each representing the eye path of an individual on a data visualization. The hypotheses could be just as numerous and varied. The qualitative analysis, from which cases similar to our example will be derived, will allow us to formulate relevant and refined hypotheses regarding our test population. This is one of its main contributions.
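
The check on gaze movements between the chart and the title could be prepared along the following lines. This is only a sketch under stated assumptions: it supposes that each fixation has already been labelled with an area of interest, and the labels `title`, `chart` and `embellishment`, the function name and the example sequence are hypothetical. It does not use any eye tracker software API.

```python
from typing import Iterable


def count_transitions(aoi_sequence: Iterable[str], a: str = "title", b: str = "chart") -> int:
    """Count direct gaze transitions between two areas of interest, in either direction."""
    count = 0
    previous = None
    for aoi in aoi_sequence:
        # A transition is counted when two consecutive fixations fall on a and b.
        if previous is not None and {previous, aoi} == {a, b}:
            count += 1
        previous = aoi
    return count


# Invented sequence of fixation labels for one participant on one visualization.
example_sequence = ["title", "chart", "chart", "title", "embellishment", "chart", "title"]
example_count = count_transitions(example_sequence)
```

Applied to each of the gaze recordings, such a count would give one value per participant and per visualization, which could then be compared between the standard and embellished versions.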

Through the quantitative and qualitative data analysis, we wanted to discover which elements are influential in sense making, on the one hand perceptual and on the other hand related to personal characteristics, while taking data visualization embellishment into account. The qualitative analysis in this study largely covers one of the two aspects of sense making while feeding the other, by making it possible to elaborate hypotheses that are consistent with the participants’ comments. The data were collected at two different times, but not independently of each other: the Likert scales, collected during the eye tracking test and mobilized during the semi-directive interview, bring consistency and stability to the study. The participants were able to comment on and justify their answers, thus bringing new elements to the research. Acting this way amounts to developing a qualitative strategy that structures the interview process on the basis of the participant’s opinions, through the self-reported appreciations. The participant initiates the structure of the interview, but the researcher controls it. This facilitates the study of subjective properties in the appropriation of objects studied by an eye tracking method, at least if we consider that this appropriation can diverge depending on the participants’ habitus. This concerns many fields. Through this paper, we wish to show that such an interview is an opportunity not only to collect explanatory and complementary elements to the gaze data, but also to identify deeper, more individual-specific qualitative elements related to the habitus, which is often highlighted in social sciences. It is therefore a question of going beyond a merely complementary and anecdotal contribution of information by fully extending the results of one’s own research.

5 Conclusion

While eye tracking studies are on the rise, we wanted to share the experience gained in our research on readers’ sense making of data visualizations, especially when these visualizations are embellished. Forty digital communication professionals came to our laboratory to take part in an experiment for which we designed the entire experimental protocol. The implementation of this protocol sometimes required creativity. Indeed, combining theoretical and methodological ambitions could seem challenging for us. The production and analysis of qualitative data is often seen as an explanatory contribution to the statistical conclusions to which gaze data analysis leads. However, our qualitative analysis consisted in producing a thematic coding to be interpreted. It can provide much more than justifications for conclusions about the gaze paths recorded by the eye tracker. The real opportunity is to extend one’s research to other horizons while saving effort: it is not easy to gather so many people for a laboratory experiment. For us, going beyond the simple interpretation of gaze paths from a qualitative point of view corresponds perfectly to our theoretical ambitions, which locate the elements that generate or influence sense making in the experience and habitus of each person. It is not an effortless process and can sometimes give the researcher the feeling of being a “tinkerer” who must be inventive to give coherence to his experimental system. In this way, an eye tracking test and the qualitative exploration that directly follows it should not be considered as a sequence but rather as the integration of all dimensions within a single experiment. In our case, the use of Likert scales filled in during the eye tracking test was crucial and constituted the interview guide necessary to conduct the qualitative phase.
The interview model is therefore designed in an integrated way, in conjunction with the design of the experimental protocol developed for the eye tracking test. All this made it possible to better structure the semi-directive interviews, which get to the heart of the participant’s feelings. Subjective or habitus-related elements were easily highlighted with regard to data visualization sense making. At this time, with our research in full swing, we have not yet interpreted all the data nor drawn all the conclusions that they can offer. However, it appears that the interpretation of qualitative data leads to many conclusions, while the analysis of gaze paths and other visual data provides many clues as to the correct arrangement of visual elements on a data visualization. The real benefit of a method that gives full credit to qualitative exploration in an experimental model is that it opens up real possibilities for research and for the creation of consistent hypotheses, which bring coherence and strength to that research. This is the opportunity to exploit the research to its full potential by going beyond the factual aspect that sometimes appears in eye tracking studies. Obviously, this has limits. The analysis of all these different data types takes a lot of time because of their diversity. Similarly, such a method would probably not be applicable if the researcher’s intention was to generalize his results to the entire population. In our case, generalization will only be possible for Belgian digital communication professionals. The main asset of the method remains the quality of the data and the depth of the results.