1 Introduction

Augmented Reality (AR) can be described as a view of the real, physical world that incorporates additional information to augment this view [1]. In other words, it is a system that supplements the real world with computer-generated virtual objects, making the two worlds coexist in the same space, with the following properties: a) it combines real and virtual objects in a real environment; b) it runs interactively in real time; c) it overlaps real and virtual objects with each other; d) it can be applied to all human senses, including hearing, touch, smell and force [2]. Thus, this technology has the advantage of allowing tangible and multimodal actions that facilitate interaction and motivate users [3, 4].

Usability, in turn, is a system quality requirement that covers efficiency when using the system, ease of learning, the user’s subjective satisfaction and conformance to specific patterns; usability work is the process of assuring the usability of the interface and guaranteeing that the user’s demands are met [5, 6]. Although these aspects of usability are conceptually clear, the definitions are difficult to apply in practice. When the evaluation is made through empirical studies, researchers need to decide on metrics for each factor [7]. Usability metrics are usually divided into objective and subjective. Objective metrics relate to the user’s effectiveness and efficiency with the system, while subjective metrics collect the users’ opinions about the system, usually through questionnaires or interviews. The objective criteria can be further divided into quantitative and qualitative [6].

Although AR presents the same core usability challenges as traditional interfaces (for instance, the potential for overloading users with too much information and making it difficult to determine a relevant action), AR aggravates some of these problems because multiple types of augmentation are possible simultaneously, and proactive applications run the risk of overwhelming users [1]. Certain peculiarities inherent in AR applications, such as the use of markers and multimodal interaction in 3D space, should be evaluated in a more specific context.

Although AR has been studied for over 40 years, only in recent years have researchers begun to pursue the formal evaluation of these systems [8]. A question that arises is whether the developers of AR systems have been following minimum criteria that can guarantee quality in terms of usability. In many cases, AR applications have been developed without following a methodology, or with a traditional software development methodology that does not consider the peculiarities of this kind of application. Furthermore, AR uses a natural interface that allows non-conventional interaction, which makes the analysis of its quality essential, especially with regard to the usability of its applications. It is worth noting that, although this quality may be responsible for the success of applications, knowledge about users’ opinions, satisfaction and frustrations when using these applications is still rather limited [9, 10].

The lack of formal assessments in the area has already been pointed out by [8]. That survey of the world scenario found few papers applying any evaluation technique to AR applications. According to it, less than 8 % of AR-related papers published from 1993 to 2007 were evaluated according to the following parameters: perception, user performance, collaboration and usability. Moreover, only seven of the 169 evaluations identified addressed usability.

In our work, 992 papers containing the keywords “Augmented Reality” and “Usability” in their abstracts were found. However, only 58 of them actually contained some kind of usability evaluation, mostly of subjective aspects. It is also clear that many of these studies do not conduct a usability study correctly.

Therefore, the focus of this paper is to present the main attributes that have been used to evaluate the usability of AR applications in the world scenario from 2008 to 2013. The papers considered most relevant, which bring more specific attributes for the evaluation of AR, are discussed here. In addition, we propose a set of questions on these AR usability attributes, based on established questionnaires and on the authors’ experience in evaluation.

This paper is organized as follows. Section 2 discusses the methodology of study development. Section 3 presents the results and discussions of the research. Finally, Sect. 4 presents the conclusions on the subject.

2 Materials and Methods

To identify the main usability attributes used in AR research, the following steps were carried out: a systematic review of papers in Portuguese (Brazilian), from 1998 to 2013; categorization of the attributes [11]; a systematic review of papers in English, from 1998 to 2013 (the research conducted by [8] covers studies in the world scenario until 2007); completion of the categorization of the attributes, using the papers in English; and preparation of an assessment instrument (a set of questions according to the attributes). In total, 992 papers (227 in Portuguese and 765 in English) were collected from “Portal de Periódicos CAPES” (a Brazilian site that contains a database of major scientific journals and can be configured to the area of Computing), and from IEEE Xplore and the ACM Digital Library to find conference papers. Of these, 16 papers in Portuguese and 42 in English could be used for this research. The research protocol developed for this study is adapted from the models proposed by [12] and [13].
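The screening figures above can be checked with a few lines of arithmetic. The following is only a minimal sketch: the counts are those reported in this section, while the variable and function names are illustrative and not part of the research protocol.

```python
# Counts reported in the text; all names here are illustrative assumptions.
retrieved = {"Portuguese": 227, "English": 765}
included = {"Portuguese": 16, "English": 42}


def inclusion_rate(kept: int, total: int) -> float:
    """Return the share of retrieved papers that were kept, in percent."""
    return 100.0 * kept / total


total_retrieved = sum(retrieved.values())  # 227 + 765
total_included = sum(included.values())    # 16 + 42

# Inclusion rate per language and for the whole collection.
rates = {
    lang: inclusion_rate(included[lang], retrieved[lang]) for lang in retrieved
}
overall = inclusion_rate(total_included, total_retrieved)

for lang, rate in rates.items():
    print(f"{lang}: {included[lang]}/{retrieved[lang]} kept ({rate:.1f} %)")
print(f"Overall: {total_included}/{total_retrieved} kept ({overall:.1f} %)")
```

Under these counts, fewer than 6 % of the retrieved papers survive screening, which agrees with the roughly 94 % of papers reported in the conclusions as not really dealing with usability evaluation.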

From this research, 51 attributes were found. They were divided into nine categories (System Interaction, Application Interface, Representation, Sensory and Behavioral Aspects, Motivation and Effort, Spatial Association, Internal Aspects and Configuration, General Functionality and Others Attributes), according to [11] and the second systematic review. Some of these attributes are used to evaluate the usability of any interactive computer system, while others are specific to AR applications. Subsequently, the main usability satisfaction questionnaires were studied: QUIS [14], PUEU [15], NAU [16], NHE [17], CSUQ [17], ASQ [18], PHUE [19], PUTQ [20], and USE [21].

In the next stage, these 51 attributes were mapped to the existing questionnaires, as indicated in Table 1. Some attributes corresponded to more than one question in the same questionnaire, and sometimes the same attribute appeared in different questionnaires. Other attributes could not be mapped to any existing questionnaire. The questions therefore had to be standardized for the new questionnaire, and new questions had to be created for the attributes that could not be mapped.
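The mapping step can be pictured as grouping existing questionnaire items by attribute and then listing the attributes left without any item. The sketch below is an assumption about how such a mapping might be organized, not the procedure actually used; the attribute-to-item pairs, the item identifiers and the attribute "marker handling" are hypothetical placeholders and do not reproduce the content of Table 1.

```python
from collections import defaultdict

# Hypothetical excerpt of (attribute, questionnaire, item) triples.
# Placeholders for illustration only; they are NOT taken from Table 1.
candidate_items = [
    ("ease of use", "USE", "item-a"),
    ("ease of use", "CSUQ", "item-b"),
    ("ease of learning", "USE", "item-c"),
    ("user satisfaction", "QUIS", "item-d"),
    ("user satisfaction", "QUIS", "item-e"),
]

# "marker handling" is a made-up AR-specific attribute with no existing item.
attributes = [
    "ease of use",
    "ease of learning",
    "user satisfaction",
    "marker handling",
]


def map_attributes(attributes, items):
    """Split attributes into those covered by existing items and those not."""
    coverage = defaultdict(list)
    for attribute, questionnaire, item in items:
        coverage[attribute].append((questionnaire, item))
    mapped = {a: coverage[a] for a in attributes if a in coverage}
    unmapped = [a for a in attributes if a not in coverage]
    return mapped, unmapped


mapped, unmapped = map_attributes(attributes, candidate_items)

for attribute, sources in mapped.items():
    print(f"{attribute}: {sources}")
print("New questions needed for:", unmapped)
```

In this toy data, one attribute is covered twice by the same questionnaire, another by two different questionnaires, and one is not covered at all, which mirrors the three situations described above.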

Table 1. Attributes and questions of augmented reality

In addition to finding the main attributes for the assessment of AR applications, it was desirable to know:

  • Which usability attributes, general and specific, are most widely used?

  • Which application areas receive the most attention in usability analysis?

  • Which kind of AR system has been used more frequently: marker-based or markerless?

  • Which kind of AR environment has been used more frequently: desktop or mobile?

  • How many users, on average, take part in a usability test of AR applications?

3 Results and Discussion

For the analysis of usability evaluation of AR applications, papers were selected according to their relevance; their distribution over the years 2008 to 2013 is shown in Fig. 1. The figure suggests a gradual increase, in recent years, in the number of papers that include some kind of usability evaluation. The areas covered by these papers can be seen in Fig. 2.

Fig. 1. Number of papers distributed over the years 2008 to 2013

Fig. 2. Number of papers distributed across the 13 areas

The greatest amount of usability evaluation work was found in the “Education” area, followed by “Base –AR” (tests performed with new algorithms, techniques and user tools employed for evaluation) and, in third place, “Arts”.

Figure 3 presents the types of AR environments. Figure 4 shows whether or not markers were used.

Fig. 3. Types of AR environment

Fig. 4. Use of markers or markerless approaches

Figure 3 shows that AR development still focuses on desktop applications.

On the other hand, markerless approaches have surpassed the use of AR markers (Fig. 4).

The systematic reviews showed that the most commonly used attributes in these papers (those that appeared at least ten times) are: attractiveness, ease of learning the application, ease of use and level of user satisfaction.

The analysis of these papers also yielded the number of users involved in usability tests: 30.167 participants on average. This average was computed over the papers that reported the number of test users (71 % of the papers).

3.1 Integration of Criteria and Peculiarities of AR

Our initial studies were based on a systematic review of the usability evaluation of AR systems in the Brazilian scenario, reported in [11]. That study classified the usability attributes for AR applications into eight dimensions:

  • Interaction with the system: refers to the mechanisms that allow the user to interact with the system (markers, audio, mouse, unconventional devices).

  • Application Interface: relates to how the interface/system is presented to the user in terms of ease of use and learning, flexibility of use, among others.

  • Representation: relates to aspects perceived by the user, such as the appearance that the interface presents to the user.

  • Sensory and Behavioral Aspects: relate to how intuitive the interface is and how well it promotes user adaptation and immersion.

  • Motivation and Effort: relates to how the system can hold the user’s attention, motivating them to use and reuse the application.

  • Spatial Association: refers to the distribution in space of the virtual and real environments and their overlap. The virtual objects inserted into the scene should therefore have sizes proportional to the real environment.

  • Internal and Configuration aspects: refers to aspects that allow the application to be ready for use.

  • General Functionality: relates to the utility of the software, that is, whether it meets its proposed objectives and requirements.

With the second systematic review, which covered papers from the world scenario (written in English), a total of 51 attributes were found. One more category, called “Others attributes”, was added to the categorization to hold the attributes that had no relation to the original eight categories [11]. These attributes, mapped to questions, are presented in Table 1.

4 Conclusions

This paper presented a study on the usability evaluation of Augmented Reality applications in the world scenario, from 2008 to 2013.

The central goals of this paper were to investigate the state of the art of AR evaluation and to extract the most relevant attributes that have been used in the world scenario to evaluate AR applications.

The evidence shows that 94 % of the papers surveyed use the terms “Usability” and “Augmented Reality” but do not really deal with the subject, or use erroneous and/or irrelevant criteria. Only 58 of the 992 papers retrieved from journals and conferences were relevant enough to contribute to the assessment area.

Many of the attributes found are general ones used to evaluate any interactive application, such as ease of learning the application, ease of use and level of user satisfaction.

In our first systematic review, we found 42 usability attributes for Augmented Reality in the Brazilian scenario. In our second systematic review, on the world scenario, nine further attributes were found. Moreover, 85.71 % of the attributes found in the first systematic review were also present in the second. These attributes were divided into nine categories.
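The attribute counts above can be cross-checked against the total of 51 attributes reported in Section 2. The sketch below assumes that the nine attributes of the second review are additional to the 42 of the first, which is our reading of the figures rather than an explicit statement; the variable names are illustrative.

```python
from fractions import Fraction

# Figures taken from the text; the interpretation of "nine" as
# additional attributes is an assumption of this sketch.
first_review = 42          # attributes found in the Brazilian review
new_in_second = 9          # attributes found only in the second review
reported_overlap = 85.71   # percent of first-review attributes seen again

# Total number of distinct attributes under the assumption above.
total = first_review + new_in_second

# Whole number of first-review attributes implied by the reported percentage.
shared = round(first_review * reported_overlap / 100)

# Recompute the percentage exactly from that whole number.
recomputed = float(Fraction(shared, first_review) * 100)

print(f"Total attributes: {total}")
print(f"Shared attributes: {shared} of {first_review} ({recomputed:.2f} %)")
```

Under this reading the numbers are mutually consistent: the total matches the 51 attributes mentioned earlier, and the reported percentage corresponds to a whole number of shared attributes.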

A second goal of this work was to propose a questionnaire for the attributes found in the systematic reviews. This questionnaire is grounded in questionnaires established in the literature and in the authors’ experience in evaluating Augmented Reality applications.

It is worth noting the prevalence of research on the usability evaluation of markerless Augmented Reality (Fig. 4), especially in conference papers. This highlights the need to check whether there are specific attributes for this class of applications that should be addressed.

As future work, we intend to study and propose attributes to evaluate the usability of Mobile Augmented Reality.