1 Introduction

The current challenge for museums is how to turn their institutional knowledge and authority into meaningful, engaging experiences by leveraging the appropriate technological media in the context of their physical settings, and for heterogeneous audiences [10]. To address this problem, a growing number of initiatives integrating serious games, gamification, augmented reality and virtual reality through mobile devices have appeared in recent years [6, 9].

In this paper we present results from a project that intends to promote informal learning in a natural history museum through a treasure hunt type of game. The game incorporates image recognition, mini-games, 3D virtual reconstructions and augmented reality elements, being part of a growing number of initiatives that seek to exploit the use of augmented reality and related technologies in informal science learning sites [4].

As designers of educational games we face the risk of designing just another version of the well known “chocolate-covered broccoli”, a term first introduced by Amy Bruckman in a presentation at the Game Developers Conference in 1999 [1]. Strictly speaking, she used the term “chocolate-dipped broccoli” for an approach to combining learning content with gameplay in which the gaming element of the product is used as a separate reward or sugar-coating for completing the educational content. This is an intrinsic problem of educational games. Although digital games may be capable of providing activities which are intrinsically motivating in their own right, it is critical to consider the effect of adding learning content to an intrinsically motivating game [5].

Ideally, the learning goals in an educational game should be attained through activities that are intrinsic to the game play. For certain subjects or learning goals, using intrinsic game mechanics for learning purposes can be straightforward, for example when the game serves as some kind of simulator of the target content. On the other hand, some types of learning content are very hard to turn into game mechanics. Factual knowledge, such as the content we want to include in our game for a Natural History museum, is one of them. Therefore, either we accept that games are only suitable for certain types of learning content or we accept some broccoli in our game recipes.

Enigma MNCN is a treasure hunt for mobile devices designed for the National Museum of Natural Sciences (referred to by its Spanish acronym MNCN) in Madrid, one of the oldest museums of Natural History in Europe and the most important in Spain. This is the second game in the Enigma saga, after Enigma Galdiano, released in 2016 and designed to be played at the Lázaro Galdiano Museum in Madrid [2].

Enigma MNCN, described in more detail in Sect. 2, is designed for kids from 8 to 12 years old. The kid plays as a paleontologist's apprentice who has to find some objects in the collection and solve some puzzles and quizzes along the way. The learning content in the game is provided through a field notebook with images and textual information about pieces in the museum, whose pages are revealed as goals in the game are fulfilled. In order to make reading the field notebook more intrinsic to the game, we include quizzes whose answers can be found in one of the pages of the notebook, usually the last one that was revealed.

Enigma MNCN includes a number of mini-games and treasure hunt mechanics, the chocolate, in order to cover a reading task, the broccoli. The long term goal of the work presented here is to determine whether a personalized version of the playful part, the game mechanics around the learning part of the game, can improve the satisfaction of the player and therefore make the whole learning experience more enjoyable. Keeping up with the metaphor, we want to determine whether a personalized chocolate recipe can make the chocolate-covered broccoli taste better. The first step towards that goal is to obtain a model for the preferences of game mechanics for this particular type of game, so that we can later use that model to guide the selection of game mechanics.

In this paper we present the different game mechanics included in Enigma MNCN and the results of several experiments that try to determine the preferences for those game mechanics among kids from 8 to 12, based on age and gender. We ran experiments with two different versions of the game, the full version including the learning component (field notebook) and a reduced one without the learning component, in order to measure the variability in preferences for game mechanics in a treasure hunt game versus an educational version of the game. The main finding of these initial experiments is that preferences in game mechanics are overshadowed when combined with a mostly disliked learning mechanic. We can observe differences in preferences for game mechanics when evaluating those preferences in the purely playful version of the game, although they do not reach statistical significance, but those differences get blurred when measured in the educational version, with broccoli (i.e., reading tasks) added in.

The rest of the paper is organized as follows. Section 2 describes the game mechanics in Enigma MNCN. Section 3 details the experimental set-up along with the results from the experiments and our conclusions about those results. Finally, Sect. 4 presents related work and concludes the paper.

2 Enigma MNCN

Enigma MNCN is a treasure hunt for mobile devices designed to be played at the National Museum of Natural Sciences of Spain, in Madrid. The players, kids between 8 and 12 years old, aim to become the new assistant of Dr. Anning, one of the paleontologists in the museum, helped by two of her current assistants: Pérez and Neand. The three of them, depicted in Fig. 1, take turns proposing new challenges that let the kid demonstrate her merits to join the team. The theme of the exposition where the game is played is the evolution of life on Earth, from the first micro-organisms to Homo sapiens, through a collection of fossils, skeletons, reconstructions and illustrations that recreate life on Earth at different points in time.

Fig. 1. Enigma MNCN characters

The core game mechanic in Enigma uses the camera of a mobile device to recognize an object in the museum (we use Vuforia\(^{TM}\) and Unity 3D\(^{TM}\) as underlying technology). The object to be found is indicated to the player by its scientific name, such as Brachiopoda Strophonema, Calamopora Spongites or Ammonitida Ammonitina Perisphinctidae. We want the kids to pay attention to the signs in the exposition and make sense of the organization of the objects, where, for example, every fossil from the Ammonitida family is in the same showcase. Once the name of the object to be found has been read, the mobile device switches to “search mode”, becoming the “Paleo lens” in the game, as depicted in Fig. 2.

In some search tasks of the game just one object has to be found, while in others we provide up to three different names of objects that have to be found in any order. Depending on the complexity of the search, we can also provide a hollow silhouette of the target object in order to facilitate its identification. Since the player will typically forget the exact name of the object, she can ask Pérez, our friendly sloth, to remind her of the name (see the sloth at the bottom right of Fig. 2).

As an additional clue, we can also provide the time depicted in the showcase of the object to be found. The exposition uses the well known metaphor of mapping the history of Earth onto 24 h, so that, for example, the first fossils appear at 5:36 am and humans at 11:58 pm. Once the camera is pointing at the target object, the object is recognized and the task is fulfilled. We do not use QR codes but the actual objects in the collection. Since kids are usually unfamiliar with this technology, the game begins with a tutorial explaining how the Paleo lens can be used to recognize objects in the museum; that you can ask Pérez to remind you of the name of the object; and that it is possible to give up and quit a search if you cannot get the Paleo lens to see the object (kids almost never give up in their searches).
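The 24 h metaphor mentioned above is a linear rescaling of geological time, which can be sketched as follows. This is only an illustration: the age of the Earth used here (4.54 billion years) is our assumption, the exposition may use a slightly different figure, and the function names are hypothetical.

```python
# Assumption: Earth's age is taken as 4.54 billion years; the exposition
# may use a slightly different value.
EARTH_AGE_YEARS = 4.54e9
MINUTES_PER_DAY = 24 * 60


def clock_to_years_ago(hour, minute):
    """Return how many years before the present a 24 h clock time represents."""
    elapsed = hour * 60 + minute
    remaining = MINUTES_PER_DAY - elapsed
    return remaining / MINUTES_PER_DAY * EARTH_AGE_YEARS


def years_ago_to_clock(years_ago):
    """Return the (hour, minute) on the 24 h clock for an event `years_ago`."""
    elapsed = round((1 - years_ago / EARTH_AGE_YEARS) * MINUTES_PER_DAY)
    elapsed = min(elapsed, MINUTES_PER_DAY - 1)  # keep "now" at 23:59
    return divmod(elapsed, 60)


first_fossils = clock_to_years_ago(5, 36)   # 5:36 am
first_humans = clock_to_years_ago(23, 58)   # 11:58 pm
```

Under this assumption, 5:36 am corresponds to roughly 3.5 billion years ago, and the two minutes before midnight to a little over 6 million years.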

Fig. 2. Paleo lens while searching

In addition to the Paleo lens searches we have 4 different types of mini-games in Enigma MNCN: Packaging, Skeletons, AR Hunt and Magic Fields. Every mini-game comes after a Paleo lens search and usually relates somehow to the object just found. This serves to give some context for the mini-game, since we know where the kid is in the museum and we can use in the mini-game those elements in front of her.

Packaging, Fig. 3 left, is a puzzle game where the kid has to put all the pieces appearing to the right of the box, inside of the box. Following the narrative of the game, the kid is helping the museum by packaging some fossils that need to be sent to another museum. The fossils in the mini-game are similar to those in the showcase in front of the kid.

Skeletons, like Packaging, also requires visual-spatial skills to be solved. As shown in Fig. 3 right, a partial skeleton is provided on the left with some missing bones on the right. The goal is to place the bones at the right positions. To contextualize the task, the game has taken the kid in front of that same skeleton in the museum (a Deinotherium in the example of Fig. 3), so that she can look at the original to get inspiration.

Fig. 3. Packaging and skeletons

Magic fields are 3D reconstructions of prehistoric life. The Museum already displays large panels with illustrations of prehistoric life (an example is depicted in Fig. 4 left). A magic field is a 3D scene that is loaded after a search task that has led the player in front of the illustration panel, making that illustration come alive. The goal in the mini-game is to move around the 3D scene, through the gyroscope of the mobile device, and find a particular prehistoric animal, which usually corresponds to a skeleton we have seen before in the game.

AR hunts use augmented reality technology to insert an image of a prehistoric animal into one of the illustration panels of the museum, as shown in Fig. 4 left where a Meganeura, an extinct insect from the Carboniferous period, is moving around the illustration as seen through the camera of the device. The goal is to capture the moving animal by tapping it.

Fig. 4. AR hunt and magic field

For the educational version of the game we add two more elements: the field notebook and the quizzes.

The field notebook plays the role of the notebook of Dr. Anning, where the paleontologist writes down some of the main facts about the pieces she finds, which are actually the ones that we found in the game. After every search, one more page, such as the one shown in Fig. 5 left, is added to the initially empty notebook. In order to motivate the kids to read the contents of the notebook we include a new game mechanic: the quizzes.

The quizzes, such as the one shown in Fig. 5 right, are multiple choice questions with a humorous tone where there is only one right answer. There is a time limit of 30 s to answer the question, but the kid can stop the timer by opening the field notebook and reading it. The right answer to the quiz can always be found in the notebook.
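The interplay between the 30 s limit and the notebook described above amounts to a countdown that is paused while the notebook is open. A minimal sketch of that behaviour follows. The game itself is built in Unity 3D, so this Python class and all of its names are hypothetical and only illustrate the logic.

```python
class QuizTimer:
    """Countdown for a quiz that is paused while the field notebook is open."""

    def __init__(self, limit_s=30.0):
        self.remaining_s = limit_s
        self.paused = False

    def open_notebook(self):
        # Reading the notebook stops the clock.
        self.paused = True

    def close_notebook(self):
        self.paused = False

    def tick(self, dt_s):
        """Advance the clock by dt_s seconds unless the timer is paused."""
        if not self.paused:
            self.remaining_s = max(0.0, self.remaining_s - dt_s)

    @property
    def expired(self):
        return self.remaining_s <= 0.0


timer = QuizTimer()
timer.tick(10.0)        # 10 s spent looking at the question
timer.open_notebook()
timer.tick(100.0)       # time spent reading is not counted
timer.close_notebook()
```

With this design, the time spent reading never counts against the player, which is consistent with the intention of rewarding, rather than penalizing, the use of the notebook.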

Fig. 5. Field notebook and quiz

The first version of the game used for the experiments consists of 8 stages, where each stage includes a search or multi-search and a mini-game, roughly including two mini-games of each type. On average it takes between 20 and 25 min to complete this version of the game. The educational version includes, in addition, 4 quizzes, one every 2 or 3 stages, and adds between 7 and 8 more minutes to the gameplay time.

3 Experiments and Results

3.1 Experimental Set-Up

The experiments are run in the museum when it is closed to the general public. In every run of the experiment we have a group of between 10 and 14 kids playing the game individually, using the same type of device, a Lenovo\(^{TM}\) TAB3 10 Plus, provided by the museum. The choice of the size of the group seeks a balance between the effort needed to run the experiments and the bias introduced by having more people playing the game at the same time. For example, a kid who sees five others pointing their devices at the same spot may conclude that she should be pointing there too. In our experience, the variability in the time that the kids need to find the first objects alleviates this bias, and they quickly spread out among the different searches.

In the game we collect metrics during game play, including data such as the time spent in every task, whether the task is successful or not, and how many times the player made use of the help. Although the game is implemented in Unity 3D, which offers some functionality for metrics collection, we have developed our own system that collects the metrics of interest and sends them, at the end of every task in the game, as JSON files to a server where they are made persistent. The opinions of the kids about the game play mechanics are collected at the end of the game, and also made persistent as metrics in the server for later analysis.
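To give an idea of what one such per-task record could look like, the following sketch serializes the metrics named above (time spent, success and use of help) as a JSON document. The field names, identifiers and values are hypothetical, and the actual system is part of the Unity 3D game, not Python code.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class TaskMetrics:
    # All field names are hypothetical; only the kind of data is from the text.
    player_id: str
    task_id: str
    task_type: str        # e.g. "search", "packaging", "quiz"
    time_spent_s: float   # time spent in the task
    success: bool         # whether the task was completed successfully
    help_requests: int    # how many times the player asked for help


def to_json(metrics):
    """Serialize one task record as a JSON document."""
    return json.dumps(asdict(metrics), sort_keys=True)


record = TaskMetrics("p-001", "stage3-search", "search", 74.5, True, 2)
payload = to_json(record)  # this string would be sent at the end of the task
```

Sending one small document per task, rather than a single file at the end of the game, means that data from kids who do not finish the game is not lost.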

For the experiments described in this paper, satisfaction is our main variable. We collect satisfaction data through a questionnaire integrated at the end of the game. The questionnaire contains a question for every mechanic included in the game, with an image from the game to remind the kid which mechanic she is being asked about (see Fig. 6). Answers are given by selecting among a smiley, a neutral or a sad face, to make it more appealing for kids. Since most of the answers were smiley faces, we decided to dichotomize the answers into “positive” (smiley face) and “negative” (sad or neutral face), in order to increase the statistical power of our analysis.
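The dichotomization described above, and the per-mechanic satisfaction rate derived from it, can be sketched as follows. The answer labels, mechanic names and sample responses are invented for illustration and are not data from the experiments.

```python
from collections import defaultdict

POSITIVE = {"smiley"}
NEGATIVE = {"neutral", "sad"}


def dichotomize(answer):
    """Map a three-level face answer to True (positive) or False (negative)."""
    if answer in POSITIVE:
        return True
    if answer in NEGATIVE:
        return False
    raise ValueError(f"unknown answer: {answer!r}")


def satisfaction_rates(responses):
    """Return {mechanic: fraction of positive answers} for (mechanic, answer) pairs."""
    counts = defaultdict(lambda: [0, 0])  # mechanic -> [positive, total]
    for mechanic, answer in responses:
        counts[mechanic][0] += dichotomize(answer)
        counts[mechanic][1] += 1
    return {m: pos / total for m, (pos, total) in counts.items()}


# Invented answers, for illustration only.
sample = [
    ("paleo_lens", "smiley"), ("paleo_lens", "neutral"), ("paleo_lens", "smiley"),
    ("quiz", "sad"), ("quiz", "smiley"), ("quiz", "neutral"),
]
rates = satisfaction_rates(sample)
```

Note that grouping neutral answers with sad ones is a conservative choice: a mechanic only counts as liked when the kid explicitly picks the smiley face.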

Fig. 6. Satisfaction questionnaire

We ran two sets of experiments. The first one with the treasure hunt game without learning mechanics (quizzes and field notebook), and the second one with the full version of the educational game.

3.2 First Experiment: Treasure Hunt Game

The sample of the first experiment consisted of 30 subjects with an average age of 9.4 years (SD = 0.56; 17 subjects aged 9 and 13 aged 10). Of the 30 subjects, 14 were boys and 16 girls. We consider it an especially homogeneous sample since they all attended fourth grade at the same school. Due to the small size of the sample in this experiment, we decided to run only non-parametric analyses.

Average satisfaction was above 80%, an encouraging result for the game, but our main goal was to determine whether we can detect differences in preferences for the different mechanics. For this purpose we have analyzed differences in preferences based on gender, Fig. 7, and age, Fig. 8.

Fig. 7. First experiment: differences by gender

Fig. 8. First experiment: differences by age

Although results are not statistically significant (p values higher than .05 for each chi-squared analysis run), we can observe that boys have a larger preference than girls for the Paleo lens, packaging and skeletons mechanics, while girls tend to prefer AR hunt and magic fields. Considering the type of activities in the game mechanics, this pattern suggests that boys prefer instrumental activities, focused on action, while girls tend to prefer expressive activities, focused on aesthetics and emotion. It is interesting to see this tendency, usually observed in adults [11], appear in young kids.
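As an illustration of the kind of analysis referred to above, the following sketch computes Pearson's chi-squared statistic and its p value for a 2 x 2 table (gender by positive/negative answer) using only the standard library. The counts are invented, chosen only to match the group sizes of 14 boys and 16 girls, and whether a continuity correction was applied in the original analysis is not stated, so none is applied here.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-squared test, without continuity correction, for [[a, b], [c, d]].

    Returns the pair (chi2, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column needs at least one observation")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # With 1 degree of freedom the chi-squared survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Invented counts: rows are boys (14) and girls (16),
# columns are positive and negative answers for one mechanic.
chi2, p_value = chi_square_2x2(13, 1, 12, 4)
```

For these invented counts the p value is well above .05, which shows how a visible difference in percentages can remain non-significant with 30 subjects. With expected cell counts this small, an exact test such as Fisher's is often preferred over the chi-squared approximation.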

Regarding age, even with such a small age difference, we can observe some differences between 9 and 10 year old kids, as shown in Fig. 8 right. Again, the results are not statistically significant, but they point in the direction that 9 year old kids find less complex mechanics, such as Paleo lens and magic fields, more enjoyable, while 10 year old kids tend to prefer skeletons or packaging, which require more complex problem solving abilities.

These initial results suggest the possibility of finding some correlation between game mechanics preferences and demographic data such as age and gender.

3.3 Second Experiment: Educational Treasure Hunt

A total of 213 children participated in the second experiment. Some of them did not finish the game (\(N = 28\), which is 13%), and from the remaining 185 an additional 20% (\(N = 36\)) were outside of the established age range. Therefore, the results refer to a sample of 149 subjects, in the age range from 8 to 12 (\(M = 9.60\), \(SD = 1.17\)). Of these, 73 were boys and 76 were girls.
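The sample sizes and percentages reported above can be checked with a few lines of arithmetic. The variable names are ours; every number comes from the paragraph itself.

```python
recruited = 213
did_not_finish = 28
finished = recruited - did_not_finish        # children who completed the game
outside_age_range = 36
analysed = finished - outside_age_range      # final sample

# Percentages as reported: relative to all participants and to finishers.
pct_did_not_finish = 100 * did_not_finish / recruited
pct_outside_range = 100 * outside_age_range / finished

boys, girls = 73, 76
```

Note that the two percentages use different denominators: the 13% is relative to all 213 participants, while the 20% is relative to the 185 who finished the game.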

Average satisfaction is again above 80%, as shown in Fig. 9, except for the additional learning mechanic: quizzes. Regarding gender, we can observe in Fig. 10 that differences are at most 3 percentage points, or 4 in the case of quizzes, which does not amount to any significant difference.

Fig. 9. Satisfaction by game mechanic

Higher variability can be observed when considering age differences, as shown in Fig. 11. Nevertheless, again the differences are not statistically significant, as shown by the results of Pearson’s chi-squared test (\(\chi ^2\)) in Table 1.

Our hypothesis is that although differences in preferences for game mechanics can be observed just by considering demographic data, as the first experiment shows, such differences mostly disappear when adding a mechanic that has a much lower acceptance value, which makes the rest of the mechanics equally acceptable in comparison.

Table 1. Pearson’s chi-squared test (\(\chi ^2\)) on age differences

4 Related Work and Conclusions

Research on personalized content for serious games is a growing area of interest. Once it has been accepted that digital games are an appropriate instrument for applications beyond pure entertainment, in training and communication, the question of how to personalize serious games is being raised to increase their effectiveness. In this sense, we find works that show that the effectiveness of serious games improves with personalization in adults, and others that advance in the definition of instruments to facilitate such customization.

Fig. 10. Second experiment: differences by gender

In [7] some initial results are provided showing the importance of tailoring games for change, in the context of a game designed to improve healthy eating habits. Tailoring the game design to players’ personality type improved the effectiveness of the game. This was later confirmed in [8] with a large-scale study of more than 500 participants, whose results reveal that people’s gamification user types play significant roles in the perceived persuasiveness of different strategies in serious games.

Regarding related work on instruments designed to facilitate the construction of customized games, in [13] a conceptual framework of player preferences based on game elements and game playing styles is presented. Such a framework can be used by designers to create games that are tailored to their target audience. Applying these ideas to serious games, [12] presents a general framework for personalized gameful applications using recommender systems, by describing the different building blocks of a recommender system (users, items, and transactions) in a personalized gamification context.

In this paper we have presented some initial results of experiments conducted in order to determine whether a model for game mechanics preferences can be found for a specific type of game: treasure hunts in museums with educational purposes for children. We have experimentally analyzed differences in preferences based on demographics, which are easier to measure than other potentially more informative differences such as temperament or cognitive capacities that we plan to measure in the future. Accepting that in some situations it is not possible to find an intrinsic game mechanic that serves the learning purposes of an educational game, our goal is to find out whether by tailoring the selection of accompanying game mechanics we can improve the whole educational experience.

For this purpose, we first ran a set of experiments with a non-educational version of the game, and obtained initial values of variability in preferences along the age and gender axes. Although not statistically significant, we found some indication of the existence of a preference model.

Fig. 11. Second experiment: differences by age

In a second run of experiments, with a larger population, we measured again preferences for game mechanics, but using the full educational version of the game. In this case differences in preference for game mechanics were hardly measurable, invalidating our initial results.

Our hypothesis is that although differences in preferences for game mechanics can be observed just by considering demographic data, such differences mostly disappear when adding a mechanic that has a much lower acceptance value, which makes the rest of the mechanics equally acceptable in comparison. This effect resembles the “contrast effect” from Psychology, or the “contrast principle” as defined by [3]. We need to collect additional data in order to test this hypothesis, which we will do in future work.

Also as future work, we want to measure the effect of the time spent with the educational content, in our case reading the field notebook in the game, both in terms of the quality of the answers for the quizzes in the game, and for the general satisfaction of the player. It would be a positive result if we could demonstrate that making use of the educational content in the game promotes learning, and does not decrease the quality of the playful experience.