1 Introduction

There is currently considerable globalization of markets for interactive systems, from smartphones to washing machines. This creates a need for good usability and user experience (UX) with these systems for users from many different cultures. There is growing awareness of the need to consider cross-cultural issues in the design of interactive systems, but as yet far less recognition that the methods used to elicit user requirements and to evaluate usability and UX are also potentially subject to cultural differences. The ways people in India or China typically interpret and react to particular kinds of questions and materials used in usability and UX work may be very different from the ways people in North America or northern Europe typically interpret and react to them [1].

Our research objective was to study how cultural issues affect methods used for eliciting usability and UX information with a web survey. We present preliminary findings from two case studies, both using a remote web survey. First we review relevant background research on this topic: on internationalization and localization of products, on cross-cultural differences in usability and UX methods, and on frameworks for understanding cultural differences between users in their interpretation of the methods. Our first case study investigated the reactions of Nigerian and Anglo-Celtic (AC) participants to two types of visual material, photos and sketches, in the elicitation of their experiences with smartphones. The second case study investigated the reactions of Chinese and British participants to three different question types (Likert rating items, sentence completion questions and open-ended questions), again in the elicitation of their experiences with smartphones. Finally we draw some conclusions about cross-cultural issues in working with users in the design of interactive systems.

2 Background

There is a growing interest in the relationship between people’s culture and their use of interactive systems [2, 3], largely due to the globalization of markets for interactive systems. In addition, users around the world expect more from their interactive systems than only utility and usability: they are looking for positive user experiences [4]. Thus the design of UX needs to take into account users’ cultural contexts as well as the focus on content, brand and emotion emphasized by UX [5]. Therefore understanding the concept of culture and how it impacts on the UX of an interactive system is becoming an important part of the design of the system. As Ito and Nakakoji [6] note, much of cognitive reasoning depends on social norms and background culture. This cultural diversity makes it unrealistic for developers to rely only on intuition or personal experience when designing for good UX in cross-cultural contexts [7]. Oyugi et al. [8] suggest that cultural differences in signs, meanings, actions, conventions, norms and values raise challenging issues in the design of usable interactive systems for different cultures.

Two major strategies are adopted in the attempt to meet the demands of globalization: internationalization and localization [2]. Internationalization is the designing of systems in such a way that they do not need to undergo changes to their core form and functionality in order to be adopted for various target markets [e.g. 2, 9]. Conversely, localization means that interactive systems are adapted to suit the specific cultures and languages of target markets thereby making the systems usable and acceptable by members of the target cultures [2, 9].

To create a basis for localization of interactive systems, the key factors differentiating cultures from one another need to be clearly identified [10]. Two broad types of issues related to cross-cultural design have been identified. Firstly, there are objective issues, such as language, format conventions for time of day, dates and numbers, text directionality in writing systems and so on [2, 7]. Secondly, there are subjective issues such as value systems, behavioural and intellectual systems of the cultural groups and the ways in which people in different cultures interact with interactive systems [7]. Young [9] argues that localization requires authentication of the design through methods such as ethnographic research to make sure that the design specifications are truly representative of the target cultures. Chavan et al. [1] also support the idea of authenticating localization to ensure that designs are not biased towards the designers’ own culture.

Furthermore, Oyugi et al. [8] argue that cultural aspects also need to be considered in the selection and use of methods in UX research and design. Cultural differences potentially affect the manner in which users interpret and react to user requirements studies, design exercises and evaluation studies. A majority of the methods used to conduct user research have originated in the “western” world. These “western” methods may not always be appropriate for research with users from other cultures, or users from other cultures may interpret aspects of these methods differently. These views are also supported by research by Vatrapu and Pérez-Quiñones [11] on structured interviews with Anglo-American and Indian participants, which found that interviewees identified more usability problems and made more suggestions when the interviewer was from the same cultural background.

The best-known framework for thinking about cross-cultural differences in value systems was developed by Hofstede [12], although it has also been strongly criticized. Hofstede studied over 116,000 people from over 50 countries. It was later discovered that all the participants were employees of IBM, and this discovery, along with other observations, led to considerable criticism of Hofstede’s work. Critics believed that IBM may have a culture of its own that might have influenced Hofstede’s findings, such that his dimensions may not apply to people outside the IBM world [13]. Another prominent criticism is the fact that Hofstede refers to culture in the national sense [14]. His study did not seem to take into account the fact that a country could have several distinct cultures within it.

Despite these criticisms, Hofstede’s findings have served as a useful foundation on which to conduct cross-cultural studies. Hofstede’s five cultural dimensions [12] provide a way to understand the influence of cultural differences on human-computer interaction. Therefore, these cultural dimensions have been one of the ways used to characterize the target cultures investigated in the current case studies. Hofstede’s cultural dimensions are as follows; the examples are mostly from China, Nigeria and the UK, the latter as an example of the Anglo-Celtic cultures used in this study.

  • Power Distance: the degree to which the less powerful members of a society accept that power is unequally distributed. Hofstede’s results found that both China and Nigeria have relatively high Power Distances while the UK has a relatively low Power Distance.

  • Individualism versus Collectivism: the high end of this dimension represents cultures in which individuals are more concerned with their individual needs. The low end represents cultures where the community acts as a whole and the collective needs of its members are more important than the needs of any one individual. Hofstede’s results found that the UK has relatively high Individualism when compared to high Collectivism in China and Nigeria.

  • Masculinity versus Femininity: perhaps the most controversial of Hofstede’s dimensions, this is concerned with the expected social and emotional roles of women and men in a culture. The higher end of this dimension represents a masculine culture where there is a preference for heroism, assertiveness, achievement and material reward for success. The low end represents a feminine culture where there is a preference for modesty, caring for the weak, cooperation and quality of life. China, Nigeria and the UK all have similar values on Masculinity-Femininity according to Hofstede’s findings.

  • Uncertainty Avoidance: the degree to which a culture tries to deal with the unpredictability of the future. The high end of this dimension represents cultures that are intolerant of unorthodox ways and use strict laws and rules of conduct to maintain some kind of predictability. The low end represents those cultures that are more flexible and welcoming of change. According to Hofstede’s findings, Nigeria has higher Uncertainty Avoidance than China or the UK.

  • Long-term versus Short-term Orientation: the degree to which a culture considers the future in its present actions. The high end of this dimension represents cultures with a long-term orientation: they are more interested in the long-run outcome of present situations, and individuals in these cultures are more likely to save and invest. The low end represents cultures with a short-term orientation: they have great respect for tradition and give little consideration to long-term outcomes, and individuals in these cultures do not have a strong habit of saving and investing and are more likely to buckle under social pressure to maintain an appearance of high social standing. According to Hofstede’s findings, the UK has a higher long-term orientation than Nigeria, and China has the highest long-term orientation of the three.

Bearing this research and particularly Hofstede’s dimensions in mind, we now turn to the two case studies.

3 Case Study I: Cross-Cultural Differences in Reactions to Visual Materials

The first study investigated the use of two types of visual materials, photographs and sketches, in the elicitation of users’ experiences with smartphones. The latter are often perceived by designers to be less susceptible to cultural influences [15]. We worked with respondents from two very different cultures, Nigeria and the Anglo-Celtic (AC) cultures, being the UK and its ex-colonies that are populated largely from the UK and Ireland: Australia, Canada and New Zealand. Obviously, both Nigeria and the AC countries have a variety of cultures within them, but each has an identifiable dominant culture.

3.1 Method

3.1.1 Participants

Participants were 92 people who responded to an online survey. 50 were Nigerian and 42 were British or from other Anglo-Celtic countries.

The Nigerian participants comprised 25 women and 23 men. Ages ranged from 18 to over 60, with a median age group of 21–30 years. Respondents rated their proficiency in English (on a scale from 1 = beginner to 5 = native speaker level) and gave a mean rating of 4.43 (SD = 0.93). 48 (96.0 %) reported having at least one mobile phone (two respondents did not answer this question, so we cannot say whether they had a mobile or not) and 32 (64.0 %) reported having at least one tablet computer.

The AC participants comprised 24 women and 18 men and included 36 respondents from the UK, three from Canada, two from Australia and one from Ireland. Ages ranged from 18 to over 60, with a median age group of 31–40 years. All the AC participants reported having at least one mobile phone and 31 (73.8 %) reported having at least one tablet computer.

3.1.2 Materials

Three scenarios were created, with both photo and sketch versions, starting from the internationalized storyboards developed by Walsh et al. [15]. The first task was to critique these materials from a Nigerian perspective. A critique group of 7 Nigerians was used for this purpose. Members were given the four Walsh et al. storyboards and asked to work through them and note all the things that would seem out of place in a Nigerian setting, that is, those things that would make it difficult for them to identify with the characters depicted in the storyboards and the activities being shown. The group produced 16 points on which the storyboards would not be appropriate for a Nigerian context, even though they were attempting to be international. Using these critiques as a basis, three of the scenarios were localized for the Nigerian cultural context, each with a photo and a sketch version. For example, Scenario 1 was about taking a photo on holiday, the main sketch showing a woman taking a photo of herself in front of the Eiffel Tower. The critique revealed that Paris is not a popular holiday destination with Nigerians, so the scene was changed to Dubai (with the Burj Al Arab in the background). In addition, the critique revealed that Nigerians would be unlikely to take a photo of themselves, a “selfie”, so the scenario was changed to a woman posing for a photo being taken by someone else. The four images used for this scenario in the current case study are shown in Fig. 1. The other two scenarios were checking information on the Web while travelling (on a bus or in a car) (Scenario 2) and having a casual social get-together with friends and sending photos to other friends (Scenario 3). Although the original sketches had been considered international, in light of the critique by the Nigerian group, we now considered them localized to the AC culture.

Fig. 1. Photos and sketches for the Holiday Photo scenario (top panels: Nigerian photo and sketch; bottom panels: AC photo and sketch)

An online questionnaire presented the three scenarios, each with a short textual introduction. After questions about the three scenarios, respondents were asked to rate the following two 5 point Likert items: Did you feel that the photos/storyboards helped or hindered you in answering the questions about the situations? Were the photos appropriate to the kind of person you are? The questionnaire also included a number of questions to collect demographic data and information about mobile phone use.

3.1.3 Procedure

The study was publicized widely amongst professional and personal contacts of the authors in the AC countries and Nigeria, via personal emails and messages to online discussion groups. The survey took 20–30 min to complete. All participants were entered into a prize draw for one of four prizes of gadgets worth £15 (approx USD 24).

3.2 Results

To analyse the Likert scale rating items, we first investigated the relationship between the responses to the questions. The two Likert scale questions that asked respondents to reflect on all three scenarios they had seen correlated significantly (p < 0.001), so we created an Overall Reaction score, being the mean response for each respondent on these two questions.
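The construction of the Overall Reaction score can be sketched as follows. This is a minimal illustration in Python, not the analysis code used in the study: the response values are invented, and the names `pearson_r` and `overall_reaction` are hypothetical. It checks how strongly two 5 point items correlate and then averages them for each respondent.

```python
from math import sqrt

# Hypothetical 5 point Likert responses, one entry per respondent.
# These values are invented for illustration; they are not study data.
helped = [4, 5, 3, 2, 4, 5, 1, 3]        # "helped or hindered" item
appropriate = [4, 4, 3, 2, 5, 5, 2, 3]   # "appropriate to the kind of person you are" item


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def overall_reaction(item_a, item_b):
    """Per-respondent mean of the two Likert items."""
    return [(a + b) / 2 for a, b in zip(item_a, item_b)]


r = pearson_r(helped, appropriate)
scores = overall_reaction(helped, appropriate)
print(f"r = {r:.2f}")
print(scores)
```

Averaging the two items is only reasonable when they correlate well, which is why the correlation is checked first; a significance test for the correlation is omitted from this sketch.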

A three-way ANOVA on the Overall Reaction scores found a significant main effect for Material Culture (F = 17.27, df = 1, 82, p < 0.001), that is, the culture depicted in the scenarios. Nigerian participants answered significantly more positively overall, with a mean score of 3.80 (SD: 0.82) compared to a mean score of 3.11 (SD: 0.78) for AC participants.

We were surprised to find that the Nigerian participants were more positive about both the Nigerian and the AC visual materials. We had expected that there would be cultural matching, with Nigerian participants favouring Nigerian materials and AC participants favouring AC materials. However, this result may be explained using Hofstede’s cultural value framework. Nigeria is a high Power Distance culture, which promotes reverence for authority and an unwillingness to offend those in authority. This unwillingness to offend may have made Nigerian participants reluctant to respond negatively. In this study, the researcher who asked the Nigerian participants to undertake the study may have been seen as an authority figure and expert. The Nigerian participants may not have wanted to offend the researcher by providing negative ratings that could be seen as criticism of the researcher’s work.

There was also a significant interaction between Material Culture and Material Type (F = 3.53, df = 1, 82, p < 0.05). Figure 2 shows that all participants (both Nigerian and AC) were equally positive about the photo materials, whether they depicted Nigerian or AC culture, whereas for the sketches, participants were more positive about AC materials than about Nigerian materials. This result is more difficult to interpret, but may relate to the fact that Nigerian participants are exposed to much AC visual material through the mass media, whereas AC participants are not generally exposed to Nigerian visual material.

Fig. 2. Interaction in overall reactions to visual materials (photos and sketches) in the two cultures depicted in the scenarios (Nigerian and AC)

We are continuing analysis of other results from this case study, in particular the open-ended comments that participants made to explain their ratings and their ratings of the individual scenarios.

4 Case Study II: Cross-Cultural Differences in Reactions to Different Question Types

The second study investigated the use of three different question types with British and Chinese respondents: Likert rating items, sentence completion questions and open-ended questions, again in assessing UX with smartphones. We wished to investigate both how participants from the two cultures react to providing information in different formats and when answering in their native language or a second language (so half the Chinese participants responded in Chinese and half in English).

4.1 Method

4.1.1 Participants

Participants were 96 people who responded to an online survey, 56 women and 40 men, average age 26.8 years (range 18–60 years). 36 British respondents answered in English, 30 Chinese respondents answered in Chinese and 30 English-speaking Chinese respondents answered in English. The Chinese participants who answered in English considered themselves proficient in English: they had studied English for an average of 11.0 years (range 1–20 years) and rated their proficiency on average 4.74 on a 7-point scale (1 = beginner to 7 = near native speaker).

4.1.2 Materials

A series of questions was developed about UX with smartphones that could be answered in three different formats: 7 point Likert rating items, sentence completion questions and open-ended questions. It was not possible to make questions across the three formats exactly equivalent, but they addressed similar issues. For example, an open-ended question asked “How does your smartphone make you feel?”, a sentence completion question asked “When I use my smartphone, I feel …” and Likert items asked participants to rate how much they agreed with the statements “I feel attached to my smartphone”, “My smartphone feels good in my hand” and “I find my smartphone inspiring”. In all there were 14 Likert items, 12 sentence completion questions and 7 open-ended questions.

These questions were embedded into a web survey that also included demographic questions and questions about the participants’ experience with the survey itself, in particular with the different question types. The survey was translated into Chinese by the native Chinese-speaking co-author and back-translated by another native Chinese speaker to ensure an accurate translation.

4.1.3 Procedure

The study was publicized widely amongst professional and personal contacts of the authors in China and the UK, via personal emails and messages to online discussion groups. The survey took 15–20 min to complete. All respondents were entered into a prize draw for one of four Amazon gift vouchers worth £10 (approx USD 17). The web survey was available for approximately 3 weeks to gather data from a sufficient number of participants.

4.2 Results

Participants were asked to rate the difficulty of answering each question type. The mean ratings for the three question types and the three groups of participants are shown in Fig. 3. A two-way analysis of variance showed there was a significant difference between the question types (F = 40.44, df = 2, 186, p < 0.001) and a significant difference between the three groups of participants (F = 8.17, df = 2, 93, p < 0.001), but no significant interaction between the two variables. As can be seen in Fig. 3, all participants found the Likert items the easiest (mean rating: 2.10) and there was little overall difference between the sentence completion and open-ended questions (Sentence Completion mean: 3.59; Open-Ended mean: 3.43). The UK participants found all question types the easiest (mean: 2.52), and although the Chinese participants found all the question types harder, there was surprisingly little difference between those answering in English and those answering in Chinese (mean for Chinese participants answering in English: 3.46; mean for Chinese participants answering in Chinese: 3.24).

Fig. 3. Mean ratings of difficulty of answering questions of different types for UK participants and Chinese participants (answering in English and Chinese)
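The degrees of freedom reported above are consistent with a mixed design in which participant group is a between-subjects factor and question type is a within-subjects factor. The text describes the analysis only as a two way analysis of variance, so this design is an assumption. The short Python sketch below, using the hypothetical helper `mixed_anova_df`, simply checks the arithmetic against the group sizes given in Section 4.1.1.

```python
def mixed_anova_df(group_sizes, n_within_levels):
    """Degrees of freedom for a mixed ANOVA with one between-subjects
    factor and one within-subjects factor (an assumed design)."""
    n_total = sum(group_sizes)
    n_groups = len(group_sizes)
    df_between = n_groups - 1                  # between-subjects effect
    df_between_error = n_total - n_groups      # subjects within groups
    df_within = n_within_levels - 1            # within-subjects effect
    df_within_error = df_between_error * df_within
    return {
        "group": (df_between, df_between_error),
        "question_type": (df_within, df_within_error),
        "interaction": (df_between * df_within, df_within_error),
    }


# Group sizes from Section 4.1.1: 36 British respondents, 30 Chinese
# respondents answering in Chinese, 30 Chinese respondents answering in English.
df = mixed_anova_df([36, 30, 30], n_within_levels=3)
print(df["group"])          # compare with the reported df = 2, 93
print(df["question_type"])  # compare with the reported df = 2, 186
```

Under this assumed design, the error term for the question type effect is based on subjects within groups crossed with question type, which is why its degrees of freedom are larger than those for the group effect.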

We are continuing analysis of the data from this case study, to investigate both the relationship between the different question types and differences between the three groups of participants.

5 Discussion and Conclusions

The results of our two studies show that designers need to think carefully about the cultural biases in the methods they use with potential users of the systems that they are developing. The first case study showed that storyboards that we thought were internationalized actually contained many cultural references that may be interpreted differently by participants from different cultures. In addition, it showed that participants with different cultural values may react differently to questions and that levels of agreement cannot be considered comparable across different cultural groups. The second case study showed a similar pattern of perceived levels of difficulty across the three groups. Contrary to our expectations, Chinese participants did not report greater difficulty or a different pattern of difficulty when answering in English compared to Chinese. However, further analyses may reveal differences in the quantity and quality of their answers. Our results will contribute to recommendations for conducting cross-cultural user research.