Human communication involves the transmission of abstract and concrete information using both verbal and nonverbal symbols (for a review, see Richmond & McCroskey, 2009). In the last few decades, and particularly since the beginning of the 21st century, innovations in technology have dramatically changed the ways that people communicate with each other. Increasing worldwide Internet usage and smartphone ownership, including in emerging economies (Pew Research Center, 2016), have introduced different forms of written communication mediated by information and communication technologies (ICTs). These include instant messaging and e-mail applications based on ICT-device operating systems (Android, iOS) or messaging services (e.g., Gmail, WhatsApp), VoIP system providers (e.g., Skype), social networking sites (e.g., Facebook), and social media platforms (e.g., Twitter).

Some authors have suggested that these forms of communication filter out social, affective, and nonverbal/visual cues and can lead to less effective communication outcomes (e.g., Walther, 1996; Walther & D’Addario, 2001). However, other studies have shown that the absence of such cues does not necessarily render communications less effective. Instead, this absence may promote the implementation of uncertainty reduction strategies to compensate for it (Antheunis, Valkenburg, & Peter, 2007, 2010). In particular, the use of paralanguage cues in written communication has been identified as a strategy to overcome the absence of certain cues, because they convey meaning (e.g., Lea & Spears, 1992). These cues include typographical marks (i.e., letters and numbers) and ideograms (e.g., graphic symbols), identified as “typographic or text-based emoticons” and “graphic emoticons,” respectively (e.g., Huang, Yen, & Zhang, 2008; Wang, Zhao, Qiu, & Zhu, 2014). In the late 1990s, the latter emerged as an independent strand of meaning and emotional expression through ideograms and pictographs that could be used across ICT platforms. These came to be known as emoji, created with the goal of facilitating mobile communication (Negishi, 2014; Nelson, Tossell, & Kortum, 2015).

In addition to their massive use in daily written communications, both emoticons and emoji are being used increasingly in applied domains, such as marketing and health, as well as stimulus materials in scientific research (e.g., Davidov, Tsur, & Rappoport, 2010; Hogenboom et al., 2013; Skiba, 2016; Thelwall, Buckley, & Paltoglou, 2012; Thelwall, Buckley, Paltoglou, Cai, & Kappas, 2010; Vashisht & Thakur, 2014; Wang & Castanon, 2015). However, their selection, coding, and analysis may be somewhat biased if we assume a direct correspondence between the users’ interpretations of emoji/emoticons and their intended meanings (e.g., a sad face emoji is negative and will be perceived as such).

In this study we report evaluations of emoticons and emoji provided by ICT users. Specifically, we present the Lisbon Emoji and Emoticon Database (LEED), and provide the first set of normative evaluations for 238 stimuli, comprising 85 emoticons and 153 emoji, based on seven evaluative dimensions: aesthetic appeal, familiarity, visual complexity, clarity, valence, arousal, and meaningfulness. In addition, we examined the meaning attributed to each stimulus. It is our contention that the LEED contributes to the literature by proposing subjective norms for emoji and emoticons and guaranteeing the quality of the codebooks used in both research and practice in a multitude of areas.

Emoticons and emoji in ICT-mediated communication

Emoticons and emoji have been considered a new medium to share daily narratives, emotions, and attitudes with others through ICTs (for a review, see Gülşen, 2016). Emoticons (from emotion + icon) are symbols created by using punctuation, numbers, or letters, with the intention of transmitting feelings, emotional states, or information in the absence of words, or complementing a written message (Dresner & Herring, 2010; Krohn, 2004; Thompson & Filik, 2016). The first known emoticons :( and :) were proposed in 1982 and are attributed to Scott E. Fahlman, a professor at Carnegie Mellon’s School of Computer Science, who created them in an attempt to differentiate serious posts from joke remarks on a bulletin board (see Footnote 1). Since then, emoticons have hugely increased in number, and the current list of emoticons is extensive, running from simple symbols to highly complex ones (e.g., www.netlingo.com/smileys). Emoticons include representations of facial expressions, typically sideways [Western style; e.g., ;)], as well as representations of abstract concepts and emotions/feelings (e.g., <3). Other emoticons are represented in a right-way-up position [Eastern style; e.g., (*^.^*)].

Emoji (from the Japanese e [picture] + moji [character]) are graphic symbols with predefined names/IDs and code (Unicode), which include not only representations of facial expressions, abstract concepts, and emotions/feelings, but also animals, plants, activities, gestures/body parts, and objects. Emoji are presumed to have been first proposed by Shigetaka Kurita during the late 1990s, who created them while working at a mobile phone operator in Japan to facilitate mobile communication (Negishi, 2014). Currently, more than 2,000 emoji are supported by different platforms, and they are constantly evolving and becoming more diverse (http://emojipedia.org). For instance, new Unicode releases (e.g., Unicode 9.0, released in 2016) include emoji that represent different social groups, varying, for example, in ethnicity and age.

There are major differences between emoji and emoticons (Ganster, Eimler, & Krämer, 2012). As compared to emoticons, emoji are colored, are not rotated by 90°, and in those representing facial expressions, the face is often delimited by a circle and may include multiple facial cues.

However, both emoticons and emoji are increasingly popular in our everyday life. They are a constant presence in the ways we communicate in the virtual world (e.g., social media, e-mail, and text messages; Gülşen, 2016). Emoji are also being included in everyday products (e.g., toys, home decoration items, or even clothes). Moreover, emoji have been integrated in the ways artists communicate with their audience (e.g., Katy Perry’s “Roar” music video) and the ways brands connect with consumers (for a review, see Wohl, 2016). For instance, brands have included emoji in advertising campaigns (e.g., McDonald’s used people with emoji as their heads; Beltrone, 2015) and developed new sets of brand-related emoji (e.g., Dove launched a set of curly-haired emoji; Neff, 2015). In another example of emoji popularity, Oxford Dictionaries considered the emoji “face with tears of joy” to be “the word of the year 2015.” On Twitter alone, this emoji registered 6.6 billion uses that year (@TwitterData).

Scientific research about emoticons and emoji is also increasing. Some studies have examined naturalistic data, such as public messages posted on social media platforms (e.g., Twitter, Google forums, or Facebook) to understand and characterize emoticon/emoji usage. For example, Novak, Smailović, Sluban, and Mozetič (2015) proposed the emoji sentiment ranking, an index of positivity based on the frequency of each emoji used in negative, neutral, and positive tweets. Also, Ljubešić and Fišer (2016) used tweets as their dataset to investigate how popular emoji are on Twitter, which countries exhibit greater emoji usage, and the popularity of specific emoji. Similarly, Tossell and colleagues (2012) conducted a longitudinal study monitoring the use of emoticons in text messages. This type of descriptive analysis can also be conducted in specific domains. For example, Vidal, Ares, and Jaeger (2016) examined tweets about eating situations and how people used emoticons/emoji to spontaneously express food-related emotional experiences. Other studies used similar naturalistic data to monitor a given event (e.g., public health information; Paul & Dredze, 2011) or to examine event-centered reactions, opinions, feelings, evaluations, or emotions (e.g., elections; Burnap, Gibson, Sloan, Southern, & Williams, 2016). Even though these studies have typically relied on emotional word lexicons, more recently researchers have drawn attention to the need to extend these lexicons to include emoticons and emoji (B. Liu, 2012; Pang & Lee, 2008).

Research focusing on emoticon/emoji usage and functions has suggested that these stimuli serve two key functions: to portray emotional or social intent and to reduce potential discourse ambiguity (for a review, see Kaye, Wall, & Malone, 2016). Skovholt, Grønning, and Kankaanranta (2014) showed that such stimuli also function as contextualization cues (e.g., markers of positive attitudes that facilitate message interpretation) and as organizers of social relationships in written interaction (e.g., reducing perceived interpersonal distance by decreasing impersonality/formality). As examples of these functions, Lo (2008) showed that adding emoticons to online messages improved receivers’ understanding of the intensity and valence of the emotions (sad vs. happy) and attitudes (like vs. dislike) expressed by the sender. Likewise, Ganster and colleagues (2012) showed that using a smiling (vs. a frowning) emoji/emoticon influences how a message is evaluated (i.e., more positive and humorous), how the sender is perceived (i.e., more extroverted), and how the receiver feels (i.e., a more positive mood). Derks, Bos, and von Grumbkow (2008) further showed that emoticons strengthen the intensity of a message (e.g., a positive message with a smile emoticon is rated more positively than the same positive message without the emoticon). However, in the case of incongruence between the valences of the message and the emoticon (e.g., a positive message accompanied by a frown emoticon), a message’s interpretation relies more on the text content.

Another line of research has adopted experimental methodologies to examine how the presentation of emoticons/emoji influences different phenomena. For example, Wang and colleagues (2014) focused on the effects of adding positive and negative emoji to messages regarding workplace performance on acceptance of negative feedback. Likewise, Tung and Deng (2007) tested how the presentation of emoji in an e-learning environment affected children’s motivation. Furthermore, Siegel and colleagues (2015) investigated whether including emoji on food packages influenced children’s meal choices. Emoji and/or emoticons have also been used as the experimental materials in studies focusing on affective processing (e.g., Han, Yoo, Kim, McMahon, & Renshaw, 2014; Kerkhof et al., 2009; Yuasa, Saito, & Mukawa, 2011). For example, positive and negative emoji have been used as primes to induce valence, influencing responses (event-related potentials) to valenced target words (e.g., Comesaña et al., 2013). Research has shown that novel target words primed with positive emoji are more likely to be erroneously categorized as familiar (e.g., Garcia-Marques, Mackie, Claypool, & Garcia-Marques, 2004). Finally, emoji/emoticons have been used for research method development—for example, as anchors in rating scales assessing current emotional states (e.g., Moore, Steiner, & Conlan, 2013), emotional associations with specific stimuli (e.g., food names; Jaeger, Vidal, Kam, & Ares, 2017), well-being (Fane, MacDougall, Jovanovic, Redmond, & Gibbs, 2016), and pain (e.g., Chambers & Craig, 1998).

Methodologies and tools for emoticons/emoji analysis

The selection, coding, and analysis of emoticons and emoji as direct indicators of the emotional meanings conveyed by messages can follow either human-based (e.g., Park, Baek, & Cha, 2014; Vidal et al., 2016) or computer-based (Davidov et al., 2010; Hogenboom et al., 2013; Vashisht & Thakur, 2014; H. Wang & Castanon, 2015) procedures. A computer-based procedure relies on machine-learning algorithms and semantic lexicons and is thought to provide a more objective analysis of emoticon/emoji usage. However, both human-based and computer-based procedures may be prone to bias because they rely exclusively on the evaluations of, and the meanings attributed by, researchers/analysts, without taking into consideration the ways the stimuli are perceived by the users. One area in which this has been particularly worrisome is the field of computer-based sentiment analysis (Thelwall et al., 2012; Thelwall et al., 2010), which allows for detecting and analyzing sentiment/affective reactions on the basis of semantic analysis of written text. Such analyses rely on codebooks developed by researchers on the basis of the commonly accepted designations/feelings portrayed by emoticons and emoji (e.g., emoticon-smoothed language models; Liu, Li, & Guo, 2012; SentiStrength coding manual for sentiment in texts available at http://sentistrength.wlv.ac.uk; e.g., Thelwall et al., 2012; Thelwall et al., 2010).

The emoji sentiment ranking (Novak et al., 2015) constitutes an attempt to overcome some of these limitations. However, this index focuses exclusively on the valence dimension and does not take into account other relevant information, such as the level of arousal elicited by a given emoji or the meaning attributed to it. Therefore, standardized procedures for the classification of emoticons/emoji are still missing.

In our view, this state of affairs may have two potential problems. First, the stimulus selection, coding, and analysis may be prone to biases due to researchers’ own evaluations of the stimuli (e.g., analyses based on ad-hoc emotionality categorization made by two coders; Park et al., 2014). Second, there may be a biased assumption that emoticon/emoji users’ interpretations necessarily correspond to the meanings intended by developers/researchers. Because emoji/emoticons are not usually labeled when presented (with the exception of the Facebook emoji set), they are open to interpretation. Indeed, users can select an emoji on the basis of superficial visual features, which can lead to misinterpretations of its meaning and intent. For example, one may wish to express sadness by selecting a tearful emoji, and mistakenly choose “face with tears of joy” instead of “face with tears of sadness”. Additionally, the same emoticon/emoji can be used to represent a variety of meanings. For instance, a smiley face may be used to express happiness, but it may also be used to express agreement with or liking of something/someone, one’s own physical or mental well-being state, empathy, comprehension, or other meanings. Moreover, the same emoticon/emoji can be interpreted differently according to the communication context. For example, emoticons such as :p and ;) are typically described as positive, but they can also be used as markers of irony (Carvalho, Sarmento, Silva, & de Oliveira, 2009) or sarcasm (Thompson & Filik, 2016). Finally, emoji with the same intended meaning may have distinct visual representations across operating systems, potentially leading to different interpretations and evaluations (Miller et al., 2016). To sum up, as with other types of visual stimuli, emoji/emoticons are prone to subjectivity in their evaluation and interpretation, which supports the need to develop a normative database.

Normative data are abundant in the literature (for reviews, see Prada, Rodrigues, Silva, & Garrido, 2015; Proctor & Vu, 1999). These validated databases typically include stimuli such as words (e.g., Bradley & Lang, 1999a), sounds (e.g., Bradley & Lang, 1999b), or images depicting a broad range of contents (e.g., Dan-Glauser & Scherer, 2011; Lang, Bradley, & Cuthbert, 2008). Regarding the latter type of stimuli, some databases include, for example, visual materials such as simple line drawings (e.g., Bonin, Peereman, Malardier, Méot, & Chalard, 2003; Snodgrass & Vanderwart, 1980) or symbols (e.g., McDougall, Curry, & Bruijn, 1999; Prada et al., 2015). Other databases are theme-focused and include specific contents, such as food (e.g., Blechert, Meule, Busch, & Ohla, 2014; Charbonnier, van Meer, van der Laan, Viergever, & Smeets, 2016) or human faces (e.g., Ebner, Riediger, & Lindenberger, 2010; Garrido et al., 2016; Mendonça, Garrido, & Semin, 2016).

The absence of published normative data on visual stimuli such as emoticons and emoji has two important consequences. First, it implies that researchers should make the additional effort of pretesting materials to meet a study’s demands. For example, prior to their affective-priming study, Comesaña and colleagues (2013) had to conduct two extensive pretests in which 180 participants evaluated the valence, arousal, and meaning associated with each emoji. Second, the comparison of results between studies can be challenging because the stimuli are often categorized ad hoc. For example, in their study on tweets about food, Vidal and colleagues (2016) had two coders categorizing the emoji and emoticons as negative, neutral, or positive by considering their intended meaning or available description. Park and colleagues (2014) also had two coders categorizing emoticons on three levels, but they considered a different dimension (i.e., emotionality: sad, neutral, and happy) and distinct criteria (emotions conveyed by shape of the eyes and by the shape of the mouth).

In the present article, we present normative ratings of a set of emoticons and emoji from the two most used operating systems, Android and iOS. We also included “reaction” emoji from the most used social networking platform, Facebook. Each stimulus was evaluated with regard to its aesthetic appeal, familiarity, visual complexity, semantic clarity, the valence and arousal of the meaning conveyed, and meaningfulness. Additionally, we assessed the subjective meanings attributed by participants to each stimulus. We selected this set of seven evaluative dimensions on the basis of previous norms with other types of visual stimuli. Specifically, we followed the methodology adopted in a recent validation study (for a detailed review of the dimensions of interest, see Prada et al., 2015), with the exception of adding the dimension of clarity, which has emerged as being relevant for the evaluation of facial expressions (for a review, see Garrido et al., 2016).

Method

Participants

A sample of 505 Portuguese individuals (71.7% women; M age = 31.10, SD = 12.70) volunteered to participate in a Web survey. These individuals were recruited online through Facebook (university institutional page and online studies advertisement pages) and mailing services (student mailing lists). All participants were native Portuguese speakers or had lived in Portugal for the last 5 years. The sample comprised mostly university students (46.7%) and active workers (43.3%), with at least a bachelor’s degree (46.0%). Most participants indicated that Android/Google (67.5%) or iOS (26.3%) was their operating system.

Stimulus set

The LEED includes 238 stimuli (see Footnote 2): 85 emoticons and 153 emoji (77 from iOS, 63 from Android, 9 from Facebook, and 4 from Emojipedia), mostly representing facial expressions of emotions (e.g., “happy face”) and/or symbolic meanings (e.g., “silence”; see Footnote 3).

The emoticon set was developed on the basis of the list of emoticons presented in the “Twitter emotion coding instructions” (see Footnote 4) for the SentiStrength tool (Thelwall et al., 2010; adapted from Wiebe, Wilson, & Cardie, 2005), used for sentiment detection in short texts. This list included 63 Western-style emoticons (e.g., Emot07; see Fig. 1) and 23 Eastern-style emoticons (e.g., Emot56a; see Fig. 1). One symbol was removed due to its unavailability in mobile phone text packages.

Fig. 1 Sample emoticons and emoji across operating systems for “laughing” and “crying” (the stimulus codes are included)

Because a given emoticon can sometimes vary in its presentation, variations of the same stimulus were included. For example, Emot01 (“laughing, big grin”) has three variations identified in the database, from Emot01a to Emot01c. Each emoticon was generated in black, 28-point Arial font on a white background and was saved as a single image file (72 × 68 pixels, 72 dots per inch, RGB, PNG format).

According to information available from the Unicode Foundation (http://unicode.org/emoji/charts/full-emoji-list.html), we selected emoji with intended meanings similar to the emoticons. Figure 1 depicts examples of the emoticons and emoji for “laughing” and “crying.” As in the case of emoticons, variations of the same emoji were included. The set of 153 emoji was extracted from the Emojipedia database (http://emojipedia.org/) and included stimuli from the two most used and available operating systems at the time the study was performed: Apple iOS 9.3 (used in iPhone, iPad, iMac, Apple Watch, and Apple TV) and Google Android 6.0.1 (used in Android devices, the Gmail Web interface, Google Hangouts, and the Google Chrome Internet browser).

Emoji were matched across operating systems according to their Unicode references. Of the 153 emoji, 63 stimuli were represented in both operating systems (all 63 Android emoji had a corresponding iOS emoji), 14 were only represented in the iOS operating system (e.g., EmjAp51), and 8 were represented in both operating systems and in the Facebook reaction set (see Fig. 1 and Footnote 5). The latter subset included nine emoji: the like/dislike buttons (EmjFb76 and EmjFb77, respectively), the recently added “Facebook reactions” (five faces expressing emotions, EmjFb07–EmjFb67, and one heart symbol, EmjFb71), and the new “like” button (EmjFb78). Finally, four Emojipedia images (EmjPe86–EmjPe89) were also included in the final set. These Unicode 9.0 emoji were not available in the Android or iOS operating systems at the time of the study (e.g., EmjPe89), but were included to represent potentially official future emoji not currently available. Each emoji was saved as a single image file (72 × 72 pixels, 72 dots per inch, RGB, PNG format).

The vast majority of the emoji set represents facial expressions (88.89%), with the exceptions of popular symbols (3.27%; e.g., heart, EmjAn71; heartbreak, EmjAn72) and hand gestures (7.84%; e.g., hand palm, EmjAp75).

Procedure and measures

The study was conducted using the Qualtrics software. Participants were invited to collaborate on a Web survey about the perception and evaluation of emoticons and emoji. After clicking on the hyperlink, participants were directed to a secure webpage and were informed about the goals of the study and its expected duration (approximately 20 min). The initial instructions provided definitions of emoji and emoticons, and examples of each type of stimulus (emoticons and emoji) were presented. To avoid overlap, these examples were different from the stimuli used in the evaluation task. Participants were also informed that all the data collected would be treated anonymously and that they could abandon the study at any point by closing the browser, without their responses being considered for analysis.

After providing their informed consent to collaborate in the study (by checking the “I agree” option), participants were asked to provide information regarding their age, sex, educational level, current occupation, and their operating system. Following this, they were given specific instructions to evaluate each stimulus in seven evaluative dimensions, namely: aesthetic appeal, familiarity (subjective frequency), visual complexity, clarity, valence, arousal, and meaningfulness (all dimensions rated using 7-point Likert-type scales; the detailed instructions for each scale are presented in Table 1; see also Garrido et al., 2016; Prada et al., 2015). These dimensions were randomly presented per trial in the evaluation task. Finally, participants were requested to write the first meaning or emotion that came to their mind for each stimulus in an open-ended response format, or alternatively select the option “I don't know” if they were not able to provide a specific meaning or emotion. The instructions also emphasized that responses would have to be fast and spontaneous and that there were no right or wrong answers.

Table 1 Instructions and scale anchors for each dimension

Participants then proceeded to the main task. To prevent fatigue and demotivation, each participant only saw a subset of 20 randomly selected stimuli from the available pool of 238 stimuli. Each stimulus was presented on a single page of the Web survey. We used a forced response option, such that participants were required to answer each question to progress in the survey. The number of participants evaluating each stimulus varied from 40 to 49. The stimuli were always presented at the top left corner of the page, with all evaluative dimensions presented below. Upon completing the task, participants were thanked and debriefed.
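The sampling scheme described above can be sketched in a few lines. This is a minimal illustration, not the Qualtrics configuration actually used: the function name `draw_subset` and the seed are hypothetical, and the sketch assumes each participant's 20 stimuli are drawn uniformly and independently without replacement from the pool of 238.

```python
import random

N_STIMULI = 238       # size of the LEED pool (from the text)
SUBSET_SIZE = 20      # stimuli shown to each participant (from the text)
N_PARTICIPANTS = 505  # sample size (from the text)

def draw_subset(rng, pool_size=N_STIMULI, k=SUBSET_SIZE):
    """Return k distinct stimulus indices, sampled uniformly without replacement."""
    return rng.sample(range(pool_size), k)

# Expected number of raters per stimulus under uniform sampling.
expected_raters = N_PARTICIPANTS * SUBSET_SIZE / N_STIMULI
print(round(expected_raters, 1))  # 42.4

# Simulate one data collection to see how rater counts spread (assumed seed).
rng = random.Random(0)
counts = [0] * N_STIMULI
for _ in range(N_PARTICIPANTS):
    for idx in draw_subset(rng):
        counts[idx] += 1
print(min(counts), max(counts))
```

The expected value of about 42 raters per stimulus agrees with the reported range of 40 to 49. Fully independent sampling would typically produce a wider spread than that range, which suggests that presentation counts were balanced by the survey software, although the text does not say so.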

Results

The norms for the full set of stimuli are provided as supplementary material (see also www.osf.io/nua4x). In the following sections we present (a) the preliminary analysis regarding outlier detection, (b) the analysis of the differences by gender and operating system, (c) the subjective rating norms for each dimension, (d) the correlations between evaluative dimensions, and (e) the analysis of attributed meaning/emotion.

Preliminary analysis

Because only completed surveys were included in the analysis, there were no missing data. Outliers were determined in terms of the criterion of 2.5 standard deviations above or below the mean evaluation of each stimulus on a given dimension. This analysis yielded a small percentage (1.32%) of outlier ratings. Moreover, none of the participants responded systematically in the same way (i.e., using the same value of the scale). Therefore, no participants were excluded.
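The outlier criterion can be illustrated as follows. This is a sketch under stated assumptions: the function name `flag_outliers` is hypothetical, the ratings are invented for illustration, and the sample standard deviation (n − 1 denominator) is assumed because the text does not specify which estimator was used.

```python
from statistics import mean, stdev

def flag_outliers(ratings, criterion=2.5):
    """Flag ratings more than `criterion` SDs above or below the stimulus mean.

    `ratings` holds all ratings given to ONE stimulus on ONE dimension.
    Uses the sample SD (assumption; the source does not say which SD).
    """
    m = mean(ratings)
    sd = stdev(ratings)
    if sd == 0:
        # Everyone gave the same rating, so nothing can be an outlier.
        return [False] * len(ratings)
    return [abs(r - m) > criterion * sd for r in ratings]

# Invented example: 19 raters chose 6 and one chose 1 on the 7-point scale.
ratings = [6] * 19 + [1]
flags = flag_outliers(ratings)
print(sum(flags))  # 1
```

Because the criterion is applied per stimulus and per dimension, a rating of 1 is only flagged when the other raters of that stimulus agree closely, as in the invented example.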

Emoticons and emoji evaluations

When we compared the evaluations of emoticons and emoji on each dimension for the total sample (see Table 2), the overall results showed that emoji (vs. emoticons) were rated as aesthetically more appealing, t(498) = –24.82, p < .001, d = 1.11; more familiar, t(498) = –23.73, p < .001, d = 1.06; clearer, t(497) = –31.45, p < .001, d = 1.41; more positive, t(498) = –2.50, p = .013, d = 0.11; more arousing, t(498) = –21.51, p < .001, d = 0.96; and more meaningful, t(498) = –31.00, p < .001, d = 1.39.
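The text does not state how the effect sizes were computed. One common convention for paired comparisons is d = |t| / √n, with n = df + 1, and the reported values are consistent with it. The sketch below recomputes d from the reported t statistics under that assumption; the function name `paired_d_from_t` is hypothetical.

```python
import math

def paired_d_from_t(t, df):
    """Cohen's d for a paired t test, assuming d = |t| / sqrt(n) with n = df + 1."""
    return abs(t) / math.sqrt(df + 1)

# (t, df, reported d) as given in the text for each dimension.
reported = {
    "aesthetic appeal": (-24.82, 498, 1.11),
    "familiarity":      (-23.73, 498, 1.06),
    "clarity":          (-31.45, 497, 1.41),
    "valence":          (-2.50,  498, 0.11),
    "arousal":          (-21.51, 498, 0.96),
    "meaningfulness":   (-31.00, 498, 1.39),
}

for dimension, (t, df, d_reported) in reported.items():
    d = paired_d_from_t(t, df)
    print(f"{dimension}: recomputed d = {d:.2f}, reported d = {d_reported:.2f}")
```

Under this convention the valence difference, although statistically significant, corresponds to a very small effect (d = 0.11), whereas the differences on the other five dimensions are large.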

Table 2 Evaluations of each dimension (means and standard deviations) for emoticons and emoji, for the total sample and for men and women, as well as mean difference tests

Gender differences

The general results of the comparison between emoticons and emoji were also observed in the subsamples of both women and men. However, men provided equivalent valence ratings for the emoticons and emoji. We also tested for gender differences in the evaluations of emoticons and emoji for each dimension. As shown in Table 2, no gender differences emerged in the ratings of emoticons. When we replicated this analysis for emoji, the results showed that women evaluated emoji as being more familiar, clear, and meaningful than did men, all ps ≤ .015. This pattern of results remained the same after controlling for the main operating systems used by participants, all ps < .019.

Operating system differences

Emoji evaluations were compared between the Android and iOS operating systems (Table 3). These results showed that iOS emoji were evaluated as being more aesthetically appealing, familiar, clear, and meaningful, all 5,000-sample bootstrapped ps ≤ .006. In contrast, no differences between operating systems were found for visual complexity, valence, and arousal, all 5,000-sample bootstrapped ps ≥ .059.

Table 3 Evaluations of each dimension (means and standard deviations) for Android and iOS emoji for the total sample, as well as mean difference tests

Subjective rating norms

To define subjective rating norms, data were further coded and analyzed by stimulus. For each stimulus, we calculated frequencies, means, standard deviations, and confidence intervals (CIs) in each dimension (see Appendix 1 in the supplementary material). On the basis of these results, stimuli were categorized as low, moderate, or high in each dimension (for a similar procedure, see Prada et al., 2015). When the CI included the response scale midpoint (i.e., 4), stimuli were considered “moderate” in a given dimension. Stimuli were categorized as “low” when the upper bound of the CI was below the scale midpoint and as “high” when the lower bound of the CI was above the scale midpoint. In the case of valence, “low” means negative, “moderate” means neutral, and “high” means positive. Figures 2 and 3 present summaries of this analysis for emoticons and emoji separately.
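The categorization rule can be expressed compactly. The sketch below is illustrative only: the function name `categorize` and the example ratings are hypothetical, and it assumes a 95% CI based on the normal approximation (z = 1.96), since the text does not state the confidence level or the method used to build the intervals.

```python
import math
from statistics import mean, stdev

def categorize(ratings, midpoint=4.0, z=1.96):
    """Classify one stimulus on one dimension as 'low', 'moderate', or 'high'.

    Builds a CI around the mean rating (assumed: 95%, normal approximation)
    and compares its bounds with the midpoint of the 7-point scale.
    """
    m = mean(ratings)
    half_width = z * stdev(ratings) / math.sqrt(len(ratings))
    lower, upper = m - half_width, m + half_width
    if upper < midpoint:
        return "low"       # whole interval below the midpoint
    if lower > midpoint:
        return "high"      # whole interval above the midpoint
    return "moderate"      # interval includes the midpoint

# Invented ratings for three hypothetical stimuli.
print(categorize([6, 7, 6, 7, 6, 6, 7, 6]))  # high
print(categorize([1, 2, 1, 2, 2, 1, 1, 2]))  # low
print(categorize([3, 4, 5, 4, 3, 5, 4, 4]))  # moderate
```

One consequence of this rule is that a stimulus with highly variable ratings is more likely to be classified as moderate, because a wide interval is more likely to include the midpoint.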

Fig. 2 Emoticon frequency distributions for each dimension level. For valence: low = negative, moderate = neutral, high = positive

Fig. 3 Emoji frequency distributions for each dimension level. For valence: low = negative, moderate = neutral, high = positive

As is shown in Fig. 2, the majority of emoticons were categorized as being low in aesthetic appeal (76.47%), familiarity (57.65%), and clarity (50.59%), and as being moderately arousing (55.29%). Moreover, the results show that most emoticons were categorized as being low (48.24%) or moderate (44.71%) in complexity, and as low (43.53%) or moderate (36.47%) in meaningfulness. Regarding valence, the emoticons were distributed across the three levels: negative (42.35%), neutral (30.59%), and positive (27.06%).

Figure 3 shows that the majority of emoji were categorized as being highly familiar (58.82%), clear (79.08%), arousing (65.36%), and meaningful (88.24%). The results further show that emoji were categorized as being high (49.02%) or moderate (45.10%) in aesthetic appeal, and moderate (54.25%) or low (43.14%) in complexity. Note that the emoji were somewhat polarized in their valence, being mostly categorized as either negative (49.02%) or positive (42.48%) in this dimension. Figure 4 depicts examples of emoticons and emoji for each level of each evaluative dimension.

Fig. 4 Sample emoticons and emoji for each level across dimensions (LEED stimulus codes are included). For valence: low = negative, moderate = neutral, high = positive

Correlations between dimensions

Overall, the results showed significant correlations between the dimensions (see Table 4). For example, meaningfulness was strongly correlated with aesthetic appeal (r = .547), familiarity (r = .648), clarity (r = .743), and arousal (r = .506). Clarity was strongly associated with aesthetic appeal (r = .538) and familiarity (r = .704). Aesthetic appeal was also strongly associated with familiarity (r = .556).

Table 4 Pearson’s correlations between the dimensions

Analysis of attributed meaning/emotion

In addition to meaningfulness ratings, participants were asked to indicate the meaning or emotion attributed to each stimulus. The percentage of responses was computed relative to the number of participants who evaluated a given stimulus. Two independent judges coded the meaning/emotion attributed by the participants to each symbol (for a similar strategy, see, e.g., Prada et al., 2015). Synonyms (e.g., “don’t speak” and “silence,” EmjAp31) and singular/plural forms (e.g., “smiles” and “smile,” Emot01c) were included in the same category. The meaning of 15 emoticons was not categorized due to a low percentage of responses (i.e., <25%). For example, of the 42 participants who evaluated Emot32, only eight indicated a meaning, of which two were categorized as “smile,” two as “ignore,” and the remaining were uncategorized. Note that the sum of percentages of both categories does not necessarily equal 100. For example, 48.4% of the valid responses for EmjAp47 were categorized as “glad,” 25.8% as “upside down,” whereas the remaining responses (n = 8) were heterogeneous and therefore uncategorized (e.g., “normality,” “sarcasm”).
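The percentages in this analysis can be reproduced with simple counting. The sketch below is a hypothetical illustration: the function names are invented, and the counts for EmjAp47 (15 “glad,” 8 “upside down,” 8 uncategorized, 31 valid responses in total) are a reconstruction inferred from the reported percentages rather than figures given in the text.

```python
from collections import Counter

def response_rate(n_valid, n_raters):
    """Percentage of raters who wrote a meaning instead of choosing 'I don't know'."""
    return round(100 * n_valid / n_raters, 1)

def top_categories(coded, k=2):
    """Return the k most frequent coded categories as (category, % of valid responses).

    `coded` has one entry per valid response; None marks a response that the
    judges left uncategorized.
    """
    n_valid = len(coded)
    counts = Counter(c for c in coded if c is not None)
    return [(cat, round(100 * n / n_valid, 1)) for cat, n in counts.most_common(k)]

# Emot32: 8 of 42 raters gave a meaning (from the text), below the 25% cutoff.
print(response_rate(8, 42))  # 19.0

# EmjAp47: counts reconstructed from the reported percentages (assumption).
emjap47 = ["glad"] * 15 + ["upside down"] * 8 + [None] * 8
print(top_categories(emjap47))  # [('glad', 48.4), ('upside down', 25.8)]
```

The example also shows why the two category percentages need not sum to 100: the uncategorized responses remain in the denominator.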

The percentages of meaning responses varied between 4.3% (Emot75) and 95.0% (Emot01a) for emoticons (M = 49.9%, SD = 24.1); between 46.9% (EmjAn24) and 100% (e.g., EmjAn71) for Android emoji (M = 84.6%, SD = 11.9); and between 48.8% (EmjAp24) and 100% (e.g., EmjAp57) for iOS emoji (M = 86.9%, SD = 11.3). The percentages varied between 90.7% (EmjFb17) and 100% (e.g., EmjFb76) for the Facebook emoji (M = 95.7%, SD = 2.9), and between 74.4% (EmjPe86) and 97.8% (e.g., EmjPe88) for the Emojipedia emoji (M = 82.9%, SD = 10.8). Within each operating system, results regarding the first category showed that, on average, participants agreed on the meanings of both the Android (64.95%) and iOS (66.78%) emoji.

A detailed discussion of the meaning or emotion attributed to each stimulus would be too extensive. The complete meaning analysis is presented in Appendix 2 in the supplementary material alongside the Unicode intended meaning for comparison purposes. In some cases, the meaning categorization converged with the Unicode intended meaning. For instance, participants attributed a congruent meaning to the “winking face” stimulus in its different formats. For the emoticon (Emot08a) the most frequent meanings were “wink” (40.5%) and “agree” (21.6%), for the iOS emoji (EmjAp08) these were “agree/compliance” (40.0%) and “wink” (28.6%), and for the Android emoji (EmjAn08) these were “wink” (40.6%) and “compliance” (25.0%).

In other cases there was only partial convergence. For example, the emoji “face savoring delicious food” was interpreted as “cheeky/fun” (63.2%) and “tasty” (18.4%) in the iOS emoji (EmjAp10), and as “wink/cheeky” (59.4%) and “tasty” (12.5%) in the Android emoji (EmjAn10). In another example, the emoji “imp” was attributed the meanings “evil” (60.0%) and “mischief/prank” (30.0%) in the iOS emoji (EmjAp70), and “evil/mischief” (62.5%) and “rage” (22.5%) in the Android emoji (EmjAn70).

For other stimuli, the attributed meaning differed across operating systems and from the Unicode intended meaning. For example, the emoji “dizzy face” was attributed the meaning “shocked” (66.7%) in the iOS emoji (EmjAp66), and “confusion” (46.5%) and “hypnotized” (18.6%) in the Android emoji (EmjAn66). These examples clearly illustrate that the meaning participants assign to emoji is not always convergent with their Unicode intended meaning and also varies across operating systems.

Discussion

In this article we have presented the LEED, which includes 238 emoticons and emoji, evaluated across seven evaluative dimensions: aesthetic appeal, familiarity, visual complexity, clarity, valence, arousal, and meaningfulness. Additionally, participants attributed meaning to each stimulus. To our knowledge, this is the first available emoticon/emoji normative database.

Results showed that, in comparison to emoticons, emoji are perceived as more aesthetically appealing, familiar, clear, and meaningful. Most emoticons were categorized as low in aesthetic appeal, familiarity, clarity, valence, and meaningfulness, whereas most emoji were categorized as high in familiarity, clarity, arousal, and meaningfulness. This may be associated with the increasing popularity and use of emoji. Indeed, recent evidence shows that as emoji usage has increased, the usage of emoticons has decreased (Pavalanathan & Eisenstein, 2015). Furthermore, in the case of stimuli depicting facial cues, the graphical representation of emoji may be more appealing because they are better proxies for human facial expressions (e.g., Ganster et al., 2012).

Results also showed no gender differences regarding the evaluation of emoticons. Emoji, however, were evaluated as more familiar, clear, and meaningful by women. This finding converges with empirical evidence showing that women are more likely than men to use emoji (e.g., Fullwood, Orchard, & Floyd, 2013).

Recent literature has suggested the need to take into account possible differences in emoji evaluation across operating systems (Miller et al., 2016). Indeed, our results showed that iOS emoji were evaluated as more aesthetically appealing, familiar, clear and meaningful than Android emoji. We also found significant correlations between the evaluative dimensions (e.g., stimuli that were perceived as more meaningful were also perceived as more aesthetically appealing, familiar, clear and arousing). This pattern replicates findings from databases of other visual stimuli using the same evaluative dimensions (Garrido et al., 2016; Prada et al., 2015).

In addition to presenting normative ratings across dimensions, our database includes participants’ interpretation of the meaning of each stimulus. Participants were more likely to attribute meaning to emoji than to emoticons, irrespective of the operating system (iOS vs. Android). It is important to note that even though participants described the meaning in terms of what the stimulus directly represents (e.g., wink), they were also likely to go beyond this mere description and infer its intent (e.g., being cheeky). This is particularly relevant because it allows researchers to assess the extent to which the intended meaning overlaps with the meaning attributed by users, and more importantly because our findings show this is not always the case. However, as in previous research, our coding system for the meaning has shortcomings that render this overlap subjective.

Emoticons and emoji are often analyzed in the absence of information about the contexts in which they are communicated (Gaspar, Pedro, Panagiotopoulos, & Seibt, 2016). This was also the case in the present research, in which ratings were obtained by presenting the stimuli in isolation. This can constitute a limitation, because the interpretation of visual stimuli is often context-dependent (e.g., Wolff & Wogalter, 1998). Emoticons/emoji are typically incorporated in a message, and research has already shown that they can influence how the message is interpreted (e.g., Derks et al., 2008; Fullwood et al., 2013). Moreover, the reverse may occur, such that the content of the message can influence the interpretation of emoticons/emoji (e.g., Miller et al., 2016). For instance, a winking emoticon/emoji can be interpreted differently when accompanying “Let’s go to the movies ;)” versus “Let’s watch a movie at my place ;)”. Furthermore, the interpretation of emoticons/emoji can also depend on how the sender’s goals are perceived (Gaspar, Barnett, & Seibt, 2015; Gaspar et al., 2016). For instance, a winking emoji accompanying a sarcastic remark can be interpreted differently depending on whether the sender is a close friend or one’s boss.

Another limitation of the present study concerns the specific cultural context in which this dataset was developed. Culture has emerged as a factor that influences emoticon and emoji usage in online communication (Park et al., 2014). Our normative dataset was obtained with Portuguese participants and, according to recent data (Ljubešić & Fišer, 2016), Portugal ranks fourth in Europe for emoji usage on Twitter. Nevertheless, as with other normative databases, generalizations to other populations should be made with caution and cross-validation is recommended. Therefore, future studies should consider extending this database to other countries/cultures to assess cross-cultural differences and similarities. It should also be noted that differences may arise between studies that analyze how emoticons and emoji are evaluated in isolation from the context in which they are often used, and those focusing on how users actually contextualize them in communication. For example, in our study participants perceived emoji as negative or positive, whereas the work by Novak and colleagues (2015) showed that users mostly use positive emoji in their tweets.

Finally, the results from the meaning analysis indicated that intended meaning and users’ interpretation of that meaning do not always overlap. Two independent coders analyzed and categorized the responses given by participants to each stimulus. Although this procedure is not exempt from bias, the lack of overlap constitutes an important indicator that the selection of emoji and emoticons to use in research or practice should be carefully conducted, on the basis of more objective normative data such as that reported in the LEED. Other procedures could be used to determine users’ interpretation of meaning. For instance, researchers could use forced choice tasks (i.e., decide which emotion/meaning is expressed by the stimuli; Vaiman, Wagner, Caicedo, & Pereno, 2017).

The LEED mostly includes stimuli depicting graphical representations of faces. Research has shown that this type of emoji is processed similarly to other human nonverbal information (e.g., voice and facial expression; Yuasa et al., 2011) and that emoji can be used to prime social presence (Tung & Deng, 2007). Therefore, our stimuli can be used in affective processing studies and as experimental primes. Future studies could also seek to expand our normative ratings to other emoji representing humans (e.g., bodily postures and activities). Considering that new emoji varying in age group and skin tone were recently added to the available set in different platforms, it would be interesting to examine whether they are suitable as stimulus materials in research designed to examine topics such as person perception, intergroup relations, and social influence.

The LEED is a useful tool for researchers and practitioners (e.g., public health officials) interested in conducting research with naturalistic data (e.g., user-generated messages shared on social media platforms). It can also be used in a variety of experimental paradigms, particularly when the control of stimuli characteristics is required. Instead of basing the selection, coding, and analysis of emoticons and emoji on ad hoc categorization and intended meaning, researchers and analysts can rely on the systematic normative ratings offered by the LEED.

This type of database also has the potential to be used in more applied contexts involving ICT-mediated written communication, such as in marketing, education, and professional contexts (e.g., Skiba, 2016; Skovholt et al., 2014). Particularly promising is the field of health informatics (see, e.g., Eysenbach, 2011). Both human-based and computer-based evaluations of ICT users’ reactions to health-related events have been used for the monitoring and surveillance of a variety of public health issues (e.g., influenza-like diseases and dengue; Milinovich, Williams, Clements, & Hu, 2014). In such monitoring, computer-based techniques often rely on machine-learning algorithms and semantic lexicons. These techniques would benefit if they were based on normative ratings such as those offered by the LEED.