1 Introduction

Social media platforms such as Facebook or Twitter offer new ways to publicly discuss political issues and to spread information in order to reach people and possibly influence their opinions [1]. They provide the infrastructure to easily produce and share content such as text, links, or images. In this environment, bots can spread information to human users without even being disclosed as pieces of software [2, 3]. Bots are defined as “software designed to act in ways that are similar to how a person would act in the social space” [4].

Be it the Brexit vote, the Arab Spring, protests in Brazil, or elections – social bots increasingly participate in communication on public social media. One goal they may pursue is to influence political debates [5, 6]. Among others, political actors and governments might use social bots to manipulate public opinion, to heat up debates, or to muddy political issues [7]. During the second debate of the 2016 U.S. election campaign, one out of three of the 2.4 million pro-Trump tweets was produced by social bots, whereas pro-Clinton bots produced one out of four of the overall 720,000 pro-Clinton tweets [7]. Data about the 2016 U.S. election revealed that the activity of automated pro-Clinton accounts increased over time but never reached the traffic of automated pro-Trump accounts. Overall, there were five times as many automated pro-Trump tweets as automated pro-Clinton tweets [8].

This example illustrates the high relevance of social bots and the need for research in this field. The phenomenon of bots is not new, though: as Ferrara et al. [9] point out, bots have existed since the early days of computing. The term ‘bot’ is used to describe “a software designed to automate a task in a computing system” [4]. Social bots, a more particular kind of bot, produce or share content and interact with humans on social media, trying to mimic human behavior [10,11,12]. Even more specifically, the term political (social) bots is used to describe “automated accounts that are active on public policy issues, elections, and political crisis” [6].

In practice, social bots can pursue different goals. On the one hand, they can perform tasks such as (re)posting news or automatically sharing updates (e.g. about the weather) in a conversational (human) tone. On the other hand, bots can be used to imitate human behavior and to spread incorrect information, spam, or viruses.

Often these bot profiles lack basic account information like a name or profile picture. Whereas regular users get access via front-end websites, bots obtain access through a site’s application programming interface (API) [6]. As the API of Twitter is especially accessible, many social bots (or rather their creators) focus on this platform.

With more than 310 million active users and over 500 million tweets a day, Twitter is one of the biggest social media platforms in the world [13]. Twitter allows users to post and read messages (“tweets”), each restricted to a maximum of 140 characters. Creating posts and sharing other users’ messages (“retweeting”) are among the main features [14]. Moreover, users can subscribe to (“follow”) other Twitter profiles; the accounts a user follows are called “friends”, while the accounts following a user are “followers”. Following does not have to be reciprocal: a user can follow another user and thus receive this user’s public posts, which encourages following users one does not know in person, even if they do not follow back [15]. While it is possible to restrict the visibility of posts to approved users, about 90% of Twitter users make their content public for everybody [16] and thereby enable researchers and companies to gather and analyze personal data [17]. At the same time, Twitter becomes interesting for spammers and others looking to spread their messages and possibly influence users [18]. However, until now it is unclear how social bots act in social media and whether their behavior differs from that of humans [17].

As research shows, social bots have been used in political contexts (e.g. in 2016 in the Brexit vote and the presidential election in the United States). Also, in the run-up to the election of a new government in Germany in 2017, politicians have begun a debate about the use of social bots in election campaigns. However, it remains unclear (a) whether social bots actually have an influence on human users and (b) if so, how strong this influence is. For example, even though the Brexit campaign seemed to be subject to social bot activity, it has to be considered that Twitter is not used by the whole population in the same way (e.g. young people use it more intensively) [5]. The impact of social bots therefore remains unclear.

Our objective is thus to investigate the behavior of social bots and to compare it to the way human users act in the same data set. The aim is to reveal differences between these two groups and to gain insights into their behavior, thus allowing inferences about their impact and contributing further to the research field of social bots.

In the following, we first present prior research on the potential influence of bots. Moreover, different approaches to identifying bots and characteristics of users who are susceptible to bots are explained. The Computers as Social Actors (CASA) framework [19] is presented to point out an important mechanism in Human-Computer Interaction (HCI). Furthermore, we explain our research design and present the results of our study. Finally, the implications of our findings are reported and discussed. The paper ends with a conclusion.

2 Related Work

Online Influence of Bots.

Twitter and other social media platforms enable users to reach and possibly influence other users (and vice versa) through the networks’ structure [20]. Political actors, as well as actors from other fields such as marketing, therefore have an interest in better understanding or even influencing social media communication [20]. In order to determine the influence of Twitter users, several factors can be considered. For example, the more followers a user has, the more potential receivers of a message exist; such a user is possibly mentioned and cited (or “retweeted”) more often by other users and can therefore be considered influential [21].

As much as 61.5% of total web traffic is produced by bots [4], though it is important to note that this number includes not only social bots but all kinds of automated processes, including those outside of social media. Social bots may thus serve as an efficient channel to spread information, because several accounts are able to spread the same information or opinion without having to be controlled by a human. As studies show, social bots can be used to influence political communication [5]. To reach this goal, bots can follow different strategies:

  1. Mimic human behavior: in this context, Misener [25] emphasizes that bots try to simulate all online activities of real users with the aim of appearing human and blending in. Consequently, it may be harder to detect those social bot accounts that behave like human actors. One can assume that bot accounts that simulate human behavior very well are hard to identify for third parties and might influence communication more efficiently [26].

  2. Overstate trends: by using special hashtags or combining various hashtags, social bots can distort the weight of certain opinions covered by these hashtags. A widespread way of trying to manipulate trends in social networks is to selectively attack the parameters that are collected purely quantitatively, such as ‘likes’ and shares on Facebook and the frequency of hashtags on Twitter. In this sense, bots do not need to produce new content; they just multiply existing content.

  3. Astroturfing: another possibility of influencing the perception of users is astroturfing. According to Zhang et al. [27], astroturfing describes the practice of trying to create the impression of widely supported ideas or opinions although they lack the support of a majority. The usage of astroturfing in a political context has been shown multiple times [28]. Traditionally, astroturfing is conducted by public relations institutions or lobbying organisations [29]. However, since the rise of social media, astroturfing has become a popular tool for organizations or single persons to create the impression of grassroots movements on the Internet [28]. Just like astroturfing, ‘smoke screening’ and misdirection have been used in political contexts, for example in the Syrian civil war [4]. These practices describe the flooding of a topic with spam or unrelated content to distract attention and to hinder the finding of information.

Offline Influence of Bots.

The question of just how big the offline influence of bots is, and whether there is an offline influence at all, is another important aspect to address. Even if there is an influence online, it is still unclear whether this influence carries over to behavior outside of the Internet (e.g. voting behavior during elections or buying decisions). The offline influence detection problem addresses these questions. The difficulty lies in identifying users who may be influential in real life based on Twitter accounts and related data [21]. According to Cossu et al. [21], features predicting online influence, like the number of followers or retweets, are inefficient for solving this problem. There are a few studies which try to capture the offline impact of bots. Aiello et al. [26] outline that influence seems to depend on trust – an aspect which is related to the mimicking of human behavior exhibited by social bots. The position in a social graph [30] also seems to have an impact on influence. In addition, Cossu et al. [23] point out that the language used in tweets has an impact on offline influence. Studies concerning the interdependence between popularity and influence in social media clarify that popularity is not a requirement for being influential and vice versa [16]. Accordingly, the awareness of communicating with a bot can be expected to decrease its influence. A massive influence on trends cannot be equated with effective manipulation; for this reason, there is a danger of overestimating bot influence [5]. Hegelich [5] points out that no one changes their political convictions just because of reading a single message in social media, but he considers a subtler form of manipulation very probable [5]. The author calls this potential influence the “Bot-Effekt” (bot effect) and illustrates that its impact could in theory be very large but is difficult to prove empirically [26].

Threats.

In spite of the aforementioned difficulties, researchers have tried to measure influence, especially threats which may be caused by social bots. Findings suggest risks to private information and the stock market as well as manipulated behavior [9, 10]. If social bots infiltrate social media, they can easily collect user data from connected (human) accounts. This has been demonstrated by Boshmaf et al. [10], who infiltrated users on Facebook and could collect sensitive and monetarily valuable private data like email addresses, phone numbers, and profile information. Moreover, they succeeded in collecting the same amount of data related to Facebook friends of the infiltrated users. Once a network is infiltrated, social bots can manipulate the users’ perception [9], for example by using astroturfing or smoke screening. According to Chu et al. [31], most spam messages on Twitter are generated by bots, whereas only a small amount is generated by human accounts. Moreover, the authors discovered that tweets by bots contained a higher number of external hyperlinks; some bot-generated tweets even contained more than one link. Many of these bot accounts shared and retweeted spam links. Grier et al. [32] and Gao et al. [33] also discovered large spam attacks by bots. Human accounts, in general, try to avoid and refuse to interact with spam [32].

As described, social bots can focus on ‘negative actions’ of certain kinds, like manipulation [9], stealing and misuse of private information [10], astroturfing [28] or smoke screening [4].

Identification of Bots.

Since the advent of social bots, researchers have developed several ways to identify bots in social media. Previous scientific approaches can be grouped into different categories: concepts based on network information [34,35,36], machine learning [9, 12, 31, 37], crowdsourcing [38], or a mixture of these approaches [39].

In contrast to the assumptions of Xie et al. [35] as well as Paradise et al. [36], Boshmaf et al. [10] found that humans do not refuse to interact with bots. As a result, network information alone does not seem to provide sufficient evidence for identifying potential bots. In contrast, a detailed consideration of machine learning features seems to be a more promising approach. For instance, Chu et al. [31] selected automation of tweeting, presence of spam, duplicate tweets, aggressive following behavior, originality of tweets, and tweets with unrelated links as features. Chu et al. found that, in contrast to human accounts, bots have fewer followers than friends, generate fewer tweets, tweet more often via API-based tools, and use more URLs in their tweets.

Meanwhile, some bots are able to avoid simple bot recognition algorithms. These more sophisticated bots, for example, try to keep a balanced relationship between friends and followers to prevent Twitter from deleting them because of the limit the platform stipulates on the ratio between friends and followers [31]. Furthermore, these bots simulate breaks and sleeping times and can even slightly modify messages so that the semantic content stays the same but automatic text programs do not recognize the texts as identical [5].

User Interaction with Bots.

The Computers as Social Actors (CASA) framework focuses on the interaction between humans and computers and thereby helps to investigate and understand possible differences or similarities between bots and humans [22]. The framework is derived from numerous studies examining human responses to a variety of media and is based on the Media Equation theory, which postulates that humans treat computers as if they were real people [40]. Research has shown that human-computer interaction has important emotional [41] and behavioral outcomes [42]: for example, humans see bot accounts as more likeable or trustworthy if these bots exhibit empathy, and humans act spitefully when they feel betrayed by a computer. It was also demonstrated that theories from social science as well as experiments on human-human interaction were reproducible in human-computer interaction [19]. The same scripts that guide human-human interaction are used [43], and cues that reveal the asocial nature of a computer are ignored. In summary, humans interact with computers in similar ways as they interact with other humans, which is also the case when only restricted social cues are available [22].

More recent studies focus on bots as actors [14, 44, 45]. For example, Edwards et al. [14] demonstrated that bots are perceived as credible, attractive, competent in communication, and interactional, all of which are characteristics normally associated with humans.

Furthermore, Edwards et al. [22] conducted a two-part study to explore differences in the perception of communication quality between a human agent and a bot, focusing on cognitive elaboration, information seeking, and learning outcomes. The authors showed that humans indicate similar levels of information seeking, cognitive elaboration, affective learning, and motivation when they receive information from a bot compared to a human user. The results suggest that participants learned the same from either a bot or a human agent. These findings are in accordance with the CASA framework and imply that organizations and individuals who use a bot to influence social media can be successful; in this case, bots appeared to be just as efficient as a human agent for delivering specific information [22]. This shows that automated accounts in general have the ability to evoke human affect towards them. This should apply even more to social bots, as these try to mimic human behavior and could thus act unnoticed and improve their chances of affecting the social graph [12, 25]. However, one recent study did not find that mimicking human behavior increased the chance of having an influence [4]. The authors analyzed how the content of tweets by a social botnet differs from regular users in the same data set. They conducted a content analysis of tweets from Arabic and English Twitter users as well as from a social botnet and found that over 50% of the tweets generated by the botnet and approximately one third of the tweets from regular users contained news. Moreover, there was a difference in the expression of opinions: Arabic Twitter users expressed their opinion in 45% of their tweets, English Twitter users in 25.8%, while only 12.4% of the botnet’s tweets were opinion-related. The authors conclude that it is ambiguous whether social bots try to mimic human behavior, which differs from previous research.

Research Gap and Hypotheses.

Abokhodair et al. [4] note that an effective analysis should be able to differentiate the participation of humans from bot behavior. Consequently, to identify bots it is necessary to know what kinds of differences exist. Studies show that differences exist in the number of followers and friends, in tweeting behavior and frequency, and in the integration of external URLs in tweets [31].

In order to contribute to this, we compare the behavior of bots and humans in the same Twitter data set. Deeper knowledge about the differences may contribute to the exploration of the bots’ impact and facilitate future identification of bots.

Our first hypothesis deals with fundamental differences between bot and human accounts. We presume that there are differences concerning standard features which can also be used to predict online influence, like the number of followers or retweets [21]. As previous studies show, bot accounts are often organized in a network-like structure and retweet more often than humans [9, 24]. Because of this, we assume that bot accounts retweet each other more often than human accounts. Beyond that, Chu et al. [31] postulate that retweeting indicates a lack of originality and that bots post more external links. Therefore, our first hypotheses are:

H1a: Bot accounts have a lower number of followers than human accounts.

H1b: Bot accounts have a higher number of retweets than human accounts.

H1c: Bot accounts have a lower number of @-characters in tweets than human accounts.

H1d: Bot accounts have a higher number of links than human accounts.

Cossu et al. [21] postulate that an account is more influential when it has more followers and is retweeted more often. Therefore, we infer a relationship between these features and presume that:

H2a: The number of followers of a bot account is positively related to the number of retweets of this account.

H2b: The number of followers of a human account is positively related to the number of retweets of this account.

Because of the relationship between spam, hyperlinks, and bots, which was emphasized by Chu et al. [31], Grier et al. [32], and Gao et al. [33], we expect as our third hypothesis:

H3: There is a positive relationship between tweeted links per day and retweets per day which differs between bot and human accounts.

Following Aiello et al. [26], who showed a relationship between activity and popularity, we state the following hypothesis:

H4: The more tweets are created per week, the more followers are generated.

3 Method

To test our hypotheses we gathered a data set of 6.5 million tweets via the Twitter Streaming API. The set was collected from the 31st of October 2016 to the 6th of November 2016 – the last week before the presidential election in the United States. The data set contains tweets with either the term “Hillary Clinton” or “Donald Trump” – the two candidates running for the presidency. The reason for collecting this specific data set was that, in order to test our hypotheses, we had to ensure a sufficient number of bot accounts for analysis. As Howard and Kollanyi [6] showed for the Brexit vote, and Mustafaraj and Metaxas [46] and Forelle et al. [47] for elections in America and Venezuela respectively, political events seem to be a regular target for the use of bot accounts. Furthermore, to analyze a recent data set (and thus take into account possible recent developments of bot account strategies) we chose the US election as the background for our data set. The meta-data provided by Twitter contained information like the content of the tweet, the date and time when the tweet was posted, the author’s name, description and ID, the number of followers and friends, the source from which the tweet was sent, and the physical location. Our goal was to identify approximately 100 humans and 100 bots for each day. In the following, the identification procedure is explained step by step.
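The paper itself does not include its collection code; as an illustration, the following minimal sketch shows how such a keyword-filtered stream could have been captured with the Python library tweepy, using the pre-4.0 StreamListener interface that was current in 2016. The credentials and the output file name are placeholders, not details from the study.

```python
import tweepy

# Placeholder credentials; the actual keys are not part of the paper.
auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")

class ElectionListener(tweepy.StreamListener):
    """Appends every matching tweet (raw JSON) to a local file."""

    def on_data(self, raw_data):
        with open("election_tweets.jsonl", "a") as f:  # illustrative file name
            f.write(raw_data.strip() + "\n")
        return True  # keep the stream open

    def on_error(self, status_code):
        # Disconnect on rate limiting (HTTP 420) instead of retrying aggressively.
        return status_code != 420

stream = tweepy.Stream(auth=auth, listener=ElectionListener())
# Track both candidate names, as described above.
stream.filter(track=["Hillary Clinton", "Donald Trump"])
```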

Identification of Bots.

To identify bot accounts in our data set we chose to exclude all accounts of which we could say with high certainty that they were human accounts. Our aim was to retain tweets from users with a high probability of being a social bot. The first step was to exclude all verified user accounts. The next step was to exclude all accounts with fewer than 500 followers. As our aim was to identify social bots which may have an influence on people, 500 seemed like a fitting threshold to include only those bot accounts that could reach a significant number of people. The third step was to take a closer look at the source from which the tweets were posted. Most social bots are auto-piloted and generate their tweets via unregistered API-based tools [31]. Because of this, all sources containing “Twitter” in the URL were excluded. Hereby, all sources which are normally used by humans were left out, such as mobile applications like Twitter for tablets and mobile phones or the Twitter web client. We could verify that many of the remaining sources include social bot software which controlled the bot accounts. In the next step, we created a variable to capture the relationship between the friends and followers of every user. For this calculation, the number of friends was divided by the number of followers. This friends-follower ratio indicates whether a user has more friends than followers. Previous literature proposes that this ratio should be high for bots and well-balanced for humans [31], meaning that bots normally have far more friends than followers. Thus, all accounts with a friends-follower ratio lower than 150% were excluded. In the last step, all tweets of a user on one day were counted to obtain the number of tweets per day for every user. Prior research concerning Twitter data and social bots shows that bots have a much higher tweets-per-day rate than humans [15, 18, 31]. Therefore, all tweets of users with a tweets-per-day rate lower than 10 were excluded to ensure that only active users, which could possibly be influential, remain.
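Expressed as a filter pipeline, these five steps could look as follows. This is a sketch under the assumption that the collected tweets have been flattened into a pandas DataFrame; all column names (user_verified, followers_count, etc.) are illustrative, not taken from the paper.

```python
import pandas as pd

def filter_bot_candidates(tweets: pd.DataFrame) -> pd.DataFrame:
    """Apply the five exclusion steps; column names are assumed."""
    df = tweets[~tweets["user_verified"]]                     # step 1: no verified accounts
    df = df[df["followers_count"] >= 500]                     # step 2: minimum reach
    df = df[~df["source"].str.contains("Twitter", na=False)]  # step 3: no official clients
    ratio = df["friends_count"] / df["followers_count"]       # step 4: friends-follower ratio
    df = df[ratio >= 1.5]                                     #         keep ratio >= 150%
    per_day = df.groupby(["user_id", "date"])["tweet_id"].transform("count")
    return df[per_day >= 10]                                  # step 5: at least 10 tweets/day
```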

Identification of Humans.

To compare the bot accounts with human accounts we in turn had to ensure that the comparison sample only contained accounts controlled by humans with a comparable reach. The identification process consisted of six steps. First, all verified users were excluded to make sure that no organizations or news sites remained in the data set. Second, all users with fewer than 500 followers were excluded to make sure that only humans with a relatively high reach (and thus possible influence) were considered. For the third step, only those sources which are most likely used exclusively by humans were chosen: “Twitter for iPhone” and “Twitter for Android”. These tweets were posted via mobile phones and thus very probably come from a human, as operating bots on a mobile phone is much more difficult than using the API from a desktop computer. In the next step, the friends-follower ratio was calculated. As stated before, humans tend to have a well-balanced ratio between accounts they follow and accounts which follow them. For that reason, we assumed that humans in our data set should have a friends-follower ratio of nearly 100% (meaning that they have an equal number of friends and followers). Therefore, all tweets from users with a ratio of more than 105% or less than 95% were excluded. The next step was to exclude all tweets from users with the term “bot” in their author description, which led to the exclusion of accounts that were obviously automated (e.g. accounts that posted weather data). In the last step, the sum of all tweets per day was calculated for every human, and all humans with fewer than ten tweets per day were excluded.
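The human-side counterpart of the sketch above could look like this (same assumed column names; again an illustration, not the paper’s actual code):

```python
def filter_human_candidates(tweets: pd.DataFrame) -> pd.DataFrame:
    """Apply the six exclusion steps for the human sample; column names are assumed."""
    df = tweets[~tweets["user_verified"]]                     # step 1: no verified accounts
    df = df[df["followers_count"] >= 500]                     # step 2: minimum reach
    human_sources = ["Twitter for iPhone", "Twitter for Android"]
    df = df[df["source"].isin(human_sources)]                 # step 3: mobile clients only
    ratio = df["friends_count"] / df["followers_count"]       # step 4: ratio between 95% and 105%
    df = df[(ratio >= 0.95) & (ratio <= 1.05)]
    df = df[~df["user_description"].str.contains("bot", case=False, na=False)]  # step 5
    per_day = df.groupby(["user_id", "date"])["tweet_id"].transform("count")
    return df[per_day >= 10]                                  # step 6: at least 10 tweets/day
```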

Description of the Sample.

Following the described steps we were able to identify approximately 100 active bots for every day (Table 1). Summed up, there were 270 different bots which posted 19,390 tweets over the seven days, whereby not every bot posted tweets every day. Table 2 shows how many bots were active on how many days of the covered week. Table 3 shows how many tweets the 1,175 identified human accounts posted on each day. Over the whole week, humans posted 62,365 tweets; at least 500 identified humans were active each day (except for the 31st of October) (Table 4). To obtain a comparable number of bots and humans we decided to reduce the set of humans further: we randomly selected nearly 100 humans for each day to compare them to the approximately 100 bots. The final sample consisted of 693 identified human accounts which posted 12,699 tweets.

Table 1. Distribution of bot-tweets
Table 2. Frequency of bot activity
Table 3. Distribution of human tweets
Table 4. Frequency of human activity

Used Variables.

Besides the aforementioned variables, which were used for the identification of accounts, further variables were required to test the hypotheses: the number of followers, the number of retweets per day, the number of @-characters per day, the number of links per day, and the number of tweets per day and per week.

Preliminary Calculations.

All statistical analyses were calculated with IBM SPSS Statistics Version 23.0. In order to determine statistical significance, a confidence level of 95% was used.

4 Results

Descriptive Statistics.

The 771 identified bots generated on average 25.15 (SD = 20.19) tweets per day. In contrast, the 693 humans generated only 18.32 (SD = 11.32) tweets per day. However, the humans received far more retweets per day (M = 15,921.55, SD = 19,108.23) than bots (M = 1,381.16, SD = 5,850.18). The ratio of average retweets to tweets was 85.45 (SD = 361.01) for bots and 897.05 (SD = 1,053.91) for humans. This means that on average every tweet of a bot was retweeted approximately 85 times, whereas a tweet of a human was retweeted almost 900 times. The average number of followers in our sample was 1,241.76 (SD = 655.25) for bots and 5,056.95 (SD = 7,278.91) for humans. The average friends count of bots was 2,896.42 (SD = 1,371.73), whereas humans had 5,022.51 friends on average (SD = 7,137.26). The ratio between mentions via @ per day and tweets per day shows that bots used an @ in approximately every second tweet (M = .42, SD = .66). In contrast, humans used at least one @ in every tweet they posted (M = 1.2, SD = .24). Moreover, bots posted one or more links in every tweet (M = 1.24, SD = .6), whereas humans used fewer links per tweet (M = .88, SD = .22).
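Group-wise descriptive statistics of this kind can be reproduced with a simple aggregation. A sketch, assuming a combined DataFrame of both samples with an added group column distinguishing bots and humans (all column names illustrative):

```python
# 'data' combines both samples; 'group' is "bot" or "human" (assumed columns).
daily = data.groupby(["group", "user_id", "date"]).agg(
    tweets=("tweet_id", "count"),        # tweets per account and day
    retweets=("retweet_count", "sum"),   # retweets received per account and day
    links=("url_count", "sum"),          # links posted per account and day
)
print(daily.groupby("group").agg(["mean", "std"]))
```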

Hypothesis Testing.

To test the first hypothesis we compared the mean values of bots and humans for the variables ‘Followers’, ‘Retweets per day’, ‘@ per day’ and ‘Links per day’. As the sampling distributions were not normal and the variances were not equal, the nonparametric Mann-Whitney U test was used for the comparison. The results for Hypotheses 1a–1d are displayed in Table 5, which shows support for Hypotheses 1a, 1c, and 1d. Hypothesis 1a (bot accounts have a lower number of followers than human accounts) was supported (p < .001). The same applies to Hypothesis 1c, bot accounts have a lower number of @-characters in tweets than human accounts (p < .001), and Hypothesis 1d, tweets created by bot accounts contain a higher number of links than tweets by human accounts (p < .001). For Hypothesis 1b, on the other hand (bot accounts have a higher number of retweets than human accounts), the opposite was true: the Mann-Whitney U test showed a significant effect (p < .001), but with respect to the mean values it ran against the assumption – in our sample humans exhibited a larger number of retweets than bots.

Table 5. Means comparisons for Hypotheses 1a to 1d
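A comparison of this kind can be sketched with SciPy’s implementation of the Mann-Whitney U test (the paper used SPSS; the per-account DataFrames and column names here are illustrative):

```python
from scipy.stats import mannwhitneyu

# H1a: bots have fewer followers than humans (one-sided test).
u_stat, p_value = mannwhitneyu(
    bots["followers"], humans["followers"], alternative="less"
)
print(f"U = {u_stat:.1f}, p = {p_value:.4f}")
```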

Hypothesis 2 assumed a relationship between the number of followers and the number of retweets of an account. We tested this for bots and humans for each day individually. Hypothesis 2a focused on this relation for bot accounts, whereas 2b focused on human accounts. The sampling distribution was not normal, wherefore Spearman’s rank correlation was calculated instead of a Pearson correlation. Table 6 shows the results of the correlation for each of the seven days. Hypotheses 2a and 2b are partially confirmed. Within the seven days, a correlation for bots was found on four days, and for humans on five days. There was a significant correlation for bot accounts on the 1st of November (p = .009), 2nd of November (p = .019), 3rd of November (p = .006) and 6th of November (p = .009). For human accounts a significant correlation was found on the 1st of November (p = .004), 2nd of November (p < .001), 4th of November (p = .021), 5th of November (p < .001) and 6th of November (p = .001). Interestingly, the strongest overall correlation, found in the human sample on November 2nd, was a negative one, meaning that the more followers an account had, the fewer retweets it got. Since only a few days showed a significant effect, a correlation was also calculated for the whole week. It showed a significant but small correlation for bot accounts (rs = .12, p = .001) and no significant correlation for human accounts.

Table 6. Correlations of followers and retweets for each day from 31st Oct. – 6th Nov.
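The per-day correlations could be reproduced along these lines (again a sketch with SciPy instead of SPSS; ‘obs’ is an assumed DataFrame with one row per account and day):

```python
from scipy.stats import spearmanr

# Spearman's rank correlation of followers and retweets, per day and group.
for (group, day), g in obs.groupby(["group", "date"]):
    rho, p = spearmanr(g["followers"], g["retweets"])
    print(group, day, f"rho = {rho:.2f}, p = {p:.3f}")
```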

Hypothesis 3 claims a relationship between links per day and retweets per day and was tested employing a correlation analysis between these two variables. This hypothesis was supported. As for Hypotheses 2a and 2b, the sampling distribution was not normal, leading to the use of Spearman’s rank correlation. In both groups a significant effect was found, with a moderate correlation for human accounts (rs (693) = .33, p < .001) and a small correlation for bot accounts (rs (771) = .16, p < .001).

The assumption “the more tweets are created per week, the more followers are generated” was tested in Hypothesis 4. To test the fourth hypothesis, a regression analysis of the number of followers on tweets per week was performed. The calculations showed no significant relationship for human accounts. However, for bot accounts there is a significant, if small, effect (F(1, 769) = 5.01, p = .025) with R² = .006. The number of followers increases by β = .41 for each tweet.
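This regression corresponds to a simple OLS model; a sketch with statsmodels (the paper used SPSS; ‘weekly’ is an assumed DataFrame with one row per bot account and illustrative column names):

```python
import statsmodels.api as sm

# Followers regressed on tweets per week for the bot sample.
X = sm.add_constant(weekly["tweets_per_week"])
model = sm.OLS(weekly["followers"], X).fit()
print(model.summary())  # reports the F-test, R-squared, and the slope
```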

5 Discussion

The aim of this paper is to show differences between human and bot accounts in order to infer the impact of bots. The data set of the US election was selected because it is known that bots were used in support of both candidates [6, 7]. In the following, the results as well as implications and limitations are discussed.

Interpretation of Results.

The first hypothesis considers fundamental differences between human and bot accounts concerning standard features. The human and bot accounts differ in all examined features – number of followers, number of retweets, number of mentions, and number of links. This result complies with previous findings [21, 31]. Hypothesis 1a is supported, which means that bot accounts have fewer followers than human accounts. This result matches those of Chu et al. [31], who point out that bots have fewer followers than friends whereas humans have a balanced follower-friend ratio. Whereas humans have a reciprocal relationship, it seems that bots try to gain attention by adding friends and hoping to be followed back. Further, we assumed that bot accounts garnered more retweets than human accounts. However, for our sample the opposite is true; hence, Hypothesis 1b is not supported. This contradicts previous findings [9] according to which bots try to gain influence especially by retweeting each other [24]. A possible explanation could be that human users in general are able to identify bot accounts and therefore retweet them less frequently. The assumption that bots use fewer @-characters than humans was supported (H1c). This can also be linked to the results of Chu et al. [31]: they suppose that bots generate fewer tweets, and therefore bots may use fewer @-characters. The hypothesis that bot accounts use a higher number of links in their tweets than human accounts was also supported (H1d). This result coincides with the findings of Chu et al. [31]. It can be assumed that bots use links more frequently because they aim at spreading information and thus want to generate influence. Also, by using links the bot accounts do not need to produce original content in their tweets but can just link to a source with current information, which also makes such bots easier to program.

Another assumption was that the number of followers of a bot account is related to the number of retweets of this account (H2a), as well as the notion that the number of followers of a human account is related to the number of retweets of this account (H2b). For some days, this assumption was supported. Regarding the U.S. election, no special event (e.g. a TV debate) took place during the investigated week which might have explained these fluctuations. Regarding the entire week, a relationship between the number of followers and retweets could only be found for the bot accounts. This indicates that the more followers a bot has, the more it is retweeted, or vice versa. The rejection of the hypothesis for human accounts could be attributed to the content of a tweet; it seems that the relation of followers and retweets is not important for human accounts. One can anticipate that the content of human accounts is more valuable because humans use more emotional words, as Abokhodair et al. [4] showed.

Hypothesis 3, which assumed a relationship between tweeted links per day and retweets per day, is supported. Studies showed that bots produce a great amount of spam [32]. It could thus be assumed that other bots retweet this spam and therefore the number of retweets rises with every new link. Since a correlation does not explain which variable has an impact and which is influenced, the influence could also run in the opposite direction. As the relationship between links and retweets was stronger for the human accounts, one possible explanation may be that (a) OSN users in general are able to differentiate between postings from human and bot accounts and (b) links which are posted by a human account are perceived as more credible and trustworthy and therefore are retweeted more often.

The results of Hypothesis 4 show that bots generate one additional follower with almost every second tweet. These findings are similar to the results of Aiello et al. [26], who postulate that the activity of users influences their popularity. On Twitter, the number of followers of a profile can, among other measures, express the popularity of a user. This could be a reason for the behavior of bots of producing a large amount of tweets [6]. The fourth hypothesis could not be supported for human accounts, though. Here, a possible explanation may be that humans generate followers based on different features such as content. In contrast to bot accounts, humans may post more emotional tweets and express their opinions, which may lead others to follow them, as opposed to the pure quantity of their tweets.

6 Conclusion, Limitations, and Future Research

Conclusion.

In this paper, we compared 771 identified bot accounts with 693 identified regular users. As one of the first studies, we compared human and bot accounts from the same data set; prior research mostly focused on bots in one data set and compared them to features taken from the literature. Our goal was to understand how different aspects differ between the two groups in order to better assess the influence that social bots might have on human users on Twitter. In accordance with prior research, important differences exist in the two groups’ standard features like followers, friends, and retweeting behavior. Moreover, relationships between features like followers and retweeting behavior, and between the number of links used in tweets and the number of retweets per day, were reported. In our analyses, the hypothesis concerning the relation between followers and retweeting behavior was partially supported – there is a relation between these two features, but it does not hold for all investigated days in our data set. One important finding was that bots generated a new follower with every second tweet, which may explain the purpose behind extremely high bot activity.

Furthermore, bots are often considered to operate within a network of several bot accounts, which limits the impact of a single account but is nevertheless able to produce a considerable amount of content in OSNs over time. Summing up, widespread features were considered, compared, and set in relation to each other. First insights are unveiled regarding the number of tweets that are necessary to gain more followers. Against the backdrop of influencing users in OSNs, these findings are important to characterize the behavior of social bots and to shed light on their intentions.

Limitations.

Our findings are limited by the reduction of the data. We focused on examining approximately 100 bot accounts for every day of the seven days. This temporal restriction is accompanied by another limitation – it is uncertain how the identified bot and human accounts act over a longer time period, such as the complete US presidential campaign. As a result, very active bots may not have received attention in the investigation at hand, and their possible impact may be underestimated. Beyond that, human accounts were gathered via random samples for every investigated day. Moreover, the focus of this investigation lies on quantitative data. This implies that the content of tweets is not considered. However, content and choice of words can possibly explain why some tweets are hyped or why an account is followed by many others.

Future Research.

The identification method used in our paper could be tested and refined further. In turn, our analyses could be carried out after applying different identification methods (e.g. from a machine learning perspective). Future research should also dig deeper into the role of URLs in tweets, as our findings indicate that there is a relationship between posting links and a tweet being retweeted. Furthermore, it would be interesting to gain deeper insight into the differences between professed bots, which declare themselves as bots, and hidden bots, which try to obscure their identity. In a third step, these bots could be compared to human users.