1 Introduction

Reddit is a Social News site where users submit content and vote to dictate the prominence with which each item will be displayed. It was viewed by 174 million unique users in September 2014 (Footnote 1). Reddit is divided into more than 240,000 “subreddits” or content categories (but 50 % of these are inactive and have received fewer than 5 posts in total). According to surveys conducted with reddit’s users in 2010 and 2011, the site’s users are predominantly male, white, young and living in the United States: a survey conducted in 2010 and completed by 25,000 reddit readers found that 40 % were white males aged 18–34 living in the United States [1].

Research on Social News platforms has tended to involve either a quantitative approach to describing the functioning of a particular platform as a whole [2–6] or its users’ behavior [7, 8], or a more qualitative approach to a particular event or phenomenon which occurred on a Social News platform [9–11]. Published work tends to focus either on the platform itself (without making reference to the content) or on a particular case study (without making reference to how the platform works). This paper combines quantitative approaches to explore how reddit functions with a more qualitative approach to telling the story of reddit’s involvement in the anti-SOPA movement and how this was shaped by the former. Reddit is recognized by [12] as playing an important role in this movement but their method, based on in-links, only captures reddit’s role when it became part of the wider story (through the GoDaddy boycott). We consider how the 79 posts about SOPA which had previously appeared on reddit’s Front page laid the foundations for the successful GoDaddy boycott, and the manner in which other attempts at collective action on reddit succeeded or failed.

2 Who Sees What on Reddit? Pages and Algorithms

Every post submitted to reddit is voted on through the ubiquitous up/down voting buttons. The posts which an individual user is shown are selected based on subreddit and page criteria. Users most commonly browse aggregated pages where they are presented with posts from the set of subreddits they subscribe to – for users who are not signed into an account the posts they see on aggregated pages are drawn from the ‘default’ subreddits. Users also have the option of seeing posts from ‘all’ subreddits or browsing posts from a specific subreddit.

The primary page type for browsing posts selects these based on the ‘Hot’ ranking algorithm. Reddit’s ‘Front page’ is the site’s most viewed page and shows posts from the subreddits a user is subscribed to with the highest ranks as determined by this algorithm – users who aren’t signed into an account see the ‘default Front page’. This ‘Hot’ algorithm uses only two variables to generate ranks – the post’s score and time of submission [13]. Voting scores undergo a log10 transformation such that a score of 1,000 bears only 1.5 times as much weight as a score of 100 (3 versus 2 after transformation). Time of submission is used to penalize older posts, and coupled with the log10 transformation of voting scores the result is that even very popular posts cannot remain on the Front page for long.
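The two-variable ranking described above can be sketched as follows. This is a minimal illustration rather than reddit's definitive implementation: the paper states only that score and submission time are the inputs and that scores are log10-transformed. The 45,000-second divisor and the December 2005 epoch are assumptions that follow the commonly cited open-source version of the algorithm, and the function name `hot_rank` is hypothetical.

```python
from datetime import datetime, timezone
from math import log10

# Assumed constants (not stated in this paper): they follow the commonly
# cited open-source version of the 'Hot' ranking.
ASSUMED_EPOCH = datetime(2005, 12, 8, 7, 46, 43, tzinfo=timezone.utc)
ASSUMED_SECONDS_PER_POINT = 45000  # 12.5 hours of age per tenfold score gain


def hot_rank(ups: int, downs: int, submitted: datetime) -> float:
    """Rank a post from its score and submission time only."""
    score = ups - downs
    # log10 transformation: each tenfold increase in |score| adds one point.
    magnitude = log10(max(abs(score), 1))
    sign = (score > 0) - (score < 0)
    # Newer posts get a larger time term, so older posts are penalized.
    age_term = (submitted - ASSUMED_EPOCH).total_seconds() / ASSUMED_SECONDS_PER_POINT
    return sign * magnitude + age_term
```

Under these assumed constants, a post submitted 12.5 hours later needs only one tenth of the score to achieve the same rank, which is why even very popular posts are displaced quickly.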

To appear on pages which use the ‘Hot’ algorithm a post must have a high score – this requires users to browse (and vote on) pages which select posts on other criteria. The ‘New’ page shows posts in reverse chronological order – every post appears on this page initially and is then pushed down the order by subsequent submissions. The ‘Rising’ page shows posts which are relatively fresh and have already begun to accumulate up-votes. Before appearing on the default Front page, a post will always appear on the ‘Hot’ page for the subreddit it was submitted to (referred to here as the ‘Main’ page for the subreddit) – votes cast here can help the post to out-score competitors from other subreddits and be shown on the Front page.

Each of these page types was observed at 30-min intervals between August 20th and September 27th 2012. The subreddit-specific versions of the New, Rising, and Main pages were recorded for a variety of subreddits. At each observation point, the posts’ previous numbers of votes and comments were subtracted to yield values which reflect the previous 30 min. These values were modelled in R with a Negative Binomial regression using time since previous observation as an offset and the page(s) a post appeared on at a given observation point as explanatory variables. This allows us to estimate the levels of activity associated with appearance on each page. Table 1 shows details of the model fit for up-votes on the /r/funny (Footnote 2) and /r/worldnews subreddits (both ‘default’ subreddits at the time).
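The differencing step described above can be sketched as follows. The function name `interval_counts` and the snapshot layout are hypothetical, and clamping negative differences to zero is an assumption made here so that the values remain valid counts for a Negative Binomial model. The paper does not describe how such cases were handled.

```python
from math import log


def interval_counts(snapshots):
    """Turn cumulative observations of one post into per-interval counts.

    snapshots: list of (minutes_since_first_observation, cumulative_up_votes),
    sorted by time. Returns one row per pair of consecutive observations.
    """
    rows = []
    for (t0, v0), (t1, v1) in zip(snapshots, snapshots[1:]):
        minutes = t1 - t0
        rows.append({
            # Assumption: negative differences are clamped to zero.
            "new_votes": max(v1 - v0, 0),
            "minutes": minutes,
            # Log of exposure time, the usual form of an offset in a
            # count model with a log link.
            "log_offset": log(minutes),
        })
    return rows
```

Each row would then be paired with indicator variables for the pages on which the post appeared at that observation point before fitting the regression.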

Table 1. Model parameters for a negative binomial regression model of number of new up-votes by page, with an offset for the time since previous observation. In generating estimates for the number of up-votes per page, posts appearing on the Front page are assumed to appear simultaneously on the Main page.

The model for /r/funny posts shows that voting rates were lowest on the New page (0.3 votes per minute, vpm), with appearance on the Rising (4.1 vpm), Main (12.2 vpm) and Front (75.7 vpm) pages being associated with greatly increased voting rates. The /r/worldnews subreddit follows a similar pattern but with lower voting rates on every page – the New (0.01 vpm), Rising (0.04 vpm) and Main (0.5 vpm) pages are particularly disadvantaged as compared to /r/funny, while posts appearing on the Front page (20.2 vpm) suffer less of a deficit. This suggests that /r/worldnews may be a subreddit which many users remain subscribed to but don’t actively engage with – voting on posts only when they encounter them on the Front page. Models of other activity types (down-votes, comments) are not presented here but follow the same pattern [1].
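The size of the /r/worldnews deficit on each page can be checked with simple arithmetic on the rates quoted above. The sketch below uses only those reported figures; the names `VPM` and `relative_rate` are introduced here for illustration.

```python
# Per-page voting rates (vpm) as reported in the text above.
VPM = {
    "/r/funny": {"New": 0.3, "Rising": 4.1, "Main": 12.2, "Front": 75.7},
    "/r/worldnews": {"New": 0.01, "Rising": 0.04, "Main": 0.5, "Front": 20.2},
}


def relative_rate(page: str) -> float:
    """Voting rate on /r/worldnews as a fraction of the /r/funny rate."""
    return VPM["/r/worldnews"][page] / VPM["/r/funny"][page]


for page in ("New", "Rising", "Main", "Front"):
    print(f"{page:<7}{relative_rate(page):.3f}")
```

On the New, Rising and Main pages the /r/worldnews rate is below 5 % of the /r/funny rate, whereas on the Front page it is roughly a quarter, which is the smaller deficit referred to above.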

These voting rates indicate that it is better to conceive of reddit’s voting system as involving a number of stages or hurdles which a post must overcome if it is to earn a place on the Front page – rather than the singular process which the ‘Hot’ ranking algorithm intimates. The function served by the New page is to determine which posts will be displayed on the Rising page – while a post can appear on both pages simultaneously the voting rate on the Rising page is considerably higher. A similar relationship exists between the Rising page and the Hot pages (Main/Front). Posts which appear on the same page are in effect competitors, and only some of these will score well enough to earn a place on the next page in the sequence and the increased voting rate which comes with that territory – propelling them to much higher scores than rival posts which don’t make the transition. The role of the New page as a gatekeeper for the Rising page, low levels of voting on this page, and log10 transformation of voting scores in the Hot algorithm – all combine to give early votes accentuated influence over reddit’s collective decision-making.

3 Redditors’ Submission and Voting Behavior

Reddit have provided access to post submission and voting data for the month of March 2009 – although this data relates to an earlier time in reddit’s history it allows for analyses which are otherwise not possible with data collected through the API (as voting is anonymous). In March 2009, 102,232 users cast 3,346,062 votes (of which 77 % were up-votes) on 352,902 posts. Of these users 33,589 (33 %) only acted once (one vote or submission), whereas the most active user for the month registered 23,776 actions. The distribution of votes and submissions between users is highly skewed and similar to a power law distribution – with 80 % of votes being cast by 11.6 % of users. Users tend to specialize in either voting or submitting. Of the 68,643 users who acted more than once, 68 % only engaged in one form of activity (53 % voters, 47 % submitters). Users who submit posts are largely distinct from users who vote on these posts.
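A concentration figure of the kind quoted above (the share of users who account for 80 % of votes) could be computed along the following lines. This is a sketch under the assumption that per-user vote counts are available as a simple list; the function name is hypothetical and the procedure used in the original analysis is not described in the text.

```python
def share_of_users_for(vote_counts, vote_share=0.8):
    """Smallest fraction of users whose combined votes reach `vote_share`
    of all votes, taking the most active users first."""
    ordered = sorted(vote_counts, reverse=True)
    target = vote_share * sum(ordered)
    running = 0
    for n_users, count in enumerate(ordered, start=1):
        running += count
        if running >= target:
            return n_users / len(ordered)
    return 1.0
```

For example, with illustrative counts of 90, 4, 3, 2 and 1 votes for five users, the single most active user already exceeds 80 % of all votes, so the function returns 0.2.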

The most active voting users diverge from their less active peers by the frequency of their votes, but also by the nature of their votes – to the extent that they can be considered a class of voting ‘superparticipants’ [14]. These voting superparticipants are more likely to cast down-votes, are more likely to vote on fresh posts (first 20 votes cast on a post) and are more likely to vote quickly (voting within 10 s of their previous vote). Figure 1 shows the prevalence of these behaviors in user groups defined by activity level.
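The three behaviors described above can be flagged on a chronological vote log roughly as follows. This is a sketch only: the record layout and the function name `flag_votes` are hypothetical, and treating "within 10 s" as less than or equal to 10 seconds is an assumption.

```python
from collections import defaultdict

QUICK_SECONDS = 10  # 'quick': within 10 s of the same user's previous vote
EARLY_VOTES = 20    # 'early': among the first 20 votes cast on a post


def flag_votes(votes):
    """votes: iterable of (timestamp_seconds, user, post, direction),
    with direction +1 for an up-vote and -1 for a down-vote.
    Returns one dict per vote, in chronological order."""
    last_vote_at = {}
    votes_on_post = defaultdict(int)
    flagged = []
    for ts, user, post, direction in sorted(votes):
        previous = last_vote_at.get(user)
        votes_on_post[post] += 1
        flagged.append({
            "user": user,
            "quick": previous is not None and ts - previous <= QUICK_SECONDS,
            "early": votes_on_post[post] <= EARLY_VOTES,
            "down": direction < 0,
        })
        last_vote_at[user] = ts
    return flagged
```

Averaging each flag per user, and then grouping users by activity level, would give percentages comparable to those plotted in Figure 1.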

Fig. 1. Top: Percentage of a user’s votes which correspond to the three ‘expert’ voting behaviors. Bottom: Percentage of total users and votes for each activity level group.

As noted above, early stage pages (New, Rising) see low voting rates but all posts must progress through these pages to reach higher visibility locations. If a user wished to try and censor reddit one approach would be to down-vote posts when they are fresh and thus prevent them from reaching areas where they will be seen and assessed by many more users. There may be strategic value to voting on the New and Rising pages, and some of the voting superparticipants who are active there may be aiming to maximize their impact on the site or to curb the capacity of certain perspectives to be heard – down-voting posts quickly based on their title alone.

There is however a more innocuous explanation which sees these users as pitching in on the New page and filtering out the many poor submissions so that other users don’t have to sift through these. The argument against this explanation is the prevalence of quick votes (19.5 % of all votes in this month were cast within 10 s of the user’s previous vote) – for most subreddits it is not possible to assess the quality of a post within 10 s and these votes are likely based on the title alone. The outcome is that many posts submitted to default subreddits are quickly discarded when early down-votes from a few users prevent them from progressing to the Rising page.

Digg, reddit’s predecessor as the most active Social News site, was perceived to have problems related to censorship (e.g. the ‘Digg Patriots’, a group which coordinated to bury left-leaning content soon after submission [15]) and to be dominated by an elite of ‘Power Users’ whose submissions often occupied high-attention areas of the site [16].

The voting data for March 2009 allows us to assess whether ‘voting superparticipants’ on reddit fare well when they submit posts. The scores for a user’s posts were first converted into an ordinal variable with 4 levels (<0, 0, 1–20, 21+) and then modelled with the GLLAMM package in Stata using a multi-level ordinal logistic regression. Incorporating a random effect for users allows the model to estimate effects for the explanatory variables without being distorted by individual differences in users’ ability to submit high-scoring posts. Details of the model fit are included in Table 2.
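The conversion of post scores into the four ordinal levels can be written as a short function. The integer codes 0 to 3 are an arbitrary choice made here for illustration; only the cut points come from the text.

```python
def score_level(score: int) -> int:
    """Map a post score onto the four ordinal levels used in the model:
    0 for <0, 1 for exactly 0, 2 for 1-20, 3 for 21 or more."""
    if score < 0:
        return 0
    if score == 0:
        return 1
    if score <= 20:
        return 2
    return 3
```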

Table 2. Model parameters for a multi-level ordinal logistic regression model, with a random effect for Users and Post Score (ordinal) as the response variable (Deviance 552780.7).

Users who submit a high number of posts, and have a high proportion of post submissions (relative to voting activity) tend to submit posts which don’t score well. High levels of user voting are associated with successful post submissions – but this effect is more than offset by the effects of ‘Quick’ and ‘Early’ votes, with the result that users engaging in these behaviors don’t tend to submit high-scoring posts. Users with older accounts, and those who had some activity on a non-default subreddit, tended to submit higher-scoring posts. These effects indicate that posts submitted by users who have more rounded involvement in reddit are more likely to score well. Through the lens of Common Pool Resources [17] – reddit appears resistant to free-riders (those who would consume the attentional resource through their submissions without contributing much of their own attention to the common pool by browsing and voting).

Longitudinal analysis of reddit’s Front page over two years confirms that its content is not dominated by elite users to the same degree as Digg’s was perceived to be at the peak of its popularity [16] – with 55 % of all posts observed on the Front page being submitted by a user who only achieved this once.

4 Reddit and the Stop Online Piracy Act (SOPA)

The above explorations of how reddit functions will now be used to provide context for a case study of reddit’s involvement in the ‘internet campaign’ against SOPA. This case study considered the content, comments and timing of 258 posts about this subject observed on reddit’s default Front page between October 27th 2011 and February 3rd 2012 [1] (Footnote 3).

Posts about SOPA appeared on reddit’s default Front page before the legislation was being covered by newspapers (television news channels didn’t begin to cover the legislation until much later), being regularly mentioned on Twitter, or being regularly searched for on Google [1]. Early posts about SOPA on reddit linked to blog posts (redd.it/lq2b1, Footnote 4) and videos (redd.it/lrv38) about the legislation. It is telling that the 3rd, 4th and 5th posts about SOPA to appear on reddit’s Front page were already moving towards collective action by asking users to sign petitions (redd.it/lvr2x & redd.it/lrux7) or recounting a user’s own experience of writing to their elected representative to express their disapproval (redd.it/lvr2x). The appearance of two posts on the default Front page appears to have raised awareness of SOPA quite broadly among reddit’s users, and the subsequent appearance here of posts calling for action suggests that a consensus quickly emerged that this legislation should be opposed.

From this point, many of the posts about SOPA observed on reddit’s Front page were geared towards raising awareness and spreading opposition beyond the site’s user-base. Two courses of action were particularly effective - a boycott of GoDaddy aiming to change their pro-SOPA stance which emerged from ordinary users in a grass-roots fashion, and reddit’s involvement in the ‘blackout’ of January 18th which was implemented by administrators and moderators. The GoDaddy boycott will be described here.

On December 22nd /u/selfprodigy (Footnote 5) submitted a post to the politics subreddit titled “GoDaddy supports SOPA, I’m transferring 51 domains & suggesting a move your domain day” (redd.it/nmnie). Once this post appeared on the reddit Front page it received considerable attention and ultimately more than 3,000 comments. Popular comment threads were supportive of the proposal, some criticizing GoDaddy on other grounds, some offering advice on how to avoid transferring to a GoDaddy subsidiary, some asking for and receiving evidence of GoDaddy’s support for SOPA. Over the following eight days there were a further 24 posts about the GoDaddy boycott which appeared on reddit’s Front page. The first of these was submitted to /r/technology and linked to the original boycott post on /r/politics (redd.it/nmsiu) – this post served to spread awareness of the boycott among redditors who were not /r/politics subscribers.

On the day after the original post was submitted two posts appeared on the Front page which linked to articles about the boycott published on the Techdirt blog and Ars Technica (redd.it/nn4j5). These articles were a sign that the boycott attempt was being noticed outside reddit, and this feedback was in turn broadcast through the Front page. Later in this eight-day period posts appeared on the Front page which put numbers on the effectiveness of the boycott: a loss of 72,000 domains in a week (redd.it/npmav), 21,000 domains in a single day (redd.it/npj2q). Other posts appearing on the Front page during this period drew attention to the fact that certain organizations (e.g. Wikipedia) had domains registered with GoDaddy – the intention being to suggest that these organizations should transfer their domains. Subsequently there were posts announcing or celebrating when organizations stated that they would join the boycott.

On December 29th a post was submitted and up-voted to the Front page which reminded users that the designated ‘transfer day’ had arrived (redd.it/nuimq); this post was not submitted by the same user who sparked the boycott. Later that day two posts appeared on the Front page through /r/politics (redd.it/nvdf4) and /r/technology (redd.it/nvg18) which detailed a press release from GoDaddy stating that they now actively opposed SOPA – thus drawing the GoDaddy boycott episode to a close. In addition to meeting the aim of changing GoDaddy’s stance, this boycott was also widely covered by the conventional news media (Footnote 6) and online media [12], thus raising awareness of SOPA beyond reddit’s users.

While the GoDaddy boycott was unfolding six posts also appeared on the Front page which proposed boycotting other organizations that supported SOPA, or questioned the emphasis on GoDaddy as the sole target of boycotting (redd.it/nq7cy). Suggested targets for boycotts included movie theatres (redd.it/nokhw), Time Warner (redd.it/nob8i), EA (redd.it/nqumv) and Nintendo (redd.it/nr2m3). These posts are interesting for two reasons. The first of these concerns the expression of dissenting opinions [18]. If one considers only posts about the GoDaddy boycott this idea seems to have emerged largely from a single post and quickly received widespread support from reddit users. The other posts considered here reveal that not all of reddit’s users jumped on the ‘GoDaddy boycott’ bandwagon. These posts which question the strategy of only boycotting GoDaddy appeared on the Front page alongside posts which were supportive of the GoDaddy boycott.

Reddit users, through their use of the voting system, collectively ‘decided’ to broadcast opinions on the Front page which went against the perceived majority opinion. This could happen because redditors (excluding voting superparticipants) generally cast more up-votes than down-votes. To appear on the Front page a post does not necessarily need the backing of a majority of reddit’s users - it only needs to attract more up-votes than the posts which it is directly competing with (i.e. those which it appears alongside on the same page(s)).

The second point of interest here concerns the importance of comments. Although these posts reached the Front page their comments pages hosted discussions which tended to argue against the suggestion made by the post itself. Many popular comments on these posts explain why it is prudent to focus on GoDaddy or criticize the suggested target of the post. It is probably no coincidence that, despite appearing on the Front page, these posts failed to establish the momentum on reddit which was critical to the success of the GoDaddy boycott. This fits with the assertion in Sect. 3 that voting on posts tends to be quite ‘shallow’; in combination with the simplistic ‘Hot’ algorithm and low voting rates on early stage pages, this results in a degree of ‘randomness’ around the decision as to which posts appear on the Front page. The default ‘best’ comment sorting algorithm is more nuanced, using the proportion of up/down votes to generate a Wilson score confidence interval (Footnote 7).
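The lower bound of a Wilson score confidence interval, on which comment sorting of this kind is based, can be sketched as follows. This is the standard formula for a binomial proportion rather than reddit's own code, and the default z value of 1.96 (a 95 % interval) is an assumption, since the text does not state the confidence level that reddit uses.

```python
from math import sqrt


def wilson_lower_bound(ups: int, downs: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for the up-vote proportion.
    z = 1.96 is an assumed confidence level, not taken from the paper."""
    n = ups + downs
    if n == 0:
        return 0.0
    p = ups / n
    centre = p + z * z / (2 * n)
    margin = z * sqrt((p * (1 - p) + z * z / (4 * n)) / n)
    return (centre - margin) / (1 + z * z / n)
```

Because the bound tightens as votes accumulate, a comment with 90 up-votes and 10 down-votes ranks above one with a single up-vote and no down-votes, even though the latter has the higher raw proportion. The ‘Hot’ algorithm, which uses only the net score and age, has no comparable correction.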

There are further examples from the SOPA case study of ‘poor quality’ posts which appeared on the Front page and had popular comments which were fiercely critical. For example, the post redd.it/mgw7f asks users to sign a Whitehouse.gov petition; popular comments are critical of the petition’s grammar and the fact that it calls on the White House to take action which is outside its power. Previous research indicated that once a post appeared on the Front page it was unlikely to receive more down-votes than up-votes: even when all of its high-scoring comments were very critical, the level of up-voting at worst matched the level of down-voting [1]. This suggests a dichotomy among users who vote on Front page posts: those who read and contribute to or vote on their comments pages, and those who do not.

There are also examples which suggest that the ‘Best’ ranking algorithm on the comments pages may allow users to pick out and promote comments which are of high value (e.g. redd.it/lvr2x - a user details their letter to their member of congress and top-scoring comments include some from a ‘former capitol hill staffer’ explaining what happens to such letters and how to make this kind of appeal more effectively).

Reddit’s success in raising awareness of SOPA and facilitating collective action against it can be largely attributed to its capacity to show a very large number of individuals the same set of items. The broadcasting of Front page posts sets the agenda on reddit, and among the many attending users there are often some with particular expertise – post and comment voting can allow contributions from these users to be widely seen, even where the user themselves is not a well-established redditor. In addition to insightful high-scoring comments, reddit’s coverage of this story also incorporated a post by an individual attending a SOPA House Judiciary Committee hearing, providing a running commentary on proceedings and answering the questions of other users (redd.it/ne9zn) – a crowd-sourced on-location reporter.

During the SOPA case study reddit’s users were also able to use the voting tools at their disposal to deal with a ‘false dawn’ in the GoDaddy boycott and to quickly adapt to a rapidly developing story (where the length of postponement of a vote on SOPA was initially over-estimated by a Front page post but quickly corrected by subsequent posts which displaced this on the Front page).

4.1 Collective Action on Reddit

The collective action against SOPA which was incubated on reddit drew on a repertoire similar to that employed by more conventional activism organizations (e.g. petitions, contact with representatives, boycotts). The major contrasts with activism organizations like 38degrees and Moveon [19] are that (1) on reddit users were not dichotomized as ‘leaders’ or ‘followers’, so proposals could come from any user and be circulated widely through the Front page, and (2) reddit set its own agenda for collective action in this case, rather than organizing around an issue which was already in the media spotlight.

In the absence of designated leaders, post and comment voting take on an organizational role. Instead of leaders who have the capacity to reach followers with messages – the messages which reach participants are merely those which score the most highly. This can result in mixed messages and chaotic action, but on the whole reddit’s campaigns against SOPA show a surprising degree of strategy (quickly dropping campaigns which are seen as flawed, focusing on those which show early signs of impact through further Front page posts). The 258 posts about SOPA which appeared on reddit’s default Front page were submitted by 223 different users. 16 users each submitted more than one post which appeared on the Front page, with 51 posts in total being submitted by these users. Aside from two administrator accounts with two posts each – all of the front page posts were submitted by ‘ordinary’ reddit users (i.e. not administrators or moderators). There is very little evidence of an oligarchy of power submitters who dominated reddit’s coverage of this story or coordinated the community’s attempts at collective action.

Reddit’s capacity to set its own agenda for collective action stems from its function as an information broadcasting system – this could be likened to having an ‘in-house’ newspaper with a large readership. Although reddit is a ‘Social News’ site, collective action could emerge and thrive because of its loose specification of which items were eligible for submission. The latest article from Ars Technica could appear alongside a 23-page treatise from a Harvard law professor (redd.it/na3z8), a single-sentence summary by a reddit user (redd.it/nbepe), and the response received by another user from their member of congress (redd.it/lxrpk). All are merely ‘posts’, and have the capacity to appear on the Front page depending on how users vote.

Reddit has not been designed with collective action in mind, and its ‘Social News’ structure imbues attempts at collective action with certain qualities and limitations. Reddit posts are transitory in nature - for the hours they appear on the Front page they are highly visible, but when they slip from this page they become difficult to locate due to the site’s notoriously poor search functionality. The post which proposes or ignites collective action is often edited to direct readers towards more lasting venues where that action can be pursued. These venues include newly created subreddits, websites, Facebook groups and IRC channels. Any user can provide these resources, and those who do so first, or whose resources are utilized, become de facto leaders within the ad hoc group that forms around the endeavor.

5 Reddit and the Fragmentation of Public Discourse

We posit that the speed with which awareness of SOPA could permeate reddit was dependent on the presence of ‘default’ subreddits with large readerships. For the first six weeks of the observation period high-visibility posts about SOPA were largely confined to the /r/technology subreddit – but through the default Front page these were also being shown to many readers with no particular interest in technology. By the end of the observation period posts about SOPA had appeared on the default Front page through 16 of the 20 default subreddits of that time – including humor-oriented subreddits like /r/funny and /r/AdviceAnimals. The variety of subreddits which ‘covered’ this story is evidence that awareness and opposition permeated reddit site-wide. Without this ‘cross-fertilisation’ of subreddits through the default Front page it is unlikely that SOPA opposition would have permeated the site’s community.

Historically, reddit began in 2005 with just a single ‘subreddit’. In 2008, the capacity to create a new subreddit was opened up to all users. This signaled the beginning of a slow but accelerating fragmentation of reddit’s user-base: 240,000 subreddits have now been created by users, and in February 2014 around 400 of these received more than 1,000 posts. This proliferation of subreddits, and the apparently increasing tendency for users to customize their subreddit subscriptions, is one of the most important developments on the site with regard to its broader social impact. In its early years reddit had a strong anti-fragmentation effect – users voted to select a small set of posts from the deluge of incoming submissions and these posts were shown to all users on the site’s Front page. Now, with thousands of active subreddits to choose from, a reddit user can customize the site to generate their own personalized ‘Daily Me’ [18].

The site’s administrators have been influential in this regard. For many years the number of default subreddits was fixed at 12, including /r/reddit.com which served as a ‘miscellaneous’ subreddit. In October 2011 the number of default subreddits was expanded to 25 and /r/reddit.com was decommissioned. In May 2014 the number of default subreddits was expanded again to 50. These changes served to expose reddit users to a much wider variety of subreddits, most likely including some they would have no interest in (therefore pushing users towards managing their subreddit subscriptions and viewing the site while signed into an account). Changes to the set of default subreddits are the primary means through which reddit’s administrators have influenced the site’s development and perceived identity. Reddit also began to display more messages which highlighted the subreddit system and prompted new users to begin managing their subscriptions.

The data which would allow us to quantify the level of fragmentation of reddit’s readership is unavailable, but we can consider the levels of activity which occur on default subreddits (and which can therefore appear on the uniform ‘default Front page’) as compared to non-default subreddits. Figure 2 (Footnote 8) shows the number of posts submitted to default and non-default subreddits by month. There is clearly an increasing proportion of posts being submitted to non-default subreddits, and these posts have no capacity to appear on the default Front page.

Fig. 2. Number of posts submitted to default and non-default subreddits by month, from July 2010 to September 2013.

While the trend may be towards users customizing their subreddit subscriptions and Front page, reddit has also added an ‘/r/all’ option which shows the top-ranked posts from every subreddit (showing the same content to every user). Page view statistics for this page as compared to the ‘default’ and ‘signed in’ versions of the Front page would inform us about whether reddit users choose to embrace the ‘Daily Me’.

6 Conclusions

The majority of posts submitted to reddit’s bustling default subreddits are quickly jettisoned based on the votes of relatively few users – a small group of ‘voting superparticipants’ specialize in this role. There is however no evidence that this group systematically filters out certain types of content, or dominates the website with their own submissions. Taking the post voting system’s structure and usage into account, it is no surprise that it does not result in reliable decisions about which posts to display on the ‘Front page’ [6]. These decisions are however controlled by reddit’s users en masse and posts which appear on the Front page are broadcast to these users – their content and comments influence subsequent voting behavior in a feedback loop.

The reddit of late 2011 could be described as a more or less unified ‘public sphere’ [20] – albeit one which marginalized deliberation in favor of the rapid broadcasting of new information and ideas. In subsequent years this unified public sphere appears to be fragmenting as users increasingly choose a set of subreddits unique to their interests. Some of these subreddits exist for the purpose of criticizing other areas of the site (e.g. /r/ShitRedditSays, /r/SubredditDrama) and can be considered as ‘subaltern counter-publics’ [21].

Many of these newly populated subreddits have comprehensive rules about what can be submitted and moderators who enforce these rules through the deletion of submitted content – a departure from reddit’s earlier history where the ideal of distributed moderation through voting dominated. In the campaign against SOPA, heavily attended default subreddits like /r/technology were instrumental in raising awareness and fomenting opposition broadly among reddit’s users. The loose specification of what could be submitted to these subreddits was also important in allowing these users to transition seamlessly from raising awareness of the legislation to organizing against it. Writing now in early 2015, it appears that the campaign against SOPA represented the high water-mark of reddit’s capacity to unite effectively behind a cause.