The Anatomy of Reddit: An Overview of Academic Research

  • Conference paper
Dynamics On and Of Complex Networks III (DOOCN 2017)

Part of the book series: Springer Proceedings in Complexity ((SPCOM))

Abstract

Online forums provide rich environments where users may post questions and comments about different topics. Understanding how people behave in online forums may shed light on the fundamental mechanisms by which collective thinking emerges in a group of individuals, but it also has important practical applications, for instance, to improve user experience, increase engagement, or automatically identify bullying. Importantly, the datasets generated by user activity are often openly available to researchers, in contrast to other sources of data in computational social science. In this survey, we map the main research directions that have arisen in recent years, focusing primarily on the most popular platform, Reddit. We distinguish and categorize research according to whether it focuses on the posts or on the users, and point to different types of methodologies for extracting information from the structure and dynamics of the system. We emphasize the diversity and richness of the research in terms of questions and methods and suggest future avenues of research.

Notes

  1. https://en.wikipedia.org/wiki/Reddit

  2. https://en.wikipedia.org/wiki/Slashdot

  3. https://en.wikipedia.org/wiki/Hacker_News

  4. https://en.wikipedia.org/wiki/Digg

  5. https://praw.readthedocs.io/en/latest

  6. The authors used karmadecay.com, a reverse image search tool specifically designed for Reddit.

Acknowledgements

This work was supported by the Concerted Research Action (ARC) funded by the Federation Wallonia-Brussels, Contract ARC 14/19-060; by the Flagship European Research Area Network (FLAG-ERA) Joint Transnational Call “FuturICT 2.0”; and by grant 16-01-00499 of the Russian Foundation for Basic Research.

Author information

Corresponding author

Correspondence to Jean-Charles Delvenne.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Medvedev, A.N., Lambiotte, R., Delvenne, JC. (2019). The Anatomy of Reddit: An Overview of Academic Research. In: Ghanbarnejad, F., Saha Roy, R., Karimi, F., Delvenne, JC., Mitra, B. (eds) Dynamics On and Of Complex Networks III. DOOCN 2017. Springer Proceedings in Complexity. Springer, Cham. https://doi.org/10.1007/978-3-030-14683-2_9