Two Sides of a Coin: Separating Personal Communication and Public Dissemination Accounts in Twitter

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8443)

Abstract

There are millions of accounts on Twitter. In this paper, we categorize Twitter accounts into two types: Personal Communication Accounts (PCAs) and Public Dissemination Accounts (PDAs). PCAs are accounts operated by individuals and used to express those individuals’ thoughts and feelings. PDAs, on the other hand, are accounts owned by non-individuals such as companies and governments. Generally, tweets from PDAs (i) disseminate a specific type of information (e.g., job openings, shopping deals, car accidents) rather than sharing an individual’s personal life; and (ii) may be produced by non-human entities (e.g., bots). We aim to develop techniques for identifying PDAs so as to (i) help social scientists reduce “noise” in their studies of human behavior, and (ii) index PDAs for potential recommendation to users looking for specific types of information. Through analysis, we find that these two types of accounts follow different temporal, spatial, and textual patterns. Accordingly, we develop probabilistic models based on these features to identify PDAs. We also conduct a series of experiments to evaluate these algorithms for cleaning the Twitter data stream.
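The abstract does not give the features or models in detail, so the following Python sketch is only an illustration of the general idea of scoring an account with a probabilistic model over a temporal feature. It is not the authors' method. The feature (entropy of posting hours), the one-feature Gaussian class models, and the names hour_entropy, gaussian_log_pdf, and pda_posterior are all assumptions introduced here, and the parameter values in the usage lines are hypothetical, not results from the paper.

```python
import math
from collections import Counter


def hour_entropy(hours):
    """Shannon entropy (in bits) of the hour-of-day distribution of an
    account's tweets. An assumed temporal feature, not taken from the paper."""
    counts = Counter(h % 24 for h in hours)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def gaussian_log_pdf(x, mean, std):
    """Log density of a normal distribution with the given mean and std."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def pda_posterior(x, pda_params, pca_params, prior_pda=0.5):
    """P(PDA | x) by Bayes' rule under one-feature Gaussian class models.

    pda_params and pca_params are (mean, std) pairs; in practice they would
    be estimated from labeled accounts. The values used below are made up.
    """
    log_pda = math.log(prior_pda) + gaussian_log_pdf(x, *pda_params)
    log_pca = math.log(1.0 - prior_pda) + gaussian_log_pdf(x, *pca_params)
    # Subtract the larger log score before exponentiating for stability.
    m = max(log_pda, log_pca)
    num = math.exp(log_pda - m)
    return num / (num + math.exp(log_pca - m))


# Hypothetical usage: posting hours of one account and made-up class parameters.
feature = hour_entropy([0, 6, 12, 18])
score = pda_posterior(feature, pda_params=(1.0, 1.0), pca_params=(3.0, 1.0))
```

A fuller model along these lines would combine several temporal, spatial, and textual features, which is the combination the abstract describes; the single feature here only keeps the sketch short.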


Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Yin, P., Ram, N., Lee, WC., Tucker, C., Khandelwal, S., Salathé, M. (2014). Two Sides of a Coin: Separating Personal Communication and Public Dissemination Accounts in Twitter. In: Tseng, V.S., Ho, T.B., Zhou, ZH., Chen, A.L.P., Kao, HY. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2014. Lecture Notes in Computer Science, vol 8443. Springer, Cham. https://doi.org/10.1007/978-3-319-06608-0_14

  • DOI: https://doi.org/10.1007/978-3-319-06608-0_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-06607-3

  • Online ISBN: 978-3-319-06608-0

  • eBook Packages: Computer Science, Computer Science (R0)
