Managing longitudinal exposure of socially shared data on the Twitter social media

International Journal of Advances in Engineering Sciences and Applied Mathematics

Abstract

On most online social media sites today, user-generated data remains accessible to allowed viewers unless and until the data owner changes her privacy preferences. In this paper, we present a large-scale measurement study focused on understanding how users control the longitudinal exposure of their publicly shared data on social media sites. Our study, using data from Twitter, finds that a significant fraction of users withdraw a surprisingly large percentage of old publicly shared data: more than 28% of 6-year-old public posts (tweets) on Twitter are not accessible today. The inaccessible tweets are either selectively deleted by users or withdrawn when users delete their accounts or make them private. We also found a significant problem with the current exposure control mechanisms: even when a user deletes her tweets or her account, the current mechanisms leave traces of residual activity, i.e., tweets from other users sent as replies to those deleted tweets or accounts remain accessible. We show that, using this residual information, one can recover significant information about the deleted tweets or even characteristics of the deleted accounts. To the best of our knowledge, we are the first to study the information leakage resulting from residual activities of deleted tweets and accounts. Finally, we propose two exposure control mechanisms that eliminate information leakage via residual activities. One of our mechanisms optimizes for allowing meaningful social interactions with user posts, while the other aims to control longitudinal exposure via anonymization. We discuss the merits and drawbacks of our proposed mechanisms compared to existing mechanisms.
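
The withdrawal measurement summarized above can be illustrated with a minimal sketch. It assumes each sampled tweet has already been labeled with one of four hypothetical status strings; the labels, the `SampledTweet` type, and the `withdrawal_breakdown` helper are illustrative names rather than part of the study, and how a label would be derived from an API response is deliberately left out.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical status labels (assumptions made for this sketch only).
ACCESSIBLE = "accessible"
TWEET_DELETED = "tweet_deleted"      # tweet removed, account still public
ACCOUNT_DELETED = "account_deleted"  # whole account removed
ACCOUNT_PRIVATE = "account_private"  # account switched to private
WITHDRAWAL_CAUSES = (TWEET_DELETED, ACCOUNT_DELETED, ACCOUNT_PRIVATE)


@dataclass(frozen=True)
class SampledTweet:
    tweet_id: int
    status: str


def withdrawal_breakdown(tweets):
    """Return (fraction of tweets withdrawn, share of each cause among withdrawn)."""
    counts = Counter(t.status for t in tweets)
    unknown = set(counts) - {ACCESSIBLE, *WITHDRAWAL_CAUSES}
    if unknown:
        raise ValueError(f"unknown status labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        return 0.0, {}
    withdrawn = total - counts[ACCESSIBLE]
    if withdrawn == 0:
        return 0.0, {}
    # Share of each withdrawal cause, relative to withdrawn tweets only.
    shares = {cause: counts[cause] / withdrawn for cause in WITHDRAWAL_CAUSES}
    return withdrawn / total, shares
```

Calling `withdrawal_breakdown` on a labeled sample returns the overall withdrawn fraction together with the split between selective deletion, account deletion, and accounts made private, which are the three causes the abstract distinguishes.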



Notes

  1. This study was conducted respecting the guidelines set by our institute’s ethics board and with their explicit knowledge and permission.

  2. Facebook’s longitudinal exposure control mechanisms are more granular, as observed by previous studies [6, 21]. Facebook users can choose to make their content available only to themselves, to their friends, to subsets of friends, to friends of friends, or to the general public.

  3. https://dev.twitter.com/overview/api/response-codes.

  4. We observed that Twitter provides a tweet in their random sample nearly instantaneously (within seconds) after a user posts the tweet. Consequently, there is at most a minimal chance that a user deleted a tweet even before it could appear in our random sample.

  5. We considered only original tweets (and not retweets) during sampling, since our goal is to understand how many of the tweets originally posted by users are withdrawn today.

  6. We obtained the country of our users by leveraging location data of Twitter users gathered by Kulshrestha et al. [23]. They used the location and timezone fields of the Twitter profile to infer the location of users.

  7. We use a list of English stopwords and a list of Twitter-specific stopwords from [27].

  8. For interested readers who wish to check the resemblance in meaning between the guessed and original tweets, we have made our complete AMT evaluation results available at http://twitter-app.mpi-sws.org/soups2016/amt_guess.html.

  9. For example, Twitter today automatically deletes retweets of a deleted tweet, but not replies or mentions generated by other users.

  10. https://support.twitter.com/articles/20171990.

    https://www.facebook.com/help/437430672945092.

References

  1. Jenkins Jr., H.W.: Google and the search for the future. http://www.wsj.com/articles/SB10001424052748704901104575423294099527212 (2010)

  2. Ayalon, O., Toch, E.: Retrospective privacy: managing longitudinal privacy in online social networks. In: Proceedings of the 9th Symposium on Usable Privacy and Security (SOUPS ’13) (2013)

  3. Bauer, L., Cranor, L.F., Komanduri, S., Mazurek, M.L., Reiter, M.K., Sleeper, M., Ur, B.: The post anachronism: the temporal dimension of Facebook privacy. In: Proceedings of the 12th ACM Workshop on Privacy in the Electronic Society (WPES’13) (2013)

  4. Dey, R., Jelveh, Z., Ross, K.W.: Facebook users have become much more private: a large-scale study. In: Proceedings of the 10th Annual IEEE International Conference on Pervasive Computing and Communications (perCom’12) (2012)

  5. Johnson, M., Egelman, S., Bellovin, S.M.: Facebook and privacy: it’s complicated. In: Proceedings of the 8th Symposium on Usable Privacy and Security (SOUPS’12) (2012)

  6. Liu, Y., Gummadi, K.P., Krishnamurthy, B., Mislove, A.: Analyzing Facebook privacy settings: user expectations vs. reality. In: Proceedings of the 11th ACM/USENIX Internet Measurement Conference (IMC’11) (2011)

  7. Stutzman, F., Gross, R., Acquisti, A.: Silent listeners: the evolution of privacy and disclosure on Facebook. J. Priv. Confid. 4(2), 7–41 (2012)

  8. Bernstein, M.S., Bakshy, E., Burke, M., Karrer, B.: Quantifying the invisible audience in social networks. In: Proceedings of the 31st SIGCHI Conference on Human Factors in Computing Systems (CHI’13) (2013)

  9. Besmer, A., Lipford, H.R.: Moving beyond untagging: photo privacy in a tagged world. In: Proceedings of the 28th SIGCHI Conference on Human Factors in Computing Systems (CHI’10) (2010)

  10. Hoadley, C.M., Xu, H., Lee, J.J., Rosson, M.B.: Privacy as information access and illusory control: the case of the Facebook news feed privacy outcry. Electron. Commer. Res. Appl. 9(1), 50–60 (2010)

  11. Madejski, M., Johnson, M., Bellovin, S.M.: The failure of online social network privacy settings. Technical Report CUCS-010-11, Department of Computer Science, Columbia University (2011)

  12. Mazzia, A., LeFevre, K., Adar, E.: The PViz comprehension tool for social network privacy settings. In: Proceedings of the 8th Symposium on Usable Privacy and Security (SOUPS’12) (2012)

  13. Petrovic, S., Osborne, M., Lavrenko, V.: I wish I didn’t say that! Analyzing and predicting deleted messages in Twitter. CoRR abs/1305.3107 (2013)

  14. Zhou, L., Wang, W., Chen, K.: Tweet properly: analyzing deleted tweets to understand and identify regrettable ones. In: Proceedings of the 25th International Conference on World Wide Web (WWW’16) (2016)

  15. Madden, M., Lenhart, A., Cortesi, S., Gasser, U., Duggan, M., Smith, A., Beaton, M.: Teens, social media, and privacy. http://www.pewinternet.org/2013/05/21/teens-social-media-and-privacy/

  16. Almuhimedi, H., Wilson, S., Liu, B., Sadeh, N., Acquisti, A.: Tweets are forever: a large-scale quantitative analysis of deleted tweets. In: Proceedings of the 16th Conference on Computer Supported Cooperative Work (CSCW’13) (2013)

  17. Jain, P., Kumaraguru, P.: On the dynamics of username changing behavior on Twitter. In: Proceedings of the 3rd IKDD Conference on Data Science (CODS’16) (2016)

  18. Liu, Y., Kliman-Silver, C., Mislove, A.: The tweets they are a-changin’: evolution of Twitter users and behavior. In: Proceedings of the 8th International AAAI Conference on Weblogs and Social Media (ICWSM’14) (2014)

  19. Snapchat: https://www.snapchat.com/ (2016)

  20. Mondal, M., Messias, J., Ghosh, S., Gummadi, K.P., Kate, A.: Forgetting in social media: understanding and controlling longitudinal exposure of socially shared data. In: Proceedings of the 12th Symposium on Usable Privacy and Security (SOUPS’16) (2016)

  21. Mondal, M., Liu, Y., Viswanath, B., Gummadi, K.P., Mislove, A.: Understanding and specifying social access control lists. In: Proceedings of the 10th Symposium on Usable Privacy and Security (SOUPS’14) (2014)

  22. Cha, M., Haddadi, H., Benevenuto, F., Gummadi, K.P.: Measuring user influence in Twitter: the million follower fallacy. In: Proceedings of the 4th AAAI Conference on Weblogs and Social Media (ICWSM’10) (2010)

  23. Kulshrestha, J., Kooti, F., Nikravesh, A., Gummadi, K.P.: Geographic dissection of the Twitter network. In: Proceedings of the 6th International AAAI Conference on Weblogs and Social Media (ICWSM’12) (2012)

  24. Mullen, L.: Predicting gender using historical data. https://cran.r-project.org/web/packages/gender/vignettes/predicting-gender.html (2015)

  25. Sloan, L., Morgan, J., Burnap, P., Williams, M.: Who tweets? Deriving the demographic characteristics of age, occupation and social class from Twitter user meta-data. PLoS ONE 10(3), e0115545 (2015)

  26. Tufekci, Z.: Facebook, youth and privacy in networked publics. In: Proceedings of the 6th International Conference on Weblogs and Social Media (ICWSM’12) (2012)

  27. Zafar, M.B., Bhattacharya, P., Ganguly, N., Gummadi, K.P., Ghosh, S.: Sampling content from online social networks: comparing random vs. expert sampling of the Twitter stream. ACM Trans. Web 9(3), 12:1–12:33 (2015)

  28. Aiello, L.M., Barrat, A., Schifanella, R., Cattuto, C., Benjamin, M., Menczer, F.: Friendship prediction and homophily in social media. ACM Trans. Web 6(2), 1131–1559 (2012)

  29. Thelwall, M.: Homophily in MySpace. J. Am. Soc. Inf. Sci. Technol. 60(2), 219–231 (2009)

  30. Cyber Dust: https://www.cyberdust.com/ (2016)

  31. Team, T.: The streaming APIs. https://dev.twitter.com/streaming/overview

Acknowledgements

This work is an extended version of the paper: Mondal et al. Forgetting in Social Media: Understanding and Controlling Longitudinal Exposure of Socially Shared Data, Proceedings of the 12th Symposium on Usable Privacy and Security (SOUPS’16), Denver, CO, USA, June 2016.

Author information

Corresponding author

Correspondence to Mainack Mondal.

Ethics declarations

Funding

Funding was provided by Max-Planck-Gesellschaft.

About this article

Cite this article

Mondal, M., Messias, J., Ghosh, S. et al. Managing longitudinal exposure of socially shared data on the Twitter social media. Int J Adv Eng Sci Appl Math 9, 238–257 (2017). https://doi.org/10.1007/s12572-017-0196-3

  • DOI: https://doi.org/10.1007/s12572-017-0196-3
