Classifying Online Social Network Users through the Social Graph

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 7743)

Abstract

In this paper, we address the problem of classifying online social network (OSN) users from a naively anonymized version of the social graph. We build an initial classifier from two user attributes defined by the graph structure, node degree and clustering coefficient, and then exploit user relationships to build a second classifier. We describe how to combine these two classifiers into a single OSN user classifier, and we evaluate the resulting architecture on two classification problems, one binary and one multiclass, using data extracted from Twitter. Results show that the proposed classifier is sound and that an attacker who obtains a naively anonymized version of the social graph can feasibly solve both classification problems.
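
As an illustration of the two structural attributes named in the abstract, the following is a minimal sketch of how node degree and the local clustering coefficient could be computed from an edge list. It is not the authors' implementation: it assumes an undirected simple graph (the abstract does not say how edge direction is handled), and the function names `build_adjacency` and `node_features` and the toy edge list are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_adjacency(edges):
    """Return an undirected adjacency map {node: set(neighbors)} from (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops (assumption: simple graph)
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def node_features(adj):
    """Return {node: (degree, local clustering coefficient)}."""
    features = {}
    for node, neighbors in adj.items():
        k = len(neighbors)
        if k < 2:
            # Fewer than two neighbors: no neighbor pair exists, so use 0.0.
            features[node] = (k, 0.0)
            continue
        # Count the edges that exist between pairs of this node's neighbors.
        links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
        # Fraction of the k*(k-1)/2 possible neighbor pairs that are connected.
        features[node] = (k, 2.0 * links / (k * (k - 1)))
    return features


# Toy graph (hypothetical data): a triangle 0-1-2 with a pendant node 3 attached to 2.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
features = node_features(build_adjacency(edges))
```

Each node's `(degree, clustering coefficient)` pair could then serve as the feature vector for an initial classifier; node labels are not needed to compute it, only the graph structure.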

This work was partially supported by the Spanish MCYT and the FEDER funds under grants TSI2007-65406-C03-03 “E-AEGIS”, TIN2010-15764 “N-KHRONOUS”, and CONSOLIDER CSD2007-00004 “ARES”.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Pérez-Solà, C., Herrera-Joancomartí, J. (2013). Classifying Online Social Network Users through the Social Graph. In: Garcia-Alfaro, J., Cuppens, F., Cuppens-Boulahia, N., Miri, A., Tawbi, N. (eds) Foundations and Practice of Security. FPS 2012. Lecture Notes in Computer Science, vol 7743. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37119-6_8

  • DOI: https://doi.org/10.1007/978-3-642-37119-6_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-37118-9

  • Online ISBN: 978-3-642-37119-6

  • eBook Packages: Computer Science (R0)
