Classifying Online Social Network Users through the Social Graph

Pérez-Solà, Cristina; Herrera-Joancomartí, Jordi

doi:10.1007/978-3-642-37119-6_8

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 7743)

Abstract

In this paper, we address the problem of classifying online social network users using a naively anonymized version of a social graph. We build an initial classifier from two user attributes defined by the graph structure, node degree and clustering coefficient, and then exploit user relationships to build a second classifier. We describe how to combine these two classifiers into an Online Social Network (OSN) user classifier, and we evaluate the performance of our architecture on two different classification problems (one binary and one multiclass) using data extracted from Twitter. Results show that the proposed classifier is sound and that an attacker who obtains a naively anonymized version of the social graph can feasibly solve both classification problems.
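The two structural attributes named in the abstract, node degree and clustering coefficient, can be computed from the graph alone, which is why they remain available after naive anonymization. The following is a minimal sketch, not the authors' implementation: it assumes an undirected graph stored as a dictionary of neighbour sets and uses the standard local clustering coefficient. The function names and the toy graph are hypothetical, and the definitions used in the paper (for example, for Twitter's directed follower relation) may differ.

```python
from itertools import combinations


def node_degree(graph, node):
    """Number of neighbours of `node` in an undirected graph."""
    return len(graph[node])


def clustering_coefficient(graph, node):
    """Fraction of pairs of neighbours of `node` that are linked to each other."""
    neighbours = graph[node]
    k = len(neighbours)
    if k < 2:
        # Assumed convention: with fewer than two neighbours there are no pairs.
        return 0.0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in graph[u])
    return 2.0 * links / (k * (k - 1))


def structural_features(graph):
    """Map each node to its (degree, clustering coefficient) feature pair."""
    return {
        node: (node_degree(graph, node), clustering_coefficient(graph, node))
        for node in graph
    }


# Hypothetical toy graph: anonymized node ids mapped to their neighbour sets.
toy_graph = {
    0: {1, 2, 3},
    1: {0, 2},
    2: {0, 1},
    3: {0},
}

features = structural_features(toy_graph)
```

Feature pairs of this kind could then be fed to any standard classifier; which classifier the authors use, and how it is combined with the relational one, is not stated in the abstract.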

This work was partially supported by the Spanish MCYT and the FEDER funds under grants TSI2007-65406-C03-03 “E-AEGIS”, TIN2010-15764 “N-KHRONOUS”, and CONSOLIDER CSD2007-00004 “ARES”.

Author information

Authors and Affiliations

Dept. d’Enginyeria de la Informació i les Comunicacions, Universitat Autònoma de Barcelona, 08193, Bellaterra, Catalonia, Spain
Cristina Pérez-Solà & Jordi Herrera-Joancomartí
Internet Interdisciplinary Institute (IN3), UOC, Spain
Jordi Herrera-Joancomartí

Editor information

Editors and Affiliations

TELECOM SudParis, 9 rue Charles Fourier, 91011, Evry, CEDEX, France
Joaquin Garcia-Alfaro
TELECOM Bretagne, 2 rue de la Châtaigneraie, 35512, Cesson Sévigné, CEDEX, France
Frédéric Cuppens & Nora Cuppens-Boulahia
Ryerson University, 245 Church Street, M5B 2K3, Toronto, ON, Canada
Ali Miri
Université Laval, 1065 avenue de la médecine, G1V 0A6, Quebec, Canada
Nadia Tawbi

About this paper

Cite this paper

Pérez-Solà, C., Herrera-Joancomartí, J. (2013). Classifying Online Social Network Users through the Social Graph. In: Garcia-Alfaro, J., Cuppens, F., Cuppens-Boulahia, N., Miri, A., Tawbi, N. (eds) Foundations and Practice of Security. FPS 2012. Lecture Notes in Computer Science, vol 7743. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37119-6_8

DOI: https://doi.org/10.1007/978-3-642-37119-6_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37118-9
Online ISBN: 978-3-642-37119-6
eBook Packages: Computer Science, Computer Science (R0)
