A Hybrid Model for Linking Multiple Social Identities Across Heterogeneous Online Social Networks

  • Conference paper
SOFSEM 2017: Theory and Practice of Computer Science (SOFSEM 2017)

Abstract

Automated online profiling consists of accurately identifying and linking multiple online identities, spread across heterogeneous online social networks, that correspond to the same entity in the physical world. The paper proposes a hybrid profile correlation model that relies on a range of techniques from different application domains, such as record linkage and data integration, image and text similarity, and machine learning. It involves distance-based comparison methods and exploits the information produced by an identification process on one social network as external knowledge for searches on other social networks; thus, the remaining identification tasks for the same individual are optimized. The experimental study shows that, even with limited resources, the proposed method collects and combines accurate information effectively from different online sources in a fully automated way. The mined knowledge then becomes a powerful toolkit for carrying out social engineering and other attacks, or for profit-driven and decision-making data mining.
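To make the idea of distance-based profile comparison more concrete, the following is a minimal sketch and not the authors' model. It assumes each profile is a plain dictionary with a hypothetical `name` string and a hypothetical `tokens` set (for example, location or occupation keywords). The name similarity comes from Python's `difflib.SequenceMatcher`, used here as a stand-in for a record-linkage string comparator, and the token overlap is a Jaccard coefficient. The weights and the threshold are illustrative assumptions only. The `enrich` helper loosely mirrors the abstract's notion of reusing knowledge gained on one network when searching another.

```python
from difflib import SequenceMatcher


def jaccard(a, b):
    """Jaccard coefficient of two token collections (0.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def name_similarity(x, y):
    """Case-insensitive string similarity in [0, 1] (stand-in comparator)."""
    return SequenceMatcher(None, x.lower(), y.lower()).ratio()


def profile_score(p, q, w_name=0.6, w_tokens=0.4):
    """Weighted combination of name and token similarity (weights are assumed)."""
    return (w_name * name_similarity(p["name"], q["name"])
            + w_tokens * jaccard(p["tokens"], q["tokens"]))


def best_match(seed, candidates, threshold=0.7):
    """Return the highest-scoring candidate, or None if none reaches the threshold."""
    if not candidates:
        return None
    best = max(candidates, key=lambda c: profile_score(seed, c))
    return best if profile_score(seed, best) >= threshold else None


def enrich(seed, matched):
    """Merge the matched profile's tokens into the seed for later searches."""
    return {"name": seed["name"],
            "tokens": set(seed["tokens"]) | set(matched["tokens"])}


# Hypothetical example data, not taken from the paper.
seed = {"name": "Alice Example", "tokens": {"athens", "engineer"}}
candidates = [
    {"name": "Alice Example", "tokens": {"athens", "engineer", "chess"}},
    {"name": "Bob Other", "tokens": {"paris"}},
]

match = best_match(seed, candidates)
if match is not None:
    seed = enrich(seed, match)  # seed now also carries the token "chess"
```

In this sketch the enriched seed would be used for the next network's search, so that attributes discovered on the first network can help disambiguate candidates on the second.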

Notes

  1. http://www.gpeters.com/names/.

  2. http://www.geonames.org/.

  3. https://newsroom.fb.com/company-info/.

  4. Refer to e.g. the following list of over 200 OSNs at the time of writing: https://en.wikipedia.org/wiki/List_of_social_networking_websites.

Author information

Corresponding author

Correspondence to Theodoros Tzouramanis.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Kokkos, A., Tzouramanis, T., Manolopoulos, Y. (2017). A Hybrid Model for Linking Multiple Social Identities Across Heterogeneous Online Social Networks. In: Steffen, B., Baier, C., van den Brand, M., Eder, J., Hinchey, M., Margaria, T. (eds) SOFSEM 2017: Theory and Practice of Computer Science. SOFSEM 2017. Lecture Notes in Computer Science, vol 10139. Springer, Cham. https://doi.org/10.1007/978-3-319-51963-0_33

  • DOI: https://doi.org/10.1007/978-3-319-51963-0_33

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-51962-3

  • Online ISBN: 978-3-319-51963-0

  • eBook Packages: Computer Science (R0)
