Online Social Network Profile Linkage Based on Cost-Sensitive Feature Acquisition

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 489)

Abstract

Billions of people spend their virtual lives on hundreds of social networking sites to meet different social needs. Each footprint a person leaves on a particular social networking site reflects particular aspects of that person. To adequately investigate a user’s preferences for applications such as recommendation and executive search, we need to connect all these aspects into a comprehensive profile of the identity. Profile linkage provides an effective way to identify the profiles that belong to the same identity across different social networks.

Because profiles contain various types of resources, comparing them may require many features that are expensive and time-consuming to acquire, such as avatars. To speed up online social network profile linkage, we propose a cost-sensitive approach that acquires these expensive and time-consuming features only when needed. Evaluated on real-world datasets from Twitter and LinkedIn, our approach achieves an F1-measure of over 85% and prunes over 80% of the unnecessary feature acquisitions, at a marginal cost of 10% performance loss.
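The general idea of acquiring expensive features only when needed can be illustrated with a minimal sketch. This is not the method described in the paper, whose details are not given in the abstract; it is a simple stand-in that acquires features in order of increasing cost and stops as soon as a running similarity score leaves an uncertainty band. The `Feature` class, the `low`/`high` thresholds, the costs, the weights, and the `avatar_hash` field are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable, Dict, List, Tuple

Profile = Dict[str, str]


@dataclass
class Feature:
    name: str
    cost: float    # hypothetical acquisition cost (e.g. seconds or API calls)
    weight: float  # hypothetical positive weight of the feature in the score
    similarity: Callable[[Profile, Profile], float]  # value in [0, 1]


def username_similarity(a: Profile, b: Profile) -> float:
    """Cheap feature: case-insensitive string similarity of usernames."""
    ua = a.get("username", "").lower()
    ub = b.get("username", "").lower()
    return SequenceMatcher(None, ua, ub).ratio()


def avatar_similarity(a: Profile, b: Profile) -> float:
    """Stand-in for an expensive image comparison (assumption: we only
    compare a precomputed hash instead of downloading and matching images)."""
    ha, hb = a.get("avatar_hash"), b.get("avatar_hash")
    return 1.0 if ha is not None and ha == hb else 0.0


def link_profiles(
    a: Profile,
    b: Profile,
    features: List[Feature],
    low: float = 0.3,   # at or below this score: confident non-match
    high: float = 0.7,  # at or above this score: confident match
) -> Tuple[bool, float, List[str], float]:
    """Acquire features cheapest-first; stop once the score is confident.

    Returns (is_match, score, names of acquired features, total cost spent).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    spent = 0.0
    used: List[str] = []
    score = 0.5  # no evidence yet
    for f in sorted(features, key=lambda f: f.cost):
        weighted_sum += f.weight * f.similarity(a, b)
        weight_total += f.weight
        spent += f.cost
        used.append(f.name)
        score = weighted_sum / weight_total
        if score <= low or score >= high:
            break  # confident enough: skip the remaining, costlier features
    return score >= (low + high) / 2, score, used, spent


features = [
    Feature("avatar", cost=100.0, weight=1.0, similarity=avatar_similarity),
    Feature("username", cost=1.0, weight=1.0, similarity=username_similarity),
]
is_match, score, used, spent = link_profiles(
    {"username": "jane_doe"}, {"username": "Jane_Doe"}, features
)
print(is_match, used, spent)
```

In this sketch the avatar comparison is skipped whenever the username alone is decisive, which is the kind of pruning the abstract refers to. A stopping rule of this kind trades some accuracy for lower acquisition cost, and the thresholds would need to be tuned on labeled data.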

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhang, H., Kan, M., Liu, Y., Ma, S. (2014). Online Social Network Profile Linkage Based on Cost-Sensitive Feature Acquisition. In: Huang, H., Liu, T., Zhang, HP., Tang, J. (eds) Social Media Processing. SMP 2014. Communications in Computer and Information Science, vol 489. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45558-6_11

  • DOI: https://doi.org/10.1007/978-3-662-45558-6_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-45557-9

  • Online ISBN: 978-3-662-45558-6

  • eBook Packages: Computer Science, Computer Science (R0)