A Generalized Approach for Social Network Integration and Analysis with Privacy Preservation

Yang, Chris; Thuraisingham, Bhavani

doi:10.1007/978-3-642-40837-3_8

Part of the book series: Studies in Big Data (SBD, volume 1)

Abstract

Social network analysis is very useful in discovering the knowledge embedded in social network structures, and it is applicable in many practical domains, including homeland security, public safety, epidemiology, public health, electronic commerce, marketing, and social science. However, social network data is usually distributed, and no single organization is able to capture the global social network. For example, a law enforcement unit in Region A has the criminal social network data of its own region; similarly, another law enforcement unit in Region B has the criminal social network data of Region B. Unfortunately, due to privacy concerns, these law enforcement units may not be allowed to share the data, and therefore neither of them can benefit from analyzing the integrated social network that combines the data from the social networks in Region A and Region B. In this chapter, we discuss aspects of sharing the insensitive and generalized information of social networks to support social network analysis while preserving privacy at the same time. We discuss the generalization approach to constructing a generalized social network in which only insensitive and generalized information is shared. We also discuss the integration of the generalized information and how it can satisfy a prescribed level of privacy leakage tolerance, which is measured independently of the privacy-preserving techniques.

Author information

Authors and Affiliations

Drexel University, Philadelphia, PA, USA
Chris Yang
The University of Texas at Dallas, Richardson, TX, USA
Bhavani Thuraisingham

Corresponding author

Correspondence to Chris Yang.

Editor information

Editors and Affiliations

Department of Computer Science, University of California, Los Angeles, USA
Wesley W. Chu

About this chapter

Cite this chapter

Yang, C., Thuraisingham, B. (2014). A Generalized Approach for Social Network Integration and Analysis with Privacy Preservation. In: Chu, W. (ed.) Data Mining and Knowledge Discovery for Big Data. Studies in Big Data, vol 1. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40837-3_8

DOI: https://doi.org/10.1007/978-3-642-40837-3_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40836-6
Online ISBN: 978-3-642-40837-3
eBook Packages: Engineering, Engineering (R0)
