Semantic-Based Sensitive Topic Dissemination Control Mechanism for Safe Social Networking
Online Social Networks (OSNs) contain a huge volume of publicly available information shared by their users. Users tend to share sensitive information that can easily be leaked and disclosed to unprivileged users. This suggests that users lack knowledge of the access control mechanisms available to prevent information leakage and protect data privacy. There is a need to automatically detect and protect disclosed information beyond the existing privacy settings offered by OSN service providers. This paper introduces an automatic Semantic-based Sensitive Topic (SST) sanitization mechanism, which considers the user’s relationship strength and semantic access rules concerning the sensitivity of the information shared on Twitter. The interaction documents undergo Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (SST-LDA) clustering to identify sensitive topic clusters. The experimental results show that (i) the topic clusters are discovered with very high accuracy, as measured by cluster entropy, (ii) the probability distribution of the Kullback–Leibler (KL) divergence between sensitive and sanitized Twitter posts shows an information loss of at most 0.24, which is practically acceptable, and (iii) sanitization was tested for 16 sensitive topics among 790 Twitter users, and the approach could be aligned with the advanced privacy settings offered to OSN users in the near future.
Keywords: Social networking, Big data, Topic modeling, Sensitivity analysis, Entropy, KL divergence
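The abstract reports information loss as a KL divergence between sensitive and sanitized posts but does not give the exact formulation. The following is a minimal sketch of one plausible reading, assuming each post is reduced to a smoothed unigram term distribution over a shared vocabulary. The function names, the whitespace tokenization, and the additive smoothing are illustrative assumptions, not the authors' method.

```python
import math
from collections import Counter


def term_distribution(tokens, vocabulary, smoothing=1.0):
    """Additively smoothed unigram probabilities over a shared vocabulary.

    Smoothing (an assumption of this sketch) keeps every probability
    strictly positive so the KL divergence below is always finite.
    """
    counts = Counter(tokens)
    total = len(tokens) + smoothing * len(vocabulary)
    return {term: (counts[term] + smoothing) / total for term in vocabulary}


def kl_divergence(p, q):
    """KL(P || Q) in bits; p and q must be defined on the same terms."""
    return sum(p[t] * math.log2(p[t] / q[t]) for t in p)


def information_loss(original, sanitized):
    """Hypothetical information-loss score between a post and its sanitized form."""
    orig_tokens = original.lower().split()
    sani_tokens = sanitized.lower().split()
    vocabulary = sorted(set(orig_tokens) | set(sani_tokens))
    p = term_distribution(orig_tokens, vocabulary)
    q = term_distribution(sani_tokens, vocabulary)
    return kl_divergence(p, q)


if __name__ == "__main__":
    # Made-up example: a sensitive term is replaced by a generalization.
    post = "my diagnosis is diabetes"
    sanitized_post = "my diagnosis is condition"
    print(information_loss(post, sanitized_post))
```

Under these assumptions an unchanged post scores exactly zero, and the score grows as sanitization replaces more terms. How such a score relates to the 0.24 bound reported above depends on the paper's own definition, which the abstract does not state.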