Privacy Risk Assessment of Individual Psychometric Profiles

Mariani, Giacomo; Monreale, Anna; Naretto, Francesca

doi:10.1007/978-3-030-88942-5_32

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12986)

Included in the following conference series:

International Conference on Discovery Science

Abstract

In the modern Internet era, the use of social media such as Twitter and Facebook is constantly increasing. These platforms accumulate large amounts of textual data, because individuals often use them to share their experiences and personal facts in text messages. These data carry information about individual psychological aspects and may represent a valuable alternative to classical clinical texts. In many studies, text messages are used to extract individuals' psychometric profiles, which help in analysing the psychological behaviour of users. Unfortunately, both text messages and psychometric profiles may reveal personal and sensitive information about users, leading to privacy violations. Therefore, in this paper, we propose a study of privacy risk for psychometric profiles: we empirically analyse the privacy risk of different aspects of the psychometric profiles, identifying which psychological facts expose users to identity disclosure.

Notes

1. The implementation of these attacks, written in Python 3.7, is available on GitHub: https://github.com/karjudev/text-privacy. The experiments were run on a server with 16x Intel(R) Xeon(R) Gold 5120 CPU @ 2.20 GHz (64 bits) and 63 GB RAM.
2. Composed of 517,401 messages from 158 different authors.
3. 482,117 messages from 20,192 authors.
4. 230,571 messages from 20,192 authors.
5. 176,243 messages from 6,410 authors.
6. 176,207 messages from 6,410 authors.
7. The number is obtained using the Sturges formula [22].
8. Colloquial terms are: “mom” or “dad” and “mate” or “buddy”.
9. Words like “think”, “know”, “always”, “never” and “should”.
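
One of the notes above refers to the Sturges formula [22]. The following is a minimal sketch of that rule, assuming that "the number" in the note is a count of class intervals (bins) and that the common rounded form k = 1 + ceil(log2(n)) is meant. The function name sturges_bins and the choice of the author counts from the notes as sample sizes are illustrative assumptions, not taken from the paper or its repository.

```python
import math


def sturges_bins(n_samples: int) -> int:
    """Return the number of class intervals suggested by Sturges' rule.

    Uses the common rounded form k = 1 + ceil(log2(n)).
    The name and signature are hypothetical, chosen for illustration.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be a positive integer")
    return 1 + math.ceil(math.log2(n_samples))


if __name__ == "__main__":
    # Illustrative inputs only: author counts quoted in the notes.
    # Which quantity the paper actually feeds into the formula is not stated here.
    for n_authors in (158, 20_192, 6_410):
        print(n_authors, sturges_bins(n_authors))
```

Under these assumptions the rule gives 9 intervals for 158 samples, 16 for 20,192 and 14 for 6,410. The count grows only logarithmically with the sample size, so even large datasets yield a small number of intervals.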

References

1. Abul, O., Bonchi, F., Nanni, M.: Anonymization of moving objects databases by clustering and perturbation. Inf. Syst. 35, 884–910 (2010)
2. Anandan, B., Clifton, C.: Significance of term relationships on anonymization. In: Web Intelligence/IAT Workshops (2011)
3. Chakaravarthy, V.T., Gupta, H., Roy, P., Mohania, M.K.: Efficient techniques for document sanitization. In: CIKM (2008)
4. Choudhury, M., Counts, S., Horvitz, E.: Predicting postpartum changes in emotion and behavior via social media. In: Conference on Human Factors in Computing Systems - Proceedings (2013)
5. Crossley, S., Kyle, K., McNamara, D.: Sentiment analysis and social cognition engine (SEANCE): an automatic tool for sentiment, social cognition, and social-order analysis. Behav. Res. Methods 49, 803–821 (2017)
6. Cumby, C.M., Ghani, R.: A machine learning based system for semi-automatically redacting documents. In: IAAI (2011)
7. Deng, M., et al.: A privacy threat analysis framework: supporting the elicitation and fulfillment of privacy requirements. Requir. Eng. 16, 3–32 (2011)
8. Klimt, B., Yang, Y.: Introducing the Enron corpus. In: CEAS (2004)
9. Li, Y., Baldwin, T., Cohn, T.: Towards robust and privacy-preserving text representations. In: ACL, no. 2 (2018)
10. Pellungrini, R., Monreale, A., Guidotti, R.: Privacy risk for individual basket patterns. In: MIDAS/PAP@PKDD/ECML (2018)
11. Pellungrini, R., Pappalardo, L., Pratesi, F., Monreale, A.: Analyzing privacy risk in human mobility data. In: STAF Workshops (2018)
12. Pellungrini, R., Pratesi, F., Pappalardo, L.: Assessing privacy risk in retail data. In: PAP@PKDD/ECML (2017)
13. Pennebaker, J.W., Boyd, R.L., Jordan, K., Blackburn, K.: The development and psychometric properties of LIWC2015. Technical report (2015)
14. Pensa, R.G., di Blasi, G.: A semi-supervised approach to measuring user privacy in online social networks. In: DS (2016)
15. del Pilar Salas-Zárate, M., et al.: A study on LIWC categories for opinion mining in Spanish reviews. J. Inf. Sci. 40, 749–760 (2014)
16. Pratesi, F., Gabrielli, L., Cintia, P., Monreale, A., Giannotti, F.: PRIMULE: privacy risk mitigation for user profiles. Data Knowl. Eng. 125, 101786 (2020)
17. Pratesi, F., Monreale, A., Giannotti, F., Pedreschi, D.: Privacy preserving multidimensional profiling. In: GOODTECHS (2017)
18. Pratesi, F., et al.: Prudence: a system for assessing privacy risk vs utility in data sharing ecosystems. Trans. Data Priv. (2018)
19. Sánchez, D., Batet, M.: Toward sensitive document release with privacy guarantees. Eng. Appl. Artif. Intell. 59, 23–34 (2017)
20. Shen, J.H., Rudzicz, F.: Detecting anxiety through Reddit. In: Proceedings of the Fourth Workshop on Computational Linguistics and Clinical Psychology - From Linguistic Signal to Clinical Reality (2017)
21. Shrestha, A., Spezzano, F., Joy, A.: Detecting fake news spreaders in social networks via linguistic and personality features. In: CLEF (Working Notes) (2020)
22. Sturges, H.A.: The choice of a class interval. J. Am. Stat. Assoc. 21, 65–66 (1926)
23. Tadesse, M.M., Lin, H., Xu, B., Yang, L.: Detection of depression-related posts in Reddit social media forum. IEEE Access 7, 44883–44893 (2019)
24. Xiao, Y., Xiong, L.: Protecting locations with differential privacy under temporal correlations. In: CCS (2015)

Acknowledgment

This work is partially supported by the European Community H2020 programme under the funding schemes: H2020-INFRAIA-2019-1: Research Infrastructure G.A. 871042 SoBigData++ (sobigdata.eu), G.A. 952215 TAILOR, G.A. 952026 Humane AI NET (humane-ai.eu).

Author information

Authors and Affiliations

University of Pisa, Pisa, Italy
Giacomo Mariani & Anna Monreale
Scuola Normale Superiore, Pisa, Italy
Francesca Naretto

Corresponding author

Correspondence to Anna Monreale.

Editor information

Editors and Affiliations

Universidade do Porto and Fraunhofer Portugal AICOS, Porto, Portugal
Carlos Soares
Dalhousie University, Halifax, NS, Canada
Luis Torgo

About this paper

Cite this paper

Mariani, G., Monreale, A., Naretto, F. (2021). Privacy Risk Assessment of Individual Psychometric Profiles. In: Soares, C., Torgo, L. (eds) Discovery Science. DS 2021. Lecture Notes in Computer Science, vol 12986. Springer, Cham. https://doi.org/10.1007/978-3-030-88942-5_32

DOI: https://doi.org/10.1007/978-3-030-88942-5_32
Published: 09 October 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-88941-8
Online ISBN: 978-3-030-88942-5
eBook Packages: Computer Science, Computer Science (R0)
