Understanding Email Writers: Personality Prediction from Email Messages

Shen, Jianqiang; Brdiczka, Oliver; Liu, Juan

doi:10.1007/978-3-642-38844-6_29

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7899)

Included in the following conference series:

International Conference on User Modeling, Adaptation, and Personalization

Abstract

Email is a ubiquitous communication tool and constitutes a significant portion of social interactions. In this paper, we attempt to infer the personality of users based on the content of their emails. Such inference can enable valuable applications such as better personalization, recommendation, and targeted advertising. Considering the private and sensitive nature of email content, we propose a privacy-preserving approach for collecting email and personality data. We then frame personality prediction based on the well-known Big Five personality model and train predictors based on extracted email features. We report the prediction performance of three generative models that make different assumptions. Our results show that personality prediction is feasible, and that our email feature set can predict personality with reasonable accuracy.

Author information

Authors and Affiliations

Palo Alto Research Center, 3333 Coyote Hill Road, Palo Alto, CA, 94304, USA
Jianqiang Shen, Oliver Brdiczka & Juan Liu

Editor information

Editors and Affiliations

Department of Computer and Information Science, University of Delaware, 19716, Delaware, DE, USA
Sandra Carberry
School of Computing, National College of Ireland, Mayor Street, IFSC, Dublin 1, Ireland
Stephan Weibelzahl
Dipartimento di Ingegneria, Roma Tre University, Via della Vasca Navale, 79, 00146, Rome, Italy
Alessandro Micarelli
Department of Computer Science, University of Bari Aldo Moro, 70126, Bari, Italy
Giovanni Semeraro

About this paper

Cite this paper

Shen, J., Brdiczka, O., Liu, J. (2013). Understanding Email Writers: Personality Prediction from Email Messages. In: Carberry, S., Weibelzahl, S., Micarelli, A., Semeraro, G. (eds) User Modeling, Adaptation, and Personalization. UMAP 2013. Lecture Notes in Computer Science, vol 7899. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38844-6_29

DOI: https://doi.org/10.1007/978-3-642-38844-6_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38843-9
Online ISBN: 978-3-642-38844-6
eBook Packages: Computer Science, Computer Science (R0)
