Understanding Email Writers: Personality Prediction from Email Messages

  • Conference paper
User Modeling, Adaptation, and Personalization (UMAP 2013)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7899)

Abstract

Email is a ubiquitous communication tool and accounts for a significant portion of social interactions. In this paper, we attempt to infer the personality of users from the content of their emails. Such inference can enable valuable applications such as better personalization, recommendation, and targeted advertising. Given the private and sensitive nature of email content, we propose a privacy-preserving approach for collecting email and personality data. We then frame personality prediction in terms of the well-known Big Five personality model and train predictors on extracted email features. We report the prediction performance of three generative models with different assumptions. Our results show that personality prediction is feasible and that our email feature set can predict personality with reasonable accuracy.
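
The framing described in the abstract can be illustrated with a minimal sketch. It assumes each Big Five trait is treated as an independent yes/no label and that emails are reduced to bag-of-words counts. The multinomial naive Bayes classifier is a stand-in generative model chosen for brevity, not one of the three models evaluated in the paper, and every name, label, and email text below is hypothetical toy data.

```python
import math
from collections import Counter

# The five traits of the Big Five model.
TRAITS = ["openness", "conscientiousness", "extraversion",
          "agreeableness", "neuroticism"]


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real features would be richer)."""
    return text.lower().split()


class BinaryNaiveBayes:
    """Multinomial naive Bayes for a single yes/no label, with add-one smoothing."""

    def fit(self, docs, labels):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = Counter(labels)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        return self

    def _log_score(self, tokens, label):
        total_docs = sum(self.doc_counts.values())
        # Smoothed log prior of the class.
        score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
        counts = self.word_counts[label]
        denom = sum(counts.values()) + len(self.vocab)
        for tok in tokens:
            if tok in self.vocab:  # words never seen in training are ignored
                score += math.log((counts[tok] + 1) / denom)
        return score

    def predict(self, doc):
        tokens = tokenize(doc)
        return self._log_score(tokens, True) > self._log_score(tokens, False)


def train_trait_models(emails, trait_labels):
    """Train one independent binary classifier per trait."""
    return {
        trait: BinaryNaiveBayes().fit(emails, [lab[trait] for lab in trait_labels])
        for trait in TRAITS
    }


def predict_profile(models, email):
    """Return a trait -> bool mapping for one email."""
    return {trait: model.predict(email) for trait, model in models.items()}


def only(*active):
    """Build a label dict in which just the named traits are True."""
    return {trait: trait in active for trait in TRAITS}


# Hypothetical toy data, invented purely for illustration.
emails = [
    "lets meet everyone at the party tonight",
    "great party see you all there",
    "please find the report attached",
    "the report is due on friday",
]
labels = [
    only("extraversion"),
    only("extraversion", "agreeableness"),
    only("conscientiousness"),
    only("conscientiousness"),
]

models = train_trait_models(emails, labels)
print(predict_profile(models, "party tonight"))
```

Training one classifier per trait keeps the sketch short but ignores correlations between traits, which is one of the modeling assumptions a real system would need to examine.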

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Shen, J., Brdiczka, O., Liu, J. (2013). Understanding Email Writers: Personality Prediction from Email Messages. In: Carberry, S., Weibelzahl, S., Micarelli, A., Semeraro, G. (eds) User Modeling, Adaptation, and Personalization. UMAP 2013. Lecture Notes in Computer Science, vol 7899. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38844-6_29

  • DOI: https://doi.org/10.1007/978-3-642-38844-6_29

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-38843-9

  • Online ISBN: 978-3-642-38844-6

  • eBook Packages: Computer Science, Computer Science (R0)