Psychological Text Analysis in the Digital Humanities

  • Ryan L. Boyd
Part of the Multimedia Systems and Applications book series (MMSA)


In the digital humanities, it has been particularly difficult to establish the psychological properties of a person or group of people in an objective, reliable manner. Traditionally, attempts to understand an author’s psychological makeup have relied primarily (if not exclusively) on subjective interpretation, qualitative analysis, and speculation. In the world of empirical psychological research, however, the past two decades have witnessed an explosion of computerized language analysis techniques that objectively measure psychological features of the individual. Indeed, by using modern text analysis methods, it is now possible to quickly and accurately extract information about people (their personalities, individual differences, social processes, and even mental health), all through the words that they write and speak. This chapter serves as a primer for researchers interested in learning how language can provide powerful insights into the minds of others via well-established and easy-to-use psychometric methods. First, the chapter provides a general background on language analysis in the field of psychology, followed by an introduction to modern methods and developments in psychological text analysis. Finally, it lays a solid foundation in psychological text analysis through an overview of research spanning hundreds of studies from labs all over the world.


Keywords: Childhood Sexual Abuse; Emotion Word; Content Word; Function Word; Language Pattern
These keywords were added by machine and not by the authors.



Preparation of this chapter was aided by grants from the National Institutes of Health (5R01GM112697-02), the John Templeton Foundation (#48503), and the National Science Foundation (IIS-1344257). The views, opinions, and findings contained in this chapter are those of the author and should not be construed as the position, policy, or decision of the aforementioned agencies, unless so designated by other documents. The author would like to thank Elisavet Makridis, Natalie M. Peluso, James W. Pennebaker, and the anonymous reviewers for their helpful feedback on earlier versions of this chapter.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Department of Psychology, The University of Texas at Austin, Austin, USA
