Journal of Business Economics, Volume 89, Issue 1, pp 79–123

Content analysis of business communication: introducing a German dictionary

  • Christina Bannier (corresponding author)
  • Thomas Pauls
  • Andreas Walter
Original Paper

Abstract

Computer-aided text analysis has attracted considerable attention in recent years. Applied to business communication such as earnings announcements, analyst reports, and IPO prospectuses, it has been used to extract information relevant to financial market participants. Many studies employ dictionary-based approaches that rely on specific word lists. Because these lists have been compiled predominantly for the English language, the corresponding analyses have focused on English business texts. To extend content analysis to other languages, we create a German dictionary designed to measure the textual sentiment of business communication. Our dictionary is based on the English dictionary by Loughran and McDonald (J Finance 66:35–65. https://doi.org/10.1111/j.1540-6261.2010.01625.x, 2011), which is commonly used for examining finance- and accounting-specific texts. We discuss the set-up of our dictionary and test its quality extensively. We further compare our dictionary to German general-language dictionaries and to a machine-learning procedure, and we provide evidence of its ability to capture market-relevant textual sentiment in German business communication.
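To illustrate what a dictionary-based approach generally involves, the following is a minimal sketch that counts word-list hits in a German text and turns them into a tone score. The word lists, the function name `tone`, and the scoring formula (positive minus negative hits, divided by total hits) are assumptions chosen for illustration; they are not taken from the authors' dictionary or from their method.

```python
import re

# Hypothetical toy word lists for illustration only;
# these are NOT entries from the authors' German dictionary.
POSITIVE = {"gewinn", "wachstum", "erfolgreich"}
NEGATIVE = {"verlust", "rückgang", "risiko"}


def tone(text: str) -> float:
    """Return (pos - neg) / (pos + neg), or 0.0 if no list word occurs."""
    # Lowercase the text and keep runs of German letters as tokens.
    tokens = re.findall(r"[a-zäöüß]+", text.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / (pos + neg) if pos + neg else 0.0


if __name__ == "__main__":
    print(tone("Der Gewinn stieg trotz Risiko, das Wachstum war erfolgreich."))
```

The sketch matches exact word forms only. A practical word list for German would also have to deal with inflected forms, for example by listing them explicitly or by stemming.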

Keywords

Text analysis · Content analysis · Textual sentiment · Business communication · Annual reports

JEL Classification

G02 · G12 · G14

References

  1. Allee KD, Deangelis MD (2015) The structure of voluntary disclosure narratives: evidence from tone dispersion. J Account Res 53:241–274. https://doi.org/10.1111/1475-679X.12072
  2. Ammann M, Schaub N (2016) Social interaction and investing: evidence from an online social trading network. Working Paper
  3. Antons D, Breidbach CF (2018) Big data, big insights? Advancing service innovation and design with machine learning. J Serv Res 21:17–39. https://doi.org/10.1177/1094670517738373
  4. Antons D, Kleer R, Salge TO (2016) Mapping the topic landscape of JPIM, 1984–2013: in search of hidden structures and development trajectories. J Prod Innov Manag 33:726–749. https://doi.org/10.1111/jpim.12300
  5. Antweiler W, Frank MZ (2004) Is all that talk just noise? The information content of internet stock message boards. J Finance 59:1259–1294. https://doi.org/10.1111/j.1540-6261.2004.00662.x
  6. Arslan-Ayaydin Ö, Boudt K, Thewissen J (2015) Managers set the tone: equity incentives and the tone of earnings press releases. J Bank Finance 72:132–147. https://doi.org/10.1016/j.jbankfin.2015.10.007
  7. Baker M, Wurgler J (2006) Investor sentiment and the cross-section of stock returns. J Finance 61:1645–1680. https://doi.org/10.1111/j.1540-6261.2006.00885.x
  8. Bannier CE, Pauls T, Walter A (2017) CEO-Speeches and stock returns. Working Paper. https://doi.org/10.2139/ssrn.2869785
  9. Bao Y, Datta A (2014) Simultaneously discovering and quantifying risk types from textual risk disclosures. Manage Sci 60:1371–1391. https://doi.org/10.1287/mnsc.2014.1930
  10. Blair C, Cole SR (2002) Two-sided equivalence testing of the difference between two means. J Mod Appl Stat Methods 1:139–142. https://doi.org/10.22237/jmasm/1020255540
  11. Blei DM, Ng AY, Jordan MI (2003) Latent Dirichlet allocation. J Mach Learn Res 3:993–1022
  12. Boudt K, Thewissen J (2018) Jockeying for position in CEO letters: impression management and sentiment analytics. Financ Manage. https://doi.org/10.1111/fima.12219
  13. Boukus E, Rosenberg JV (2006) The information content of FOMC minutes. Working Paper. https://doi.org/10.2139/ssrn.922312
  14. Brooke J, Tofiloski M, Taboada M (2009) Cross-linguistic sentiment analysis: from English to Spanish. Proc Int Conf RANLP 2009:50–54
  15. Buehlmaier MMM, Whited TM (2018) Are financial constraints priced? Evidence from textual analysis. Rev Financ Stud 31:2693–2728. https://doi.org/10.1093/rfs/hhy007
  16. Caton S, Hall M, Weinhardt C (2015) How do politicians use Facebook? An applied social observatory. Big Data Soc 2:1–18. https://doi.org/10.1177/2053951715612822
  17. Caumanns J (1999) A fast and simple stemming algorithm for German words. http://www.inf.fu-berlin.de/inst/pubs/tr-b-99-16.abstract.html. Accessed 13 Jan 2018
  18. Cicon JE, Ferris SP, Kammel AJ, Noronha G (2012) European corporate governance: a thematic analysis of national codes of governance. Eur Financ Manag 18:620–648. https://doi.org/10.1111/j.1468-036X.2010.00542.x
  19. Das SR, Chen MY (2007) Yahoo! for Amazon: sentiment extraction from small talk on the web. Manage Sci 53:1375–1388. https://doi.org/10.1287/mnsc.1070.0704
  20. Davis AK, Tama-Sweet I (2012) Managers’ use of language across alternative disclosure outlets: earnings press releases versus MD&A. Contemp Account Res 29:804–837. https://doi.org/10.1111/j.1911-3846.2011.01125.x
  21. Davis AK, Piger JM, Sedor LM (2012) Beyond the numbers: measuring the information content of earnings press release language. Contemp Account Res 29:845–868. https://doi.org/10.1111/j.1911-3846.2011.01130.x
  22. Davis AK, Ge W, Matsumoto D, Zhang JL (2015) The effect of manager-specific optimism on the tone of earnings conference calls. Rev Acc Stud 20:639–673. https://doi.org/10.1007/s11142-014-9309-4
  23. Debortoli S, Müller O, Junglas I, Vom Brocke J (2016) Text mining for information systems researchers: an annotated topic modeling tutorial. CAIS 39:110–135. https://doi.org/10.17705/1CAIS.03907
  24. Doran JS, Peterson DR, Price SM (2012) Earnings conference call content and stock price: the case of REITs. J Real Estate Finance Econ 45:402–434. https://doi.org/10.1007/s11146-010-9266-z
  25. Eickhoff M, Muntermann J (2016a) How to conquer information overload? Supporting financial decisions by identifying relevant conference call topics. In: PACIS 2016 Proceedings
  26. Eickhoff M, Muntermann J (2016b) They talk but what do they listen to? Analyzing financial analysts’ information processing using latent Dirichlet allocation. In: PACIS 2016 Proceedings
  27. Eickhoff M, Neuss N (2017) Topic modelling methodology: its use in information systems and other managerial disciplines. In: Proceedings of the 25th European conference on information systems, Guimarães, Portugal, pp 1327–1347
  28. Engelberg J (2008) Costly information processing: evidence from earnings announcements. Working Paper. https://doi.org/10.2139/ssrn.1107998
  29. Feldman R, Govindaraj S, Livnat J, Segal B (2008) The incremental information content of tone change in management discussion and analysis. Working Paper
  30. Ferris SP, Hao Q, Liao M-Y (2013) The effect of issuer conservatism on IPO pricing and performance. Rev Finance 17:993–1027. https://doi.org/10.1093/rof/rfs018
  31. Feuerriegel S, Ratku A, Neumann D (2016) Analysis of how underlying topics in financial news affect stock prices using latent Dirichlet allocation. In: 49th Hawaii international conference on system sciences (HICSS 2016), pp 1072–1081. https://doi.org/10.1109/hicss.2016.137
  32. Frazier KB, Ingram RW, Tennyson BM (1984) A methodology for the analysis of narrative accounting disclosures. J Account Res 22:318–331. https://doi.org/10.2307/2490713
  33. Gamache DL, McNamara G, Mannor MJ, Johnson RE (2015) Motivated to acquire? The impact of CEO regulatory focus on firm acquisitions. Acad Manag J 58:1261–1282. https://doi.org/10.5465/amj.2013.0377
  34. García D (2013) Sentiment during recessions. J Finance 68:1267–1300. https://doi.org/10.1111/jofi.12027
  35. Giorgi S, Weber K (2015) Marks of distinction: framing and audience appreciation in the context of investment advice. Adm Sci Q 60:333–367. https://doi.org/10.1177/0001839215571125
  36. González M, Guzmán A, Téllez D, Trujillo M-A (2016) What do you say and how do you say it: information disclosure in Latin American firms. Working Paper. https://doi.org/10.2139/ssrn.2929833
  37. Griffin PA (2003) Got information? Investor response to form 10-K and form 10-Q EDGAR filings. Rev Acc Stud 8:433–460. https://doi.org/10.1023/A:1027351630866
  38. Grimmer J, Stewart BM (2013) Text as data: the promise and pitfalls of automatic content analysis methods for political texts. Polit Anal 21:267–297. https://doi.org/10.1093/pan/mps028
  39. Haselmayer M, Jenny M (2016) Sentiment analysis of political communication: combining a dictionary approach with crowdcoding. Qual Quant 51:2623–2646. https://doi.org/10.1007/s11135-016-0412-4
  40. Hawkins JA (2015) A comparative typology of English and German: unifying the contrasts, 1st edn., Croom Helm, London, 1986. Routledge library editions: English Language, vol 10. Routledge, London
  41. Henry E (2008) Are investors influenced by how earnings press releases are written? J Bus Commun 45:363–407. https://doi.org/10.1177/0021943608319388
  42. Henry E, Leone AJ (2016) Measuring qualitative information in capital markets research: comparison of alternative methodologies to measure disclosure tone. Account Rev 91:153–178. https://doi.org/10.2308/accr-51161
  43. Heston SL, Sinha NR (2016) News versus sentiment: predicting stock returns from news stories. FEDS 2016:1–35. https://doi.org/10.17016/feds.2016.048
  44. Hillert A, Jacobs H, Müller S (2014) Media makes momentum. Rev Financ Stud 27:3467–3501. https://doi.org/10.1093/rfs/hhu061
  45. Hillert A, Niessen-Ruenzi A, Ruenzi S (2016) Mutual fund shareholder letters: flows, performance, and managerial behavior. Working Paper. https://doi.org/10.2139/ssrn.2524610
  46. Huang X, Teoh SH, Zhang Y (2014a) Tone management. Account Rev 89:1083–1113. https://doi.org/10.2308/accr-50684
  47. Huang AH, Zang AY, Zheng R (2014b) Evidence on the information content of text in analyst reports. Account Rev 89:2151–2180. https://doi.org/10.2308/accr-50833
  48. Huang AH, Lehavy R, Zang AY, Zheng R (2017) Analyst information discovery and interpretation roles: a topic modeling approach. Manage Sci 64:2833–2855. https://doi.org/10.1287/mnsc.2017.2751
  49. Iliev R, Sagi E, Dehghani M (2015) Automated text analysis in psychology: methods, applications, and future developments. Lang Cogn 7:265–290. https://doi.org/10.1017/langcog.2014.30
  50. Jacobi C, Kleinen-von Königslöw K, Ruigrok N (2016) Political news in online and print newspapers. Dig J 4:723–742. https://doi.org/10.1080/21670811.2015.1087810
  51. Jandl J-O, Feuerriegel S, Neumann D (2014) Long- and short-term impact of news messages on house prices: a comparative study of Spain and the United States. In: Thirty fifth international conference on information systems (Auckland), pp 1–18
  52. Jegadeesh N, Wu D (2013) Word power: a new approach for content analysis. J Financ Econ 110:712–729. https://doi.org/10.1016/j.jfineco.2013.08.018
  53. Kaji N, Kitsuregawa M (2007) Building lexicon for sentiment analysis from massive collection of HTML documents. In: Proceedings of the 2007 joint conference on empirical methods in natural language processing and computational natural language learning, pp 1075–1083
  54. Kaplan S, Vakili K (2015) The double-edged sword of recombination in breakthrough innovation. Strateg Manag J 36:1435–1457. https://doi.org/10.1002/smj.2294
  55. Kearney C, Liu S (2014) Textual sentiment in finance: a survey of methods and models. Int Rev Financ Anal 33:171–185. https://doi.org/10.1016/j.irfa.2014.02.006
  56. Kirchhoff KR, Piwinger M (2009) Praxishandbuch investor relations. Gabler, Wiesbaden. https://doi.org/10.1007/978-3-8349-8810-2
  57. König E, Gast V (2012) Understanding English-German contrasts, vol 29, 3rd edn. Grundlagen der Anglistik und Amerikanistik. Schmidt, Berlin
  58. Larcker DF, Zakolyukina AA (2012) Detecting deceptive discussions in conference calls. J Account Res 50:495–540. https://doi.org/10.1111/j.1475-679X.2012.00450.x
  59. Lee H, Kang P (2017) Identifying core topics in technology and innovation management studies: a topic model approach. J Technol Transf. https://doi.org/10.1007/s10961-017-9561-4
  60. Lee S, Song J, Kim Y (2010) An empirical comparison of four text mining methods. In: 43rd Hawaii international conference on system sciences (HICSS), Honolulu, Hawaii, 5–8 Jan 2010, pp 1–10. https://doi.org/10.1109/hicss.2010.48
  61. Li F (2010) The information content of forward-looking statements in corporate filings—a naïve Bayesian machine learning approach. J Account Res 48:1049–1102. https://doi.org/10.1111/j.1475-679X.2010.00382.x
  62. Loughran T, McDonald B (2011) When is a liability not a liability? Textual analysis, dictionaries, and 10-Ks. J Finance 66:35–65. https://doi.org/10.1111/j.1540-6261.2010.01625.x
  63. Loughran T, McDonald B (2015) The use of word lists in textual analysis. J Behav Finance 16:1–11. https://doi.org/10.1080/15427560.2015.1000335
  64. Loughran T, McDonald B (2016) Textual analysis in accounting and finance: a survey. J Account Res 54:1187–1230. https://doi.org/10.1111/1475-679X.12123
  65. Manning CD, Schütze H (1999) Foundations of statistical natural language processing. MIT Press, Cambridge
  66. Mengelkamp A, Hobert S, Schumann M (2015) Corporate credit risk analysis utilizing textual user generated content—a Twitter based feasibility study. Working Paper
  67. Mengelkamp A, Schumann M, Wolf S (2016) Data driven creation of sentiment dictionaries for corporate credit risk analysis. In: Proceedings of the 22nd Americas conference on information systems (AMCIS), pp 1–8
  68. Merz L (2012) Langenscheidt Routledge Fachwörterbuch kompakt Wirtschaft Englisch: Englisch-deutsch; deutsch-englisch = Langenscheidt Routledge dictionary of business concise edition English, 4th edn. Langenscheidt Fachwörterbücher, Langenscheidt, Berlin
  69. Molina-González MD, Martínez-Cámara E, Martín-Valdivia M-T, Perea-Ortega JM (2013) Semantic orientation for polarity classification in Spanish reviews. Expert Syst Appl 40:7250–7257. https://doi.org/10.1016/j.eswa.2013.06.076
  70. Pennebaker JW, Boyd RL, Jordan K, Blackburn K (2015) The development and psychometric properties of LIWC2015. https://doi.org/10.15781/t29g6z
  71. Porter MF (1980) An algorithm for suffix stripping. Program 14:130–137. https://doi.org/10.1108/eb046814
  72. Price SM, Doran JS, Peterson DR, Bliss BA (2012) Earnings conference calls and stock returns: the incremental informativeness of textual tone. J Bank Finance 36:992–1011. https://doi.org/10.1016/j.jbankfin.2011.10.013
  73. Ramírez-Esparza N, Pennebaker JW, García FA, Suriá Martínez R (2007) La psicología del uso de las palabras: un programa de computadora que analiza textos en español. Revista Mexicana de Psicología 24:85–99
  74. Remus R, Quasthoff U, Heyer G (2010) SentiWS—a publicly available German-language resource for sentiment analysis. LREC 2010
  75. Renault T (2017) Intraday online investor sentiment and return patterns in the U.S. stock market. J Bank Finance 84:25–40. https://doi.org/10.1016/j.jbankfin.2017.07.002
  76. Rushdi-Saleh M, Martín-Valdivia MT, Ureña-López LA, Perea-Ortega JM (2011) OCA: opinion corpus for Arabic. J Am Soc Inform Sci Technol 62:2045–2054. https://doi.org/10.1002/asi.21598
  77. Shrout PE, Fleiss JL (1979) Intraclass correlations: uses in assessing rater reliability. Psychol Bull 86:420–428. https://doi.org/10.1037/0033-2909.86.2.420
  78. Sinha NR (2016) Underreaction to news in the US stock market. Q J Finance 6:1–46. https://doi.org/10.1142/S2010139216500051
  79. Stone PJ, Dunphy DC, Smith MS (1966) The general inquirer: a computer approach to content analysis. MIT Press, Cambridge
  80. Tan S, Zhang J (2008) An empirical study of sentiment analysis for Chinese documents. Expert Syst Appl 34:2622–2629. https://doi.org/10.1016/j.eswa.2007.05.028
  81. Tetlock PC (2007) Giving content to investor sentiment: the role of media in the stock market. J Finance 62:1139–1168. https://doi.org/10.1111/j.1540-6261.2007.01232.x
  82. Tetlock PC, Saar-Tsechansky M, Macskassy S (2008) More than words: quantifying language to measure firms’ fundamentals. J Finance 63:1437–1467. https://doi.org/10.1111/j.1540-6261.2008.01362.x
  83. Tirunillai S, Tellis GJ (2014) Mining marketing meaning from online chatter: strategic brand analysis of big data using latent Dirichlet allocation. J Mark Res 51:463–479. https://doi.org/10.1509/jmr.12.0106
  84. Twedt B, Rees L (2012) Reading between the lines: an empirical examination of qualitative attributes of financial analysts’ reports. J Account Public Policy 31:1–21. https://doi.org/10.1016/j.jaccpubpol.2011.10.010
  85. Wan X (2008) Using bilingual knowledge and ensemble techniques for unsupervised Chinese sentiment analysis. In: Proceedings of the 2008 conference on empirical methods in natural language processing (Honolulu, Hawaii, 2008), pp 553–561
  86. Wang X, Bendle NT, Mai F, Cotte J (2015) The Journal of Consumer Research at 40: a historical analysis. J Consum Res 42:5–18. https://doi.org/10.1093/jcr/ucv009
  87. Wolf M, Horn AB, Mehl MR, Haug S, Pennebaker JW, Kordy H (2008) Computergestützte quantitative Textanalyse. Diagnostica 54:85–98. https://doi.org/10.1026/0012-1924.54.2.85
  88. Zehe A, Becker M, Hettinger L, Hotho A, Reger I (2016) Prediction of happy endings in German novels based on sentiment information. In: Proceedings of the workshop on interactions between data mining and natural language processing 2016, pp 9–16
  89. Zijlstra H, van Meerveld T, van Middendorp H (2004) De Nederlandse versie van de ‘Linguistic Inquiry and Word Count’ (LIWC). Gedrag Gezondheid 32:271–281

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  • Christina Bannier (1, corresponding author)
  • Thomas Pauls (2)
  • Andreas Walter (3)
  1. Chair of Banking and Finance, Justus-Liebig-University Giessen, Giessen, Germany
  2. House of Finance, Goethe University Frankfurt, Frankfurt, Germany
  3. Chair of Financial Services, Justus-Liebig-University Giessen, Giessen, Germany