Using Topic Modelling to Improve Prediction of Financial Report Commentary Classes

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12109)

Abstract

We consider the task of predicting the class of commentaries associated with financial discrepancies between actual and estimated sales data. Such analysis of financial data helps a company meet its targets and assess its overall performance. Writing a commentary and assigning its class are manual tasks performed by an analyst, so the resulting class labels can be erroneous, and wrong labels can in turn degrade the performance of the prediction model. Accordingly, we propose using topic modelling, namely Latent Dirichlet Allocation (LDA), to extract the classes of the commentaries automatically. In addition, we use feature selection strategies to improve the accuracy of the prediction models. Our analysis with various time series classification methods points to improved performance due to LDA and feature selection.
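
To make the idea of extracting commentary classes with LDA concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a small collapsed Gibbs sampler written in plain Python, a handful of invented commentaries, two topics, and illustrative hyperparameters (alpha, beta, n_iter); none of these come from the paper. Each commentary is labelled with its dominant topic, which then stands in for the analyst-assigned class.

```python
import random


def lda_gibbs(docs, n_topics, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative, not the paper's code).

    docs is a list of token lists. Returns the estimated topic
    proportions of each document.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]             # tokens per (doc, topic)
    topic_word = [[0] * n_words for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                            # tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token from the counts before resampling it.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Unnormalised conditional probability of each topic.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed topic proportions per document.
    return [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]


# Hypothetical commentaries, invented for illustration only.
commentaries = [
    "promotion delayed sales below forecast",
    "promotion cancelled sales below forecast",
    "new customer order sales above forecast",
    "large customer order sales above forecast",
]
docs = [c.split() for c in commentaries]
theta = lda_gibbs(docs, n_topics=2)

# The dominant topic of each commentary serves as its extracted class.
labels = [max(range(len(t)), key=t.__getitem__) for t in theta]
```

In a setting like the one described above, the extracted labels would replace the manually assigned classes as targets for a downstream time series classifier. The number of topics is a modelling choice that has to be tuned for the data at hand.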

Acknowledgement

This work is supported by the Smart Computing for Innovation (SOSCIP) consortium, Toronto, Canada.

Author information

Corresponding author

Correspondence to Mucahit Cevik.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

El Mokhtari, K., Cevik, M., Başar, A. (2020). Using Topic Modelling to Improve Prediction of Financial Report Commentary Classes. In: Goutte, C., Zhu, X. (eds) Advances in Artificial Intelligence. Canadian AI 2020. Lecture Notes in Computer Science, vol 12109. Springer, Cham. https://doi.org/10.1007/978-3-030-47358-7_19

  • DOI: https://doi.org/10.1007/978-3-030-47358-7_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-47357-0

  • Online ISBN: 978-3-030-47358-7

  • eBook Packages: Computer Science, Computer Science (R0)