Segment Information Extraction from Financial Annual Reports Using Neural Network

  • Conference paper
Advances in Artificial Intelligence (JSAI 2019)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1128)

Abstract

This paper extends a selected paper from JSAI 2019. Automatically extracting business content from financial reports is an important problem in finance. Segment names and their explanations are particularly important content to extract, yet no method for extracting them from financial reports has been established. In this study, we aim to develop a practical solution for extracting this information. To this end, we built a manually annotated dataset for the task of extracting each company's segment names and their explanations from financial reports, and then developed a recurrent neural network model for the task. Trained on this dataset, our method outperformed the baseline methods at extracting segment names and their explanations from annual financial reports. We also demonstrated experimentally that our method remains applicable to this task even when only a small training dataset is available. This is the first work to apply a machine learning method to the task of extracting segment names and their explanations. The insights from this work should be valuable to industry.

Acknowledgment

This work was supported in part by JSPS KAKENHI Grant Number JP17J04768.

Corresponding author

Correspondence to Tomoki Ito.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Ito, T., Sakaji, H., Izumi, K. (2020). Segment Information Extraction from Financial Annual Reports Using Neural Network. In: Ohsawa, Y., et al. Advances in Artificial Intelligence. JSAI 2019. Advances in Intelligent Systems and Computing, vol 1128. Springer, Cham. https://doi.org/10.1007/978-3-030-39878-1_20
