Neural Machine Translation for Financial Listing Documents

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11305)

Abstract

In this paper, we develop a Neural Machine Translation (NMT) system for English-to-Traditional-Chinese translation of the financial prospectuses of companies seeking listing on the Hong Kong Stock Exchange. To the best of our knowledge, this is the first work on NMT for this specific domain. We propose a domain-specific NMT system that introduces a domain flag to indicate the target-side domain. By training the NMT model on data from both the IPO corpus and a general-domain corpus, we expand the vocabulary while capturing common writing styles and sentence structures. Our experimental results show that the proposed NMT system achieves a significant improvement in translating IPO documents. More importantly, in a blind assessment by an expert translator, our system outperforms two mainstream commercial tools, Google Translate and SDL Trados, on some IPO documents.
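The domain-flag idea in the abstract can be illustrated with a minimal sketch. It assumes the flag is a special token prepended to each source sentence, as is common in tag-based domain control; the abstract does not state the token strings or their placement, so the names `<2ipo>` and `<2general>` and the helper functions below are hypothetical.

```python
import random
from typing import Iterable, List, Tuple

Pair = Tuple[str, str]  # (English source sentence, Traditional Chinese target sentence)

# Hypothetical flag tokens: the abstract does not give the actual strings.
DOMAIN_FLAGS = {"ipo": "<2ipo>", "general": "<2general>"}


def tag_pairs(pairs: Iterable[Pair], domain: str) -> List[Pair]:
    """Prepend the domain flag to every source sentence; targets are unchanged."""
    flag = DOMAIN_FLAGS[domain]
    return [(f"{flag} {src}", tgt) for src, tgt in pairs]


def build_training_set(ipo: Iterable[Pair], general: Iterable[Pair],
                       seed: int = 0) -> List[Pair]:
    """Merge the tagged IPO and general-domain pairs into one shuffled training set."""
    mixed = tag_pairs(ipo, "ipo") + tag_pairs(general, "general")
    random.Random(seed).shuffle(mixed)  # fixed seed keeps the order reproducible
    return mixed


if __name__ == "__main__":
    ipo = [("The Company was incorporated in the Cayman Islands.", "本公司於開曼群島註冊成立。")]
    general = [("Good morning.", "早安。")]
    for src, tgt in build_training_set(ipo, general):
        print(src, "|||", tgt)
```

Under this assumption, a sentence to be translated in the IPO style would carry the same `<2ipo>` token at inference time, so a single model trained on both corpora can be steered toward the prospectus register.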

Notes

  1. https://www.hkex.com.hk/

  2. https://github.com/BYVoid/OpenCC

  3. https://github.com/fxsjy/jieba

  4. https://github.com/moses-smt/mosesdecoder/blob/master/scripts/tokenizer/tokenizer.perl

  5. https://github.com/google/sentencepiece

  6. https://github.com/tensorflow/tensor2tensor

  7. https://translate.google.com

  8. https://www.freetranslation.com

Acknowledgments

The work described in this paper was partially supported by the Research Grants Council of the Hong Kong Special Administrative Region, China (Project No. UGC/IDS14/16).

Corresponding author

Correspondence to Linkai Luo.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Luo, L., Yang, H., Siu, S.C., Chin, F.Y.L. (2018). Neural Machine Translation for Financial Listing Documents. In: Cheng, L., Leung, A., Ozawa, S. (eds) Neural Information Processing. ICONIP 2018. Lecture Notes in Computer Science, vol 11305. Springer, Cham. https://doi.org/10.1007/978-3-030-04221-9_21

  • DOI: https://doi.org/10.1007/978-3-030-04221-9_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-04220-2

  • Online ISBN: 978-3-030-04221-9

  • eBook Packages: Computer Science, Computer Science (R0)
