Thai Named-Entity Recognition Using Variational Long Short-Term Memory with Conditional Random Field

  • Can Udomcharoenchaikit
  • Peerapon Vateekul
  • Prachya Boonkwan
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 807)


Thai Named-Entity Recognition (NER) is a difficult task due to characteristics of the Thai language, such as the lack of a special character that separates named entities from other word types. Previous Thai NER systems depend heavily on human knowledge in the form of feature selection and external resources such as dictionaries. A recent trend in NER research shows that deep learning approaches can be used to train high-quality NER systems without resorting to these external resources. In this paper, we present a deep learning model that combines recurrent neural networks with a probabilistic graphical model, as well as a variational inference-based dropout approach. We benchmark our model on one of the largest Thai corpora, “BEST 2010”. Our model outperforms all baseline methods without relying on extra manually annotated resources or external knowledge.
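
The variational inference-based dropout mentioned above is commonly understood as reusing one dropout mask across all time steps of a sequence, rather than sampling a new mask at each step. The following is a minimal sketch of that idea on plain Python lists, not the authors' implementation; the function names, the toy dimensions, and the drop probability of 0.5 are assumptions made for illustration, and the sketch masks only the inputs, not the recurrent connections.

```python
import random

def sample_mask(size, drop_prob, rng):
    """Sample one inverted-dropout mask: 0 for dropped units, 1/(1-p) for kept units."""
    keep = 1.0 - drop_prob
    return [(1.0 / keep) if rng.random() < keep else 0.0 for _ in range(size)]

def variational_dropout(sequence, drop_prob, rng):
    """Apply the SAME mask to every time step of one sequence."""
    mask = sample_mask(len(sequence[0]), drop_prob, rng)
    dropped = [[x * m for x, m in zip(step, mask)] for step in sequence]
    return dropped, mask

def standard_dropout(sequence, drop_prob, rng):
    """For contrast: sample a fresh mask at each time step."""
    return [
        [x * m for x, m in zip(step, sample_mask(len(step), drop_prob, rng))]
        for step in sequence
    ]

rng = random.Random(0)
seq = [[1.0] * 6 for _ in range(4)]  # toy input: 4 time steps, 6 units each
dropped, mask = variational_dropout(seq, 0.5, rng)
per_step = standard_dropout(seq, 0.5, rng)
```

Because the toy input is all ones, every row of `dropped` equals the shared mask, whereas the rows of `per_step` are drawn independently and will generally differ from one another.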


Named-Entity Recognition, Natural Language Processing, Deep learning



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Can Udomcharoenchaikit (1), email author
  • Peerapon Vateekul (1)
  • Prachya Boonkwan (2)
  1. Faculty of Engineering, Department of Computer Engineering, Chulalongkorn University, Bangkok, Thailand
  2. NECTEC, Language and Semantic Technology Lab (LST), Pathumthani, Thailand
