Abstract
Named entity recognition (NER) for written documents has been studied intensively over the past decades. However, NER for spoken texts is still at an early stage. Several challenges account for this: spoken texts are usually less grammatical, entirely in lowercase, and often lack punctuation marks; continuous chunks such as email addresses and hyperlinks are transcribed as discrete tokens; and numeric expressions are sometimes transcribed in alphabetic form. These characteristics are real obstacles for spoken text understanding. In this paper, we propose a lightweight machine learning model for NER on Vietnamese spoken texts that aims to overcome those problems. To make the model robust, we incorporated a variety of rich features, including sophisticated regular expressions and various look-up dictionaries. Unlike previous work on NER, our model does not rely on word boundary or part-of-speech information, which is expensive and time-consuming to prepare. We conducted a careful evaluation on a medium-sized dataset of mobile voice interactions and achieved an average \(F_1\) of 92.06. This is a significant result for such a difficult task. In addition, we kept the model compact and fast so that it can be integrated into a mobile virtual assistant for Vietnamese.
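The feature design mentioned in the abstract (regular expressions plus look-up dictionaries, with no word segmentation or part-of-speech tags) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the pattern names, the dictionary entries, the feature naming scheme, and the `token_features` function are hypothetical and are not taken from the paper. Tokens are plain whitespace-separated syllables, and the classifier that would consume the features is left unspecified.

```python
import re

# Hypothetical patterns; the paper's actual regular expressions are not
# given in the abstract.
PATTERNS = {
    "all_digits": re.compile(r"\d+"),
    "clock_time": re.compile(r"\d{1,2}:\d{2}"),
    "domain_suffix": re.compile(r"(com|net|org|vn)"),
}

# Hypothetical look-up dictionaries (tiny stand-ins for real gazetteers).
DICTIONARIES = {
    "number_word": {"một", "hai", "ba", "bốn", "năm"},
    "location_syllable": {"hà", "nội", "huế"},
}


def token_features(tokens, i, window=1):
    """Return a sorted list of string features for tokens[i].

    Features are drawn from the token itself and its neighbours within
    `window` positions. No word boundaries or POS tags are required.
    """
    feats = set()
    for offset in range(-window, window + 1):
        j = i + offset
        if j < 0 or j >= len(tokens):
            # Mark positions that fall outside the utterance.
            feats.add(f"pad[{offset}]")
            continue
        tok = tokens[j]
        feats.add(f"tok[{offset}]={tok}")
        # Regular-expression features: fire when the whole token matches.
        for name, pattern in PATTERNS.items():
            if pattern.fullmatch(tok):
                feats.add(f"re:{name}[{offset}]")
        # Dictionary features: fire when the token is a known entry.
        for name, entries in DICTIONARIES.items():
            if tok in entries:
                feats.add(f"dict:{name}[{offset}]")
    return sorted(feats)


if __name__ == "__main__":
    utterance = "hai gmail com".split()
    for position in range(len(utterance)):
        print(utterance[position], token_features(utterance, position))
```

Each token's feature list could then be passed to any sequence or per-token classifier. Keeping the features as plain strings built from cheap regex and set lookups is one way to keep such a model small enough for on-device use.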
Notes
- 1. Microsoft Skype Translator and AT&T Speech-to-Speech Translation.
- 2.
- 3. JTextPro: http://jtextpro.sourceforge.net.
Acknowledgment
This work was supported by the project QG.15.29 from Vietnam National University, Hanoi (VNU).
Copyright information
© 2016 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tran, PN., Ta, VD., Truong, QT., Duong, QV., Nguyen, TT., Phan, XH. (2016). Named Entity Recognition for Vietnamese Spoken Texts and Its Application in Smart Mobile Voice Interaction. In: Nguyen, N.T., Trawiński, B., Fujita, H., Hong, TP. (eds) Intelligent Information and Database Systems. ACIIDS 2016. Lecture Notes in Computer Science, vol 9621. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-49381-6_17
DOI: https://doi.org/10.1007/978-3-662-49381-6_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-49380-9
Online ISBN: 978-3-662-49381-6
eBook Packages: Computer Science (R0)