Abstract
We describe a simple spoken utterance classification method suitable for data-sparse domains which can be approximately described by context-free grammars (CFGs). The central idea is to perform robust matching of CFG rules against output from a large-vocabulary recogniser, using a dynamic programming method which optimises the tf-idf score of the matched grammar string. We present results of experiments carried out on two domains: a substantial CFG-based medical speech translator and the publicly available Spoken CALL Shared Task. Robust utterance classification using the tf-idf method strongly outperforms plain CFG-based recognition on both domains. Compared with Naive Bayes classifiers trained on data sampled from the CFGs, the tf-idf/dynamic programming method is much better on the complex speech translation domain, but worse on the simple Spoken CALL Shared Task domain.
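The central idea can be illustrated with a deliberately simplified sketch. It is not the algorithm evaluated in the paper. It assumes the grammar has already been expanded into a small set of candidate strings, each paired with a semantic label, whereas the method described above matches against the CFG rules directly. It also stands in a plain idf-weighted longest-common-subsequence alignment for the tf-idf scoring. The names `CANDIDATES`, `idf_weights`, `match_score` and `classify` are hypothetical and chosen only for this example.

```python
import math
from collections import Counter

# Hypothetical grammar strings mapped to semantic labels (illustrative only).
CANDIDATES = {
    "do you have a fever": "fever",
    "do you have a headache": "headache",
    "where is the pain": "pain_location",
}


def idf_weights(strings):
    """Inverse document frequency of each word, one 'document' per string."""
    n = len(strings)
    df = Counter(word for s in strings for word in set(s.split()))
    return {word: math.log(n / count) for word, count in df.items()}


def match_score(recognised, candidate, idf):
    """Idf-weighted longest common subsequence, by dynamic programming.

    Words in the recogniser output that are absent from the candidate
    are skipped at no cost, which is what makes the match robust.
    """
    rec, cand = recognised.split(), candidate.split()
    # table[i][j] holds the best score aligning rec[:i] with cand[:j].
    table = [[0.0] * (len(cand) + 1) for _ in range(len(rec) + 1)]
    for i, r in enumerate(rec, 1):
        for j, c in enumerate(cand, 1):
            best = max(table[i - 1][j], table[i][j - 1])
            if r == c:
                best = max(best, table[i - 1][j - 1] + idf.get(c, 0.0))
            table[i][j] = best
    return table[-1][-1]


def classify(recognised, candidates=CANDIDATES):
    """Return the label of the best-scoring candidate string."""
    idf = idf_weights(list(candidates))
    best = max(candidates, key=lambda c: match_score(recognised, c, idf))
    return candidates[best]


if __name__ == "__main__":
    # Noisy recogniser output with a filler word and a substitution.
    print(classify("um do you have any fever"))
```

In this sketch, words shared by many candidates ("do", "you", "have") receive low weight, while distinctive words ("fever", "pain") dominate the score. The score is not normalised for candidate length, so a practical system would need to handle that as well as grammars too large to enumerate.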
The work described here was funded by the Fondation privée des Hôpitaux universitaires de Genève and Unitec. We would like to thank Pierrette Bouillon for semantic annotation of the test data and many helpful suggestions, and Nuance Inc for generously making their software available to us for research purposes.
Notes
1. https://regulus.unige.ch/spokencallsharedtask, “Results” tab.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Rayner, M., Tsourakis, N., Gerlach, J. (2017). Lightweight Spoken Utterance Classification with CFG, tf-idf and Dynamic Programming. In: Camelin, N., Estève, Y., Martín-Vide, C. (eds) Statistical Language and Speech Processing. SLSP 2017. Lecture Notes in Computer Science, vol 10583. Springer, Cham. https://doi.org/10.1007/978-3-319-68456-7_12
DOI: https://doi.org/10.1007/978-3-319-68456-7_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-68455-0
Online ISBN: 978-3-319-68456-7
eBook Packages: Computer Science, Computer Science (R0)