Lightweight Spoken Utterance Classification with CFG, tf-idf and Dynamic Programming

  • Conference paper
Statistical Language and Speech Processing (SLSP 2017)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10583)

Abstract

We describe a simple spoken utterance classification method suitable for data-sparse domains that can be approximately described by context-free grammars (CFGs). The central idea is to perform robust matching of CFG rules against output from a large-vocabulary recogniser, using a dynamic programming method which optimises the tf-idf score of the matched grammar string. We present results of experiments carried out on a substantial CFG-based medical speech translator and the publicly available Spoken CALL Shared Task. Robust utterance classification using the tf-idf method strongly outperforms plain CFG-based recognition for both domains. Compared with Naive Bayes classifiers trained on data sampled from the CFGs, the tf-idf/dynamic programming method is much better on the complex speech translation domain, but worse on the simple Spoken CALL Shared Task domain.
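
The matching idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each class is represented by a handful of strings enumerated or sampled from the grammar (the paper matches CFG rules directly), treats each class as one document for tf-idf, and uses a weighted longest-common-subsequence alignment as the dynamic programming step. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def tfidf_by_class(classes):
    """Return {label: {word: tf-idf weight}}, treating each class as one document.

    `classes` maps a label to example strings (assumed here to come from the CFG).
    """
    n_classes = len(classes)
    counts = {label: Counter(w for s in strings for w in s.split())
              for label, strings in classes.items()}
    # Document frequency: in how many classes does each word occur?
    doc_freq = Counter(w for c in counts.values() for w in c)
    weights = {}
    for label, c in counts.items():
        total = sum(c.values())
        weights[label] = {w: (k / total) * math.log(n_classes / doc_freq[w])
                          for w, k in c.items()}
    return weights


def match_score(grammar_words, recognised_words, weight):
    """Best in-order alignment score (weighted LCS) via dynamic programming."""
    rows, cols = len(grammar_words), len(recognised_words)
    best = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            # Skip a grammar word or a recognised word ...
            best[i][j] = max(best[i - 1][j], best[i][j - 1])
            # ... or match the two words and collect the word's weight.
            if grammar_words[i - 1] == recognised_words[j - 1]:
                gain = weight.get(grammar_words[i - 1], 0.0)
                best[i][j] = max(best[i][j], best[i - 1][j - 1] + gain)
    return best[rows][cols]


def classify(recognised, classes):
    """Pick the class whose best-matching string has the highest score."""
    weights = tfidf_by_class(classes)
    words = recognised.split()
    scored = {label: max(match_score(s.split(), words, weights[label])
                         for s in strings)
              for label, strings in classes.items()}
    return max(scored, key=scored.get)


# Toy data, invented for illustration only.
toy_classes = {
    "fever": ["do you have a fever", "have you got a fever"],
    "pain": ["do you have pain", "where is the pain"],
}
print(classify("um do you have any fever today", toy_classes))
```

In this toy setup, words shared by every class (such as "do", "you", "have") receive zero weight, so the decision rests on the class-specific words that survive in the recogniser output, which is the intuition behind scoring matches by tf-idf rather than by raw word overlap.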

The work described here was funded by the Fondation privée des Hôpitaux universitaires de Genève and Unitec. We would like to thank Pierrette Bouillon for semantic annotation of the test data and many helpful suggestions, and Nuance Inc for generously making their software available to us for research purposes.

Notes

  1. https://regulus.unige.ch/spokencallsharedtask, “Results” tab.

Corresponding author

Correspondence to Manny Rayner.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Rayner, M., Tsourakis, N., Gerlach, J. (2017). Lightweight Spoken Utterance Classification with CFG, tf-idf and Dynamic Programming. In: Camelin, N., Estève, Y., Martín-Vide, C. (eds) Statistical Language and Speech Processing. SLSP 2017. Lecture Notes in Computer Science, vol 10583. Springer, Cham. https://doi.org/10.1007/978-3-319-68456-7_12

  • DOI: https://doi.org/10.1007/978-3-319-68456-7_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-68455-0

  • Online ISBN: 978-3-319-68456-7

  • eBook Packages: Computer Science (R0)
