ATVS-CSLT-HCTLab System for NIST 2013 Open Keyword Search Evaluation

  • Javier Tejedor
  • Doroteo T. Toledano
  • Dong Wang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8854)

Abstract

This paper presents the ATVS-CSLT-HCTLab spoken term detection (STD) system submitted to the NIST 2013 Open Keyword Search evaluation. The evaluation task is to search for a list of query terms in Vietnamese conversational speech. Our submission comprises an automatic speech recognition (ASR) subsystem, which converts speech signals into word and phone lattices, and an STD subsystem, which indexes the lattices and searches them for query terms. The system is a hybrid: a word-based system searches for in-vocabulary (INV) terms, and a phone-based system searches for out-of-vocabulary (OOV) terms. Term-dependent discriminative confidence estimation is used to score the confidence of each detection. Although the ASR performance is not state of the art, our submission achieves moderate STD performance in the evaluation.
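The hybrid routing described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `is_in_vocabulary` and `route_terms` are hypothetical, and the rule that a multi-word term counts as INV only when every word is in the ASR vocabulary is an assumption, since the abstract does not say how multi-word terms are handled.

```python
from typing import Dict, List, Set


def is_in_vocabulary(term: str, vocabulary: Set[str]) -> bool:
    """Return True if every word of the term is in the ASR vocabulary.

    Assumption: a multi-word term is INV only when all its words are known.
    """
    words = term.lower().split()
    return bool(words) and all(word in vocabulary for word in words)


def route_terms(terms: List[str], vocabulary: Set[str]) -> Dict[str, List[str]]:
    """Split query terms between the word-based and phone-based search systems."""
    routed: Dict[str, List[str]] = {"word": [], "phone": []}
    for term in terms:
        # INV terms go to the word-based system, OOV terms to the phone-based one.
        target = "word" if is_in_vocabulary(term, vocabulary) else "phone"
        routed[target].append(term)
    return routed


if __name__ == "__main__":
    # Toy vocabulary and query list, invented for illustration only.
    toy_vocabulary = {"xin", "chào"}
    print(route_terms(["xin chào", "hà nội"], toy_vocabulary))
```

In such a setup, each sublist would be searched against the corresponding lattice index, and the detections from both systems would then be scored and merged.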

Keywords

spoken term detection evaluation N-gram reverse indexing term-dependent discriminative confidence 

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Javier Tejedor (1)
  • Doroteo T. Toledano (2)
  • Dong Wang (3)
  1. GEINTRA, University of Alcalá, Spain
  2. ATVS-Biometric Recognition Group, Universidad Autónoma de Madrid, Spain
  3. Center for Speech and Language Technologies (CSLT), Tsinghua University, China