Summary
This chapter presents the state of the art in word-spotting and rejection methods. After an introduction to word-spotting, the available algorithms are classified into several categories. This is followed by a description of template-matching word-spotting systems, garbage modeling, and the use of a large-vocabulary recognizer in a word-spotting task. We then address vocabulary-independent word-spotting, performance measures, and rejection. The rejection problem is tied to the notion of a confidence measure, which indicates how well a hypothesis matches the recognized result. Consequently, confidence measures and the related problem of detecting out-of-vocabulary words are also considered.
Copyright information
© 1996 Kluwer Academic Publishers
About this chapter
Cite this chapter
Junqua, JC., Haton, JP. (1996). Word-Spotting and Rejection. In: Robustness in Automatic Speech Recognition. The Kluwer International Series in Engineering and Computer Science, vol 341. Springer, Boston, MA. https://doi.org/10.1007/978-1-4613-1297-0_10
DOI: https://doi.org/10.1007/978-1-4613-1297-0_10
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4612-8555-7
Online ISBN: 978-1-4613-1297-0
eBook Packages: Springer Book Archive