Word-Spotting and Rejection

Junqua, Jean-Claude; Haton, Jean-Paul

doi:10.1007/978-1-4613-1297-0_10

Part of the book series: The Kluwer International Series in Engineering and Computer Science (SECS, volume 341)

Summary

This chapter presents the state of the art in word-spotting and rejection methods. After an introduction to word-spotting, the available algorithms are classified into several categories. This is followed by a description of template-matching word-spotting systems, garbage modeling, and the use of a large-vocabulary recognizer in a word-spotting task. We then address vocabulary-independent word-spotting, performance measures, and rejection. The rejection problem is associated with the notion of a confidence measure, which indicates how well a hypothesis matches the recognized result. Consequently, confidence measures and the related problem of detecting out-of-vocabulary words are also considered.

Author information

Authors and Affiliations

Speech Technology Laboratory, USA
Jean-Claude Junqua
CRIN - INRIA, France
Jean-Paul Haton

About this chapter

Cite this chapter

Junqua, JC., Haton, JP. (1996). Word-Spotting and Rejection. In: Robustness in Automatic Speech Recognition. The Kluwer International Series in Engineering and Computer Science, vol 341. Springer, Boston, MA. https://doi.org/10.1007/978-1-4613-1297-0_10

DOI: https://doi.org/10.1007/978-1-4613-1297-0_10
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4612-8555-7
Online ISBN: 978-1-4613-1297-0
eBook Packages: Springer Book Archive
