Multiple-Pass Search Strategies

Schwartz, Richard; Nguyen, Long; Makhoul, John

doi:10.1007/978-1-4613-1367-0_18

Part of the book series: The Kluwer International Series in Engineering and Computer Science (SECS, volume 355)

Abstract

Large-vocabulary speech recognition is computationally very expensive. We explore multi-pass search strategies as a way to reduce computation substantially without any increase in error rate. We consider two basic strategies: the N-best Paradigm and the Forward-Backward Search. Both strategies operate on the entire sentence in at least two passes. The N-best Paradigm computes alternative hypotheses for a sentence, which can later be rescored using more detailed and more expensive knowledge sources. We present and compare many algorithms for finding the N-best sentence hypotheses, and suggest which are the most efficient and accurate. The Forward-Backward Search performs a time-synchronous forward search that finds all of the words that are likely to end at each frame within an utterance. A second, more expensive search can then be performed in the backward direction, restricting its attention to those words found in the forward pass.
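
The rescoring step of the N-best Paradigm can be illustrated with a minimal sketch. This is not the chapter's implementation: the names `Hypothesis`, `rescore_nbest`, and `toy_detailed_score`, the log-domain scores, and the additive score combination with a single `weight` are all assumptions made for illustration. The sketch takes a list of hypotheses already produced by an inexpensive first pass and reorders it using a more detailed scoring function.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Hypothesis:
    """One sentence hypothesis from a (hypothetical) first pass."""
    words: Tuple[str, ...]
    first_pass_score: float  # assumed to be a log-domain score; higher is better


def rescore_nbest(
    nbest: List[Hypothesis],
    detailed_score: Callable[[Tuple[str, ...]], float],
    weight: float = 1.0,
) -> List[Hypothesis]:
    """Reorder an N-best list by first-pass score plus a weighted detailed score.

    The additive combination is an assumption of this sketch; the expensive
    knowledge source is only evaluated on the N hypotheses in the list.
    """
    def combined(h: Hypothesis) -> float:
        return h.first_pass_score + weight * detailed_score(h.words)

    return sorted(nbest, key=combined, reverse=True)


# Toy data: two hypotheses with made-up first-pass scores.
nbest = [
    Hypothesis(("recognize", "speech"), -12.0),
    Hypothesis(("wreck", "a", "nice", "beach"), -11.5),
]


def toy_detailed_score(words: Tuple[str, ...]) -> float:
    """Stand-in for an expensive knowledge source: penalize each word by 1."""
    return -1.0 * len(words)


reranked = rescore_nbest(nbest, toy_detailed_score)
best = reranked[0]
```

In this toy run the second hypothesis leads after the first pass, but the detailed score moves the shorter hypothesis to the top. The point of the design is that the costly scorer runs on only N candidates rather than over the full search space.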

Author information

Authors and Affiliations

BBN Corporation, Cambridge, MA, 02138, USA
Richard Schwartz, Long Nguyen & John Makhoul

Editor information

Editors and Affiliations

AT&T Bell Laboratories, Murray Hill, NJ, 07974, USA
Chin-Hui Lee & Frank K. Soong
School of Microelectronic Engineering, Griffith University, Australia
Kuldip K. Paliwal

About this chapter

Cite this chapter

Schwartz, R., Nguyen, L., Makhoul, J. (1996). Multiple-Pass Search Strategies. In: Lee, CH., Soong, F.K., Paliwal, K.K. (eds) Automatic Speech and Speaker Recognition. The Kluwer International Series in Engineering and Computer Science, vol 355. Springer, Boston, MA. https://doi.org/10.1007/978-1-4613-1367-0_18

DOI: https://doi.org/10.1007/978-1-4613-1367-0_18
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4612-8590-8
Online ISBN: 978-1-4613-1367-0
eBook Packages: Springer Book Archive
