Detection-Based Decoder

Li, Qi (Peter)

doi:10.1007/978-3-642-23731-7_6


Chapter
First Online: 01 January 2011


Part of the book series: Signals and Communication Technology (SCT)

Abstract

Decoding, or searching, is an important task in both speaker and speech recognition. In speaker verification (SV), given a spoken password and a speaker-dependent hidden Markov model (HMM), the task is to find the state alignment that maximizes the likelihood score of the entire utterance. Currently, the most popular decoding algorithm is the Viterbi algorithm with a predefined beam width that reduces the search space; however, it is difficult to determine a suitable beam width beforehand. A small beam width may miss the optimal path, while a large one may slow down the process. To address this problem, the author has developed a non-heuristic algorithm to reduce the search space. The details are presented in this chapter.
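
The beam-width trade-off mentioned in the abstract can be illustrated with a minimal sketch of Viterbi state alignment for a left-to-right HMM. This is not the chapter's detection-based algorithm, which is not reproduced here; the function name `viterbi_align`, its parameters, and the toy log-likelihood values are all assumptions made for illustration.

```python
import math

NEG_INF = float("-inf")

def viterbi_align(log_emis, log_self, log_next, beam=None):
    """Align T frames to S left-to-right HMM states (illustrative sketch).

    log_emis[t][s]: log-likelihood of frame t under state s (assumed given).
    log_self[s]:    log-probability of staying in state s.
    log_next[s]:    log-probability of moving from state s to s + 1.
    beam:           if set, states scoring more than `beam` below the best
                    state of the current frame are pruned.
    Returns (log_score, path); (NEG_INF, []) if no path reaches the last state.
    """
    T, S = len(log_emis), len(log_emis[0])
    score = [NEG_INF] * S
    score[0] = log_emis[0][0]      # the alignment must start in the first state
    back = []                      # back[t - 1][s] = predecessor of s at frame t
    for t in range(1, T):
        new, ptr = [NEG_INF] * S, [0] * S
        for s in range(S):
            stay = score[s] + log_self[s]
            move = score[s - 1] + log_next[s - 1] if s > 0 else NEG_INF
            if move > stay:
                best, ptr[s] = move, s - 1
            else:
                best, ptr[s] = stay, s
            if best > NEG_INF:
                new[s] = best + log_emis[t][s]
        if beam is not None:       # beam pruning relative to the frame's best score
            top = max(new)
            new = [v if v >= top - beam else NEG_INF for v in new]
        score = new
        back.append(ptr)
    if score[S - 1] == NEG_INF:
        return NEG_INF, []
    path = [S - 1]                 # the alignment must end in the last state
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return score[S - 1], path

# Toy data (hypothetical): state 1 looks poor at frame 1 but lies on the best path.
h = math.log(0.5)
log_emis = [[-1, -10, -10], [-1, -5, -10], [-10, -10, -1], [-10, -10, -1]]
log_self, log_next = [h, h, 0.0], [h, h]

full_score, full_path = viterbi_align(log_emis, log_self, log_next)
narrow_score, narrow_path = viterbi_align(log_emis, log_self, log_next, beam=3.0)
wide_score, wide_path = viterbi_align(log_emis, log_self, log_next, beam=10.0)
```

On this toy input, the unpruned search and the wide beam both return the alignment `[0, 1, 2, 2]`, while the narrow beam prunes state 1 at frame 1 and settles for the lower-scoring alignment `[0, 0, 1, 2]`. A wider beam recovers the optimum at the cost of keeping more states active per frame, which is the difficulty the abstract describes.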



Author information

Authors and Affiliations

Li Creative Technologies (LcT), Inc, Vreeland Road 30 A, Suite 130, 07932, Florham Park, NJ, USA
Qi (Peter) Li


Corresponding author

Correspondence to Qi (Peter) Li.


About this chapter

Cite this chapter

Li, Q. (2012). Detection-Based Decoder. In: Speaker Authentication. Signals and Communication Technology. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23731-7_6


DOI: https://doi.org/10.1007/978-3-642-23731-7_6
Published: 30 September 2011
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23730-0
Online ISBN: 978-3-642-23731-7
eBook Packages: Engineering, Engineering (R0)
