Speech Recognition on an FPGA Using Discrete and Continuous Hidden Markov Models

Melnikoff, Stephen J.; Quigley, Steven F.; Russell, Martin J.

doi:10.1007/3-540-46117-5_22

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2438)

Included in the following conference series:

International Conference on Field Programmable Logic and Applications

Abstract

Speech recognition is a computationally demanding task, particularly the stage that uses Viterbi decoding to convert pre-processed speech data into words or sub-word units. Any device that can reduce the load on, for example, a PC's processor is therefore advantageous. We present FPGA implementations of the decoder based in turn on discrete and continuous hidden Markov models (HMMs) representing monophones. The discrete version can process speech nearly 5,000 times faster than real time using just 12% of the slices of a Xilinx Virtex XCV1000, but has a lower recognition rate than the continuous implementation, which is 75 times faster than real time and occupies 45% of the same device.
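
As a rough software illustration of the Viterbi decoding stage mentioned in the abstract, the sketch below decodes a discrete-observation HMM in the log domain. It is not the authors' FPGA design: the function names (`safe_log`, `viterbi_log`), the log-domain arithmetic, and the two-state left-to-right toy model are all assumptions made for this example.

```python
import math

NEG_INF = float("-inf")


def safe_log(p):
    """Natural log that maps probability 0 to -inf (hypothetical helper)."""
    return math.log(p) if p > 0 else NEG_INF


def viterbi_log(obs, log_pi, log_a, log_b):
    """Most likely state path for a discrete HMM, computed in the log domain.

    obs    : list of observation symbol indices
    log_pi : log initial state probabilities, log_pi[j]
    log_a  : log transition probabilities, log_a[i][j] for i -> j
    log_b  : log observation probabilities, log_b[j][symbol]
    Returns (log probability of the best path, best path as state indices).
    """
    n_states = len(log_pi)
    # Initialisation: score of starting in each state and emitting obs[0].
    delta = [log_pi[j] + log_b[j][obs[0]] for j in range(n_states)]
    backptrs = []
    # Recursion: for each state keep only the best-scoring predecessor.
    for symbol in obs[1:]:
        new_delta, ptr = [], []
        for j in range(n_states):
            best_i = max(range(n_states), key=lambda i: delta[i] + log_a[i][j])
            new_delta.append(delta[best_i] + log_a[best_i][j] + log_b[j][symbol])
            ptr.append(best_i)
        delta = new_delta
        backptrs.append(ptr)
    # Termination and backtracking from the best final state.
    last = max(range(n_states), key=lambda j: delta[j])
    path = [last]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return delta[last], path


# Toy two-state left-to-right model with two observation symbols (assumed values).
pi = [1.0, 0.0]
a = [[0.5, 0.5], [0.0, 1.0]]
b = [[0.9, 0.1], [0.1, 0.9]]
log_pi = [safe_log(p) for p in pi]
log_a = [[safe_log(p) for p in row] for row in a]
log_b = [[safe_log(p) for p in row] for row in b]

best_log_prob, best_path = viterbi_log([0, 0, 1, 1], log_pi, log_a, log_b)
print(best_path, math.exp(best_log_prob))
```

Working with log probabilities turns the products in the recursion into additions, which avoids numerical underflow on long utterances; whether the paper's hardware uses the same representation is not stated in the abstract.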

Author information

Authors and Affiliations

Electronic, Electrical and Computer Engineering, University of Birmingham, B15 2TT, Edgbaston, Birmingham, UK
Stephen J. Melnikoff & Martin J. Russell

Editor information

Editors and Affiliations

Institut für Datentechnik, FG Mikroelektronische Systeme, Technische Universität Darmstadt, Karlstraße 15, 64283, Darmstadt, Germany
Manfred Glesner & Peter Zipf
Microelectronics Department, LIRMM, 161 rue Ada, 34392, Montpellier Cedex, France
Michel Renovell

Cite this paper

Melnikoff, S.J., Quigley, S.F., Russell, M.J. (2002). Speech Recognition on an FPGA Using Discrete and Continuous Hidden Markov Models. In: Glesner, M., Zipf, P., Renovell, M. (eds) Field-Programmable Logic and Applications: Reconfigurable Computing Is Going Mainstream. FPL 2002. Lecture Notes in Computer Science, vol 2438. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46117-5_22

DOI: https://doi.org/10.1007/3-540-46117-5_22
Published: 16 August 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44108-3
Online ISBN: 978-3-540-46117-3
eBook Packages: Springer Book Archive
