The Real Time Implementation of the Recognition Stage

Breitschaedel, Robert; Ciaramella, Alberto; Clementino, Davide; Pacifici, Roberto; Riviere, Jean Pierre; Venuti, Giovanni

doi:10.1007/978-3-642-84341-9_3

Part of the book series: Research Reports ESPRIT (volume 1)

Abstract

Subtasks 2.2 and 2.3 of the P26 project were devoted to designing a hardware architecture and implementing on it, in real time, the recognition algorithms already developed and tested in Subtask 2.1. This real-time implementation of the recognition stage is referred to below as RICO. Table 3.1 summarizes the key points considered at the start of the work: algorithmic requirements, project development constraints, and the limits of hardware and software technology. These contributed to the definition of RICO's main characteristics, summarized in Table 3.2, and are detailed in the remainder of this section.

We started from the observation that the recognition algorithms divide into two principal blocks: a first “feature extraction” block, running up to vector quantization and phonetic classification of frames, and a subsequent “search” block, which extracts the lattice of most likely words using dynamic programming. This system “cut” corresponds to the point of minimal data flow and also separates blocks with different computational characteristics. The first block is characterized by predictable execution times, cyclic computations, vector data structures, and modest data addressing requirements; it implements “traditional” DSP algorithms, for which DSP chips are well suited. The memory and computational requirements of the second block, by contrast, depend heavily on the size of the recognition vocabulary and on the speaking style (continuous speech is, of course, more demanding than isolated words), and they also vary over time within the same utterance. In any case, for real-time recognition of continuous speech with a 1K-word vocabulary the computational requirements are quite demanding, although they were not clearly defined at the beginning of the project.
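
The two-block split described above can be illustrated with a minimal Python sketch. It is not the RICO implementation: the function names, the toy codebook, the 0/1 local cost, and the DTW-style recursion are all assumptions chosen for illustration, and the second block here only ranks isolated word templates rather than building a word lattice. The point is that the first block does a fixed amount of work per frame, while the work in the second block grows with the vocabulary, and only a short sequence of symbol indices crosses the "cut" between them.

```python
# Illustrative sketch only: names, sizes, and scoring are assumptions,
# not taken from the RICO system.

def quantize(frame, codebook):
    """Block 1: map one feature vector to the index of its nearest codeword."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: dist(frame, codebook[i]))


def dp_cost(symbols, template):
    """Block 2: dynamic-programming alignment cost against one word template.

    The local cost is 0 when symbols match and 1 otherwise (an assumption).
    """
    inf = float("inf")
    prev = [inf] * (len(template) + 1)
    prev[0] = 0.0
    for i in range(1, len(symbols) + 1):
        cur = [inf] * (len(template) + 1)
        for j in range(1, len(template) + 1):
            local = 0.0 if symbols[i - 1] == template[j - 1] else 1.0
            # Best predecessor: stay on the template state, advance
            # diagonally, or skip ahead within the same frame.
            cur[j] = local + min(prev[j], prev[j - 1], cur[j - 1])
        prev = cur
    return prev[len(template)]


def recognize(frames, codebook, vocabulary):
    """Run both blocks; only the symbol indices cross the cut between them."""
    symbols = [quantize(f, codebook) for f in frames]   # fixed work per frame
    scores = {word: dp_cost(symbols, tmpl)              # work grows with vocabulary
              for word, tmpl in vocabulary.items()}
    return sorted(scores, key=scores.get), symbols


if __name__ == "__main__":
    codebook = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
    frames = [(0.1, 0.0), (0.9, 1.1), (1.0, 0.9), (2.1, 0.1)]
    vocabulary = {"a": [0, 1, 2], "b": [2, 1, 0]}
    ranked, symbols = recognize(frames, codebook, vocabulary)
    print(symbols, ranked)
```

In this toy setting the loop in `quantize` has the same length for every frame, which mirrors the predictable, cyclic computation attributed to the first block, whereas `recognize` calls `dp_cost` once per vocabulary entry, so its cost scales with vocabulary size and template length.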

Author information

Authors and Affiliations

Daimler Benz, Germany
Robert Breitschaedel
CSELT, Italy
Alberto Ciaramella, Davide Clementino, Roberto Pacifici & Giovanni Venuti
Thomson-CSF, France
Jean Pierre Riviere

Editor information

Editors and Affiliations

CSELT, Via Reiss Romoli, 274, I-10148, Torino, Italy
Giancarlo Pirani

About this chapter

Cite this chapter

Breitschaedel, R., Ciaramella, A., Clementino, D., Pacifici, R., Riviere, J.P., Venuti, G. (1990). The Real Time Implementation of the Recognition Stage. In: Pirani, G. (eds) Advanced Algorithms and Architectures for Speech Understanding. Research Reports ESPRIT, vol 1. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-84341-9_3

DOI: https://doi.org/10.1007/978-3-642-84341-9_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-53402-0
Online ISBN: 978-3-642-84341-9
eBook Packages: Springer Book Archive
