Abstract
Subtasks 2.2 and 2.3 of the P26 project were devoted to the design of a hardware architecture and to the real-time implementation on it of the recognition algorithms already developed and tested within Subtask 2.1. This real-time implementation of the recognition stage is called RICO in the following. Table 3.1 summarizes the key points we considered when we started our work, i.e. algorithmic requirements, project development constraints, and the limits of hardware and software technology. These contributed to the definition of the main characteristics of RICO, summarized in Table 3.2; the rest of this paragraph details these considerations.

We started from the observation that the recognition algorithms can be divided into two principal blocks: a first “feature extraction” block, up to vector quantization and phonetic classification of frames, and a following “search” block, which extracts the lattice of the most likely words using dynamic programming. This system “cut” corresponds to the minimal flow of data and also separates blocks with different computational characteristics. The first block is characterized by predictable execution times, cyclic computations, vector data structures and modest data addressing requirements: it implements “traditional” DSP algorithms, for which DSP chips are well suited. The memory and computational requirements of the second block, by contrast, depend heavily on the size of the recognition vocabulary and on the speaking style (continuous speech is of course more demanding than isolated words), and they also vary over time within the same utterance. In any case, for the real-time recognition of continuous speech with a 1K-word vocabulary, the computational requirements are quite demanding, although they were not clearly defined at the beginning of the project.
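The two-block split described above can be illustrated with a minimal sketch. It is not the RICO implementation: the codebook, the word templates and all function names (`quantize`, `front_end`, `dtw`, `search`) are hypothetical, and the search block is replaced by a plain dynamic time warp over codeword labels for isolated words, which is far simpler than the lattice search the chapter refers to. The point is only that the front end does a fixed amount of work per frame and passes a single codeword index across the "cut", while the cost of the search grows with the number and length of the vocabulary entries.

```python
import math

def quantize(frame, codebook):
    """Return the index of the codeword nearest to `frame` (squared Euclidean distance)."""
    best, best_d = 0, math.inf
    for i, codeword in enumerate(codebook):
        d = sum((x - c) ** 2 for x, c in zip(frame, codeword))
        if d < best_d:
            best, best_d = i, d
    return best

def front_end(frames, codebook):
    """Block 1: fixed, predictable work per frame; emits one small integer per frame."""
    return [quantize(f, codebook) for f in frames]

def dtw(labels, template):
    """Dynamic-programming alignment cost; local cost is 0 on a label match, 1 otherwise."""
    prev = [math.inf] * (len(template) + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix
    for label in labels:
        cur = [math.inf] * (len(template) + 1)
        for j in range(1, len(template) + 1):
            cost = 0.0 if label == template[j - 1] else 1.0
            cur[j] = cost + min(prev[j], prev[j - 1], cur[j - 1])
        prev = cur
    return prev[-1]

def search(labels, vocabulary):
    """Block 2: work grows with vocabulary size and template length (hypothetical toy search)."""
    scores = {word: dtw(labels, tmpl) for word, tmpl in vocabulary.items()}
    return min(scores, key=scores.get), scores

# Illustrative data only: a 3-entry codebook and a 2-word vocabulary.
codebook = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
frames = [(0.1, 0.0), (0.9, 1.1), (1.0, 0.9), (2.1, 0.1)]
vocabulary = {"a": [0, 1, 2], "b": [2, 1, 0]}

labels = front_end(frames, codebook)
best_word, scores = search(labels, vocabulary)
```

In this toy setting only the list `labels` crosses from the first block to the second, which mirrors the abstract's remark that the cut is placed where the data flow is minimal.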
Copyright information
© 1990 ECSC — EEC — EAEC, Brussels — Luxembourg
About this chapter
Cite this chapter
Breitschaedel, R., Ciaramella, A., Clementino, D., Pacifici, R., Riviere, J.P., Venuti, G. (1990). The Real Time Implementation of the Recognition Stage. In: Pirani, G. (eds) Advanced Algorithms and Architectures for Speech Understanding. Research Reports ESPRIT, vol 1. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-84341-9_3
DOI: https://doi.org/10.1007/978-3-642-84341-9_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-53402-0
Online ISBN: 978-3-642-84341-9
eBook Packages: Springer Book Archive