# The Real Time Implementation of the Recognition Stage

## Abstract

Subtasks 2.2 and 2.3 of the P26 project were devoted to designing a hardware architecture and to implementing on it, in real time, the recognition algorithms already developed and tested in Subtask 2.1. This real time implementation of the recognition stage is called RICO in the following. Table 3.1 summarizes the key points considered at the start of the work: algorithmic requirements, project development constraints, and the limits of hardware and software technology. These points shaped the main characteristics of RICO, summarized in Table 3.2, and are detailed in the remainder of this section.

The starting observation was that the recognition algorithms divide into two principal blocks: a first “feature extraction” block, running up to the vector quantization and phonetic classification of frames, and a subsequent “search” block, which extracts the lattice of the most likely words using dynamic programming. This system “cut” falls at the point of minimal data flow and also separates blocks with different computational characteristics.

The first block is characterized by predictable execution times, cyclic computations, vector data structures, and modest data addressing requirements: it implements “traditional” DSP algorithms, for which DSP chips are well suited. The memory and computational requirements of the second block, by contrast, depend heavily on the size of the recognition vocabulary and on the speaking style (continuous speech is of course more demanding than isolated words), and they also vary over time within the same utterance. In any case, real time recognition of continuous speech with a 1K-word vocabulary is computationally quite demanding, although the exact requirements were not clearly defined at the beginning of the project.
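The two-block split can be illustrated with a minimal Python sketch. It is not the RICO implementation: the codebook, the toy word models (one dictionary of local costs per state), the `MISS_COST` penalty, and all function names are assumptions made for illustration, and the search returns a ranked list of word hypotheses rather than a full word lattice. The first block does a fixed amount of work per frame, while the number of active hypotheses in the second block changes from frame to frame and grows with the vocabulary.

```python
import math

MISS_COST = 5.0  # assumed penalty for a symbol that a state does not model


def quantize(frame, codebook):
    """Return the index of the codebook vector nearest to `frame`."""
    best_index, best_dist = 0, math.inf
    for index, code in enumerate(codebook):
        dist = sum((x - c) ** 2 for x, c in zip(frame, code))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def feature_block(frames, codebook):
    """First block: the same work for every frame, so timing is predictable."""
    return [quantize(frame, codebook) for frame in frames]


def search_block(symbols, word_models, beam_width):
    """Second block: frame-synchronous dynamic programming with beam pruning.

    `word_models` maps each word to a list of left-to-right states; each
    state maps a codebook index to a local cost (hypothetical toy format).
    Returns the surviving word hypotheses sorted by cost, plus the number
    of active (word, state) hypotheses kept at each frame.
    """
    active = {}
    active_counts = []
    for t, symbol in enumerate(symbols):
        candidates = {}
        if t == 0:
            # Every word may start in its first state.
            for word, states in word_models.items():
                candidates[(word, 0)] = states[0].get(symbol, MISS_COST)
        else:
            # Each hypothesis either stays in its state or moves to the next.
            for (word, state), cost in active.items():
                states = word_models[word]
                for nxt in (state, state + 1):
                    if nxt < len(states):
                        new_cost = cost + states[nxt].get(symbol, MISS_COST)
                        key = (word, nxt)
                        if new_cost < candidates.get(key, math.inf):
                            candidates[key] = new_cost
        if not candidates:
            active = {}
            active_counts.append(0)
            continue
        # Beam pruning: keep only hypotheses close to the current best.
        best = min(candidates.values())
        active = {k: v for k, v in candidates.items() if v <= best + beam_width}
        active_counts.append(len(active))
    finals = [
        (word, cost)
        for (word, state), cost in active.items()
        if state == len(word_models[word]) - 1
    ]
    return sorted(finals, key=lambda item: item[1]), active_counts
```

Only the short list of codebook indices crosses from `feature_block` to `search_block`, which mirrors the low-data-flow "cut" described above. Narrowing `beam_width` reduces the per-frame hypothesis count at the risk of pruning the correct word.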

## Keywords

Lexical Access, Real Time Implementation, Continuous Speech, Beam Search, Phonetic Classification
