Abstract

Subtasks 2.2 and 2.3 of the P26 project were devoted to designing a hardware architecture and to implementing on it, in real time, the recognition algorithms already developed and tested within Subtask 2.1. This real-time implementation of the recognition stage is called RICO in the following. Table 3.1 summarizes the key points we considered when we started our work, i.e. algorithmic requirements, project development constraints, and the limits of hardware and software technology. These contributed to the definition of RICO's main characteristics, summarized in Table 3.2; the rest of this section details these considerations.

We started from the observation that the recognition algorithms can be divided into two principal blocks: a first “feature extraction” block, running up to vector quantization and phonetic classification of frames, and a subsequent “search” block, which extracts the lattice of most likely words using dynamic programming. This “cut” of the system corresponds to the minimal flow of data between the two blocks and also separates blocks with different computational characteristics.

The first block is characterized by predictable execution times, cyclic computations, vector data structures, and modest data addressing requirements: it implements “traditional” DSP algorithms, for which DSP chips are well suited. The memory and computational requirements of the second block, by contrast, depend heavily on the size of the recognition vocabulary and on the speaking style (continuous speech is of course more demanding than isolated words), and they also vary over time within a single utterance. In any case, for real-time recognition of continuous speech with a 1K-word vocabulary the computational requirements are quite demanding, although they were not clearly defined at the beginning of the project.
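
The split into a front-end block and a search block can be illustrated with a toy sketch. This is not RICO code: the function names, the squared Euclidean distance, the 0/1 local cost, and the representation of each word as a left-to-right sequence of expected labels are all assumptions made for illustration. The point is only that the first block does a fixed amount of work per frame and hands a single small label to the second block, whose work grows with the number of words.

```python
import math


def quantize(frame, codebook):
    """Return the index of the codeword nearest to `frame`.

    Assumption: squared Euclidean distance; the chapter does not say
    which distance measure is used.
    """
    best_index, best_dist = 0, math.inf
    for index, codeword in enumerate(codebook):
        dist = sum((x - c) ** 2 for x, c in zip(frame, codeword))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def front_end(frames, codebook):
    """Block 1: one label per frame, with a fixed cost per frame."""
    return [quantize(frame, codebook) for frame in frames]


def word_cost(labels, template):
    """Block 2, per word: dynamic-programming alignment cost.

    `template` is a hypothetical left-to-right list of expected labels.
    At each frame a path may stay in its state or advance by one state.
    Local cost is 0 when the label matches the state, 1 otherwise.
    """
    n_states = len(template)
    prev = [math.inf] * n_states
    for t, label in enumerate(labels):
        cur = [math.inf] * n_states
        for s in range(n_states):
            local = 0 if label == template[s] else 1
            if t == 0:
                best_prev = 0 if s == 0 else math.inf  # paths start in state 0
            else:
                best_prev = min(prev[s], prev[s - 1] if s > 0 else math.inf)
            cur[s] = best_prev + local
        prev = cur
    return prev[-1]  # paths must end in the last state


def search(labels, vocabulary):
    """Rank all words by cost; the work is proportional to vocabulary size."""
    return sorted((word_cost(labels, tpl), word) for word, tpl in vocabulary.items())


codebook = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
frames = [(0.1, 0.0), (0.9, 1.1), (1.0, 0.8), (2.1, 1.9)]
vocabulary = {"a": [0, 1, 2], "b": [2, 1, 0]}

labels = front_end(frames, codebook)
ranking = search(labels, vocabulary)
```

In this sketch the only data crossing the boundary is one integer label per frame, which mirrors the "minimal flow of data" argument for placing the cut after vector quantization.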

Copyright information

© 1990 ECSC — EEC — EAEC, Brussels — Luxembourg

About this chapter

Cite this chapter

Breitschaedel, R., Ciaramella, A., Clementino, D., Pacifici, R., Riviere, J.P., Venuti, G. (1990). The Real Time Implementation of the Recognition Stage. In: Pirani, G. (eds) Advanced Algorithms and Architectures for Speech Understanding. Research Reports ESPRIT, vol 1. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-84341-9_3

  • DOI: https://doi.org/10.1007/978-3-642-84341-9_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-53402-0

  • Online ISBN: 978-3-642-84341-9

  • eBook Packages: Springer Book Archive
