Automatic Generation of Linguistic, Phonetic and Acoustic Knowledge for a Diphone-Based Continuous Speech Recognition System

  • Conference paper
New Systems and Architectures for Automatic Speech Recognition and Synthesis

Part of the book series: NATO ASI Series (NATO ASI F, volume 16)

Abstract

An important issue in template-matching continuous-speech recognition systems is the choice of the language model, together with an appropriate definition of the basic units to be recognized. This paper illustrates the advantages of using a hierarchical transition network model with diphones and diphone-like elements as basic units. A severe drawback of sub-word units, however, is the increased complexity of producing and managing the knowledge needed for language representation and for template definition and extraction. An efficient solution to this problem is required, especially when the recognition system is to be used by unskilled users in real applications. For this purpose we have developed an automatic procedure for generating the linguistic, phonetic and acoustic databases that hold all the information required by the diphone-based system.
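To make the notion of diphone units concrete, the following minimal sketch shows one common way of deriving a diphone inventory from phonemic transcriptions: each word is padded with a boundary symbol and every adjacent pair of symbols becomes one unit. It is an illustration under stated assumptions, not the procedure developed in the paper. The boundary symbol `#`, the function names, and the toy lexicon are all hypothetical, and the "diphone-like elements" mentioned in the abstract are not modelled.

```python
from typing import Dict, List, Set

# Hypothetical boundary symbol standing for silence at word edges.
SILENCE = "#"


def to_diphones(phonemes: List[str]) -> List[str]:
    """Split a phoneme sequence into adjacent-pair (diphone) units."""
    padded = [SILENCE, *phonemes, SILENCE]
    return [f"{left}-{right}" for left, right in zip(padded, padded[1:])]


def build_inventory(lexicon: Dict[str, List[str]]) -> Set[str]:
    """Collect the set of distinct diphones needed to cover a lexicon."""
    inventory: Set[str] = set()
    for phonemes in lexicon.values():
        inventory.update(to_diphones(phonemes))
    return inventory


# Toy lexicon (assumed for illustration only): word -> phonemic transcription.
lexicon = {
    "uno": ["u", "n", "o"],
    "due": ["d", "u", "e"],
}

units_uno = to_diphones(lexicon["uno"])
inventory = build_inventory(lexicon)
print(units_uno)
print(sorted(inventory))
```

In a sketch of this kind, the size of the inventory indicates how many templates would have to be extracted from training speech, which is the bookkeeping burden the abstract describes as the main drawback of sub-word units.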

Copyright information

© 1985 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Colla, A.M., Sciarra, D. (1985). Automatic Generation of Linguistic, Phonetic and Acoustic Knowledge for a Diphone-Based Continuous Speech Recognition System. In: De Mori, R., Suen, C.Y. (eds) New Systems and Architectures for Automatic Speech Recognition and Synthesis. NATO ASI Series, vol 16. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-82447-0_13

  • DOI: https://doi.org/10.1007/978-3-642-82447-0_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-82449-4

  • Online ISBN: 978-3-642-82447-0

  • eBook Packages: Springer Book Archive
