Automatic Generation of Linguistic, Phonetic and Acoustic Knowledge for a Diphone-Based Continuous Speech Recognition System

  • Anna Maria Colla
  • Donatella Sciarra
Part of the NATO ASI Series book series (volume 16)


An important issue in template-matching continuous-speech recognition systems is the choice of the language model, together with an appropriate definition of the basic units to be recognized. The paper illustrates the advantages of using a hierarchical transition network model with diphones and diphone-like elements as basic units. However, a severe drawback of sub-word units is the increased complexity of producing and managing the overall knowledge needed for language representation and for template definition and extraction. An efficient solution to this problem is required, especially when the recognition system is to be used by unskilled users in actual applications. For this purpose, we have developed an automatic procedure for generating the linguistic, phonetic and acoustic data bases that express all the information required by the diphone-based system.


Speech Recognition, Continuous Speech Recognition, Regular Grammar, Cepstrum Coefficient, Training Sentence
These keywords were added by machine and not by the authors.





Copyright information

© Springer-Verlag Berlin Heidelberg 1985

Authors and Affiliations

  • Anna Maria Colla (1)
  • Donatella Sciarra (1)
  1. Central Research Department, Elettronica San Giorgio, ELSAG S.p.A., Genova Sestri, Italy
