Automatic Generation of Linguistic, Phonetic and Acoustic Knowledge for a Diphone-Based Continuous Speech Recognition System

Colla, Anna Maria; Sciarra, Donatella

doi:10.1007/978-3-642-82447-0_13

Part of the book series: NATO ASI Series (NATO ASI F, volume 16)

Abstract

An important issue in template-matching continuous-speech recognition systems is the choice of the language model, together with an appropriate definition of the basic units to be recognized. This paper illustrates the advantages of a hierarchical transition network model that uses diphones and diphone-like elements as basic units. A severe drawback of sub-word units, however, is the increased complexity of producing and managing the knowledge needed for language representation and for template definition and extraction. An efficient solution to this problem is required, especially when the recognition system is to be used by unskilled users in actual applications. For this purpose we have developed an automatic procedure that generates the linguistic, phonetic and acoustic databases containing all the information required by the diphone-based system.
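
As a rough illustration of what generating a sub-word inventory from a lexicon can involve, the sketch below derives the set of diphones needed to cover a small vocabulary. It is a minimal sketch under stated assumptions, not the authors' procedure: it takes a diphone to be simply a pair of adjacent phonemes, uses "#" as a hypothetical boundary (silence) symbol, and works on a made-up toy lexicon. The function names and the phonemic spellings are likewise illustrative only; the chapter's own definition of diphones and diphone-like elements may differ.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical boundary symbol standing for silence at word edges (an assumption).
SIL = "#"

Diphone = Tuple[str, str]


def word_to_diphones(phonemes: List[str]) -> List[Diphone]:
    """Return the adjacent phoneme pairs of a word, padded with boundaries."""
    padded = [SIL] + phonemes + [SIL]
    return list(zip(padded, padded[1:]))


def diphone_inventory(lexicon: Dict[str, List[str]]) -> Set[Diphone]:
    """Collect every distinct diphone needed to cover all words in the lexicon."""
    inventory: Set[Diphone] = set()
    for phonemes in lexicon.values():
        inventory.update(word_to_diphones(phonemes))
    return inventory


# Toy lexicon with rough, illustrative phonemic spellings (not from the paper).
lexicon: Dict[str, List[str]] = {
    "uno": ["u", "n", "o"],
    "due": ["d", "u", "e"],
}

if __name__ == "__main__":
    for left, right in sorted(diphone_inventory(lexicon)):
        print(f"{left}-{right}")
```

Each distinct pair in the resulting set would need at least one acoustic template, which gives a feel for why the amount of knowledge to produce grows once sub-word units replace whole-word templates.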

Author information

Authors and Affiliations

Central Research Department, Elettronica San Giorgio, ELSAG S.p.A., Via G. Puccini, 2, 16154, Genova Sestri, Italy
Anna Maria Colla & Donatella Sciarra

Editor information

Editors and Affiliations

Department of Computer Science, Concordia University, Montréal, Québec, H3G 1M8, Canada
Renato De Mori & Ching Y. Suen

Cite this paper

Colla, A.M., Sciarra, D. (1985). Automatic Generation of Linguistic, Phonetic and Acoustic Knowledge for a Diphone-Based Continuous Speech Recognition System. In: De Mori, R., Suen, C.Y. (eds) New Systems and Architectures for Automatic Speech Recognition and Synthesis. NATO ASI Series, vol 16. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-82447-0_13

DOI: https://doi.org/10.1007/978-3-642-82447-0_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-82449-4
Online ISBN: 978-3-642-82447-0
eBook Packages: Springer Book Archive
