Abstract
We propose a Multigranular Automatic Speech Recognizer. The underlying hypothesis is that the speech signal carries information distributed over several different time scales. Work from fields ranging from neurobiology to speech technology appears to agree on this assumption. Broadly, human speech recognition seems to be optimal because of partial parallelization: the left-to-right stream of speech is captured in a multilevel grid in which several linguistic analyses take place simultaneously. Our investigation aims to apply these ideas to the design of more robust and efficient recognizers.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cutugno, F., Coro, G., Petrillo, M. (2005). Multigranular Scale Speech Recognizers: Technological and Cognitive View. In: Bandini, S., Manzoni, S. (eds) AI*IA 2005: Advances in Artificial Intelligence. AI*IA 2005. Lecture Notes in Computer Science, vol 3673. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11558590_33
DOI: https://doi.org/10.1007/11558590_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29041-4
Online ISBN: 978-3-540-31733-3
eBook Packages: Computer Science, Computer Science (R0)