Multigranular Scale Speech Recognizers: Technological and Cognitive View

Cutugno, Francesco; Coro, Gianpaolo; Petrillo, Massimo

doi:10.1007/11558590_33


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3673)

Included in the following conference series:

Congress of the Italian Association for Artificial Intelligence


Abstract

We propose a Multigranular Automatic Speech Recognizer. The hypothesis is that the speech signal contains information distributed across several different time scales. Many works from scientific fields ranging from neurobiology to speech technology seem to agree on this assumption. In a broad sense, it seems that human speech recognition is optimal because of a partial parallelization process, in which the left-to-right stream of speech is captured in a multilevel grid where several linguistic analyses take place simultaneously. In this view, our investigation aims to apply these ideas to the design of more robust and efficient recognizers.



Author information

Authors and Affiliations

Department of Physics, University Federico II, Naples, Italy
Francesco Cutugno, Gianpaolo Coro & Massimo Petrillo


Editor information

Editors and Affiliations

Research Center on Complex Systems and Artificial Intelligence (CSAI), Department of Computer Science, Systems and Communication (DISCo), University of Milan-Bicocca, viale Sarca 336, 20126 Milan, Italy
Stefania Bandini
CSAI - Complex Systems & Artificial Intelligence Research Centre, University of Milano–Bicocca
Sara Manzoni


About this paper

Cite this paper

Cutugno, F., Coro, G., Petrillo, M. (2005). Multigranular Scale Speech Recognizers: Technological and Cognitive View. In: Bandini, S., Manzoni, S. (eds) AI*IA 2005: Advances in Artificial Intelligence. AI*IA 2005. Lecture Notes in Computer Science, vol 3673. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11558590_33


DOI: https://doi.org/10.1007/11558590_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29041-4
Online ISBN: 978-3-540-31733-3
eBook Packages: Computer Science, Computer Science (R0)
