Plausible Self-Organizing Maps for Speech Recognition
A major problem in connectionist acoustic-phonetic decoding is how to present the acoustic signal to the network. Neurobiological data on the inner ear and the primary auditory cortex could be very helpful, but they are scarce. On the other hand, other biological studies have shown that the structure and functioning of the visual cortex, which has been studied extensively, are very close to those of the auditory cortex.
It has been shown that a simple two-layered network of linear neurons can organize itself to extract the complete information contained in a set of presented patterns. This model has been applied to visual information, where it gives rise to cells selective for orientation and spatial frequency.
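The self-organization of a linear unit can be illustrated with a minimal sketch. The abstract does not name the learning rule, so the example below assumes Oja's rule, one well-known Hebbian rule for linear neurons, and reduces the output layer to a single unit. The function names, the toy two-dimensional patterns, and all parameter values are hypothetical and chosen only for illustration.

```python
import math
import random

def make_patterns(n=200, seed=1):
    """Toy input patterns (an assumption, not data from the paper):
    large variance along (1, 1), small variance along (1, -1)."""
    rng = random.Random(seed)
    patterns = []
    for _ in range(n):
        a = rng.gauss(0.0, 1.0)   # dominant direction
        b = rng.gauss(0.0, 0.1)   # minor direction
        patterns.append([a + b, a - b])
    return patterns

def oja_train(patterns, epochs=50, lr=0.01, seed=0):
    """Train one linear output unit y = w . x with Oja's rule."""
    rng = random.Random(seed)
    dim = len(patterns[0])
    w = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
    for _ in range(epochs):
        for x in patterns:
            y = sum(wi * xi for wi, xi in zip(w, x))
            # Hebbian growth (y * x) with a decay term (y^2 * w)
            # that keeps the weight vector close to unit length.
            w = [wi + lr * y * (xi - y * wi) for wi, xi in zip(w, x)]
    return w

patterns = make_patterns()
w = oja_train(patterns)
norm = math.sqrt(sum(wi * wi for wi in w))
print("learned weights:", w, "norm:", norm)
```

Under these assumptions the weight vector aligns with the direction of largest variance in the patterns, here (1, 1) up to sign, without any supervision. A multi-unit output layer would need an additional mechanism, such as lateral interactions, so that different units pick up different components.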
We have applied these principles to speech recognition and designed two kinds of maps. The first kind represents frequency characteristics of the signal (e.g., formant structure). The second kind takes into account dynamic aspects of the signal (e.g., formant transitions).
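The difference between the two kinds of maps can be pictured through the input patterns each one might see. The sketch below is only an assumption about how such patterns could be built, not the construction used in the paper: a frequency map receives one spectral frame at a time, while a dynamic map receives several consecutive frames concatenated. The function names, the window width, and the toy spectrogram are hypothetical.

```python
def static_patterns(spectrogram):
    """One pattern per frame: the spectrum at a single instant."""
    return [list(frame) for frame in spectrogram]

def dynamic_patterns(spectrogram, width=3):
    """One pattern per window: `width` consecutive frames concatenated,
    so that movement of energy across frequency bands is visible."""
    return [
        [value for frame in spectrogram[t:t + width] for value in frame]
        for t in range(len(spectrogram) - width + 1)
    ]

# Toy spectrogram (5 frames x 4 frequency bands) with an energy peak
# that rises in frequency over time, loosely mimicking a transition.
spectrogram = [
    [1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

static = static_patterns(spectrogram)
dynamic = dynamic_patterns(spectrogram, width=3)
print(len(static), "static patterns of size", len(static[0]))
print(len(dynamic), "dynamic patterns of size", len(dynamic[0]))
```

In this toy setting a static pattern only says where the energy peak is at one instant, whereas a dynamic pattern also encodes whether the peak is steady or moving.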
First, the resulting maps have been analyzed from both a phonetic and a signal-processing point of view, and they provide a very interesting representation of the signal. Second, these representations can be used as the input maps of a dynamic connectionist network for speech recognition. The input maps show activity that is selective with regard to phonemic structures and enable dynamic networks to differentiate the phonemes.
Keywords: Speech Recognition, Speech Signal, Auditory Cortex, Output Neuron, Output Unit