Spatio-Temporal Mask Learning: Application to Speech Recognition

  • Conference paper
Artificial Neural Nets and Genetic Algorithms

Abstract

In this paper, we describe the “spatio-temporal” map, an original algorithm for learning and recognizing dynamic patterns represented as sequences. The work favors an internal and explicit representation of time, which appears to be neurobiologically relevant. The map involves units with three kinds of links: feed-forward connections, intra-map connections, and inter-map connections. This architecture can learn sequences from an input stream and is robust to noise. Learning is self-organized for the feed-forward links and “pseudo” self-organized for the intra-map links. An application to the recognition of spoken French digits is presented.
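The abstract gives no equations, so the following is only a minimal sketch of the general idea of a map with feed-forward and intra-map links; it is not the authors' algorithm. It assumes a winner-take-all competitive update for the feed-forward weights and a simple reinforcement of the link between successive winners for the intra-map weights. The class name `TemporalMap`, its methods, the learning rates, and the toy inputs are all hypothetical, and inter-map connections are left out.

```python
class TemporalMap:
    """Toy map: feed-forward prototypes plus intra-map transition links.

    Illustrative only; the update rules are assumptions, not the paper's.
    """

    def __init__(self, prototypes):
        # Feed-forward weights: one prototype vector per unit.
        self.w = [list(p) for p in prototypes]
        n = len(self.w)
        # Intra-map links: link[i][j] grows when unit j wins right after unit i.
        self.link = [[0.0] * n for _ in range(n)]

    def winner(self, x):
        """Index of the unit whose prototype is closest to input x."""
        def dist(u):
            return sum((a - b) ** 2 for a, b in zip(self.w[u], x))
        return min(range(len(self.w)), key=dist)

    def train(self, sequence, lr=0.2, link_lr=0.1):
        """One pass over a sequence of input vectors."""
        prev = None
        for x in sequence:
            k = self.winner(x)
            # Assumed feed-forward rule: move the winning prototype toward x.
            self.w[k] = [a + lr * (b - a) for a, b in zip(self.w[k], x)]
            # Assumed intra-map rule: reinforce the link from the previous winner.
            if prev is not None and prev != k:
                self.link[prev][k] += link_lr * (1.0 - self.link[prev][k])
            prev = k

    def score(self, sequence):
        """Sum of link strengths along the winner path of a sequence."""
        path = [self.winner(x) for x in sequence]
        return sum(self.link[a][b] for a, b in zip(path, path[1:]) if a != b)


# Toy usage: learn the order A -> B -> C, then compare three test sequences.
m = TemporalMap([[0.1, 0.1], [0.9, 0.1], [0.1, 0.9]])
seq = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
for _ in range(20):
    m.train(seq)

forward = m.score(seq)                                   # learned order
backward = m.score(seq[::-1])                            # reversed order
noisy = m.score([[0.1, -0.05], [0.9, 0.1], [0.05, 1.1]])  # perturbed inputs
print(forward, backward, noisy)
```

In this toy setting the learned order scores higher than the reversed order, and a perturbed version of the sequence follows the same winner path, which is one simple way to picture robustness to input noise. A real speech front end would feed acoustic feature vectors rather than 2-D points.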

Copyright information

© 1995 Springer-Verlag/Wien

About this paper

Cite this paper

Durand, S., Alexandre, F. (1995). Spatio-Temporal Mask Learning: Application to Speech Recognition. In: Artificial Neural Nets and Genetic Algorithms. Springer, Vienna. https://doi.org/10.1007/978-3-7091-7535-4_36

  • DOI: https://doi.org/10.1007/978-3-7091-7535-4_36

  • Publisher Name: Springer, Vienna

  • Print ISBN: 978-3-211-82692-8

  • Online ISBN: 978-3-7091-7535-4

  • eBook Packages: Springer Book Archive
