
Relations of Audio and Visual Speech Signals in a Physical Feature Space: Implications for the Hearing-impaired

Chapter in: Speechreading by Humans and Machines

Part of the book series: NATO ASI Series F, volume 150

Abstract

The goal of this paper is to introduce appropriate motion models for the visual articulatory movements that are relevant to the process of speechreading and, building on these, to design a facial animation program with an open input-text vocabulary for use as a training aid for speechreading.

Since the experimental work of Menzerath and de Lacerda (1931), it has been known that the movements of the speech organs are structurally interrelated within a spoken context. The sound signals and the related visual articulatory movements arise in the course of a fully overlapping coarticulation.

The paper illustrates the interrelation of audio and visual speech features. It can be shown that, even in fluent speech, an interactive determination of moments of optimum articulation is possible for most phonemes. The facial expressions and lip contours, as well as the set points within the phoneme boundaries, depend on the context. Several approaches for describing these dependencies are explained and discussed.

A facial animation system is described which uses a codebook of specific key pictures corresponding to the moments of optimum articulatory positions in fluent speech. Face movements are generated by selecting video pictures from the codebook and subsequently calculating interim pictures with the help of interpolation algorithms. Movements of the tongue are artificially introduced into the opening of the mouth.
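
As an illustration of this codebook-and-interpolation scheme, the following minimal Python sketch generates a frame sequence from a viseme codebook by linearly cross-fading between key pictures. It is not the author's implementation: the names (blend_frames, animate, frames_per_transition) are hypothetical, and the linear cross-fade merely stands in for the unspecified interpolation algorithms of the described system.

    # Hypothetical sketch: produce a frame sequence from a codebook of key
    # pictures (one per viseme, taken at moments of optimum articulation),
    # with interim pictures generated by linear cross-fading.
    import numpy as np

    def blend_frames(key_a, key_b, n_interim):
        """Yield n_interim frames fading linearly from key_a to key_b."""
        for i in range(1, n_interim + 1):
            alpha = i / (n_interim + 1)        # interpolation weight in (0, 1)
            yield (1.0 - alpha) * key_a + alpha * key_b

    def animate(viseme_sequence, codebook, frames_per_transition=4):
        """Map a sequence of viseme labels to a list of video frames.

        codebook: dict mapping viseme label -> key picture (H x W ndarray)
        """
        frames = []
        for prev, curr in zip(viseme_sequence, viseme_sequence[1:]):
            frames.append(codebook[prev])      # key picture at optimum articulation
            frames.extend(blend_frames(codebook[prev], codebook[curr],
                                       frames_per_transition))  # interim pictures
        frames.append(codebook[viseme_sequence[-1]])
        return frames

In the system described in the chapter, tongue movements are additionally superimposed on the mouth opening; that step is omitted in this sketch.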

In order to evaluate different motion models, it is necessary to present the corresponding facial animation model to people who can speechread, because too many parameters of the process of human visual perception are still largely unknown. For this reason, the presented evaluation methods are based on visemes as the smallest perceptible visual units of the articulation process. The interaction of these units is described qualitatively. Results of this experimental research extend the knowledge about articulation and coarticulation and are being used to improve the facial animation system.



Copyright information

© 1996 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Bothe, HH. (1996). Relations of Audio and Visual Speech Signals in a Physical Feature Space: Implications for the Hearing-impaired. In: Stork, D.G., Hennecke, M.E. (eds) Speechreading by Humans and Machines. NATO ASI Series, vol 150. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-13015-5_34

  • DOI: https://doi.org/10.1007/978-3-662-13015-5_34

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-08252-8

  • Online ISBN: 978-3-662-13015-5
