Lip Biometrics for Digit Recognition

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4673)

Abstract

This paper presents a speaker-independent audio-visual digit recognition system that uses both speech and visual lip signals. To increase the robustness of audio recognition, the visual features are extracted by line-motion estimation from low-resolution video sequences (128 × 128 pixels). The core experiments investigate lip-motion biometrics both as a stand-alone modality and as a modality merged into a speech recognition system. The system uses Support Vector Machines and shows favourable experimental results, with digit recognition rates of 83% to 100% on the XM2VTS database, depending on the amount of visual information available.
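
As a rough illustration of the merged-modality idea described in the abstract, the following sketch concatenates an audio feature vector with a visual one and classifies the fused vector with one-vs-rest linear SVMs. It is not the authors' implementation: the features are synthetic stand-ins rather than line-motion or speech features, the linear SVM is trained with a simple hinge-loss subgradient rule chosen so the example needs only the standard library, and every name (`fuse`, `make_sample`, `run_demo`, and so on) is hypothetical.

```python
import random


def fuse(audio_vec, visual_vec):
    """Feature-level fusion: concatenate audio and visual feature vectors."""
    return list(audio_vec) + list(visual_vec)


def make_sample(digit, n_digits, rng, noise=0.2):
    """Synthetic stand-in features (assumption): each modality peaks at index `digit`."""
    audio = [(2.0 if i == digit else 0.0) + rng.gauss(0, noise) for i in range(n_digits)]
    visual = [(2.0 if i == digit else 0.0) + rng.gauss(0, noise) for i in range(n_digits)]
    return fuse(audio, visual)


def score(w, b, x):
    """Signed decision value of a linear classifier."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_binary_svm(X, y, rng, epochs=50, lr=0.05, lam=0.01):
    """Linear SVM (labels +1/-1) trained by hinge-loss subgradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            violated = y[i] * score(w, b, X[i]) < 1   # sample inside the margin?
            w = [wj * (1 - lr * lam) for wj in w]      # L2 regularisation shrinkage
            if violated:
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b


def train_one_vs_rest(X, labels, n_digits, rng):
    """Train one binary SVM per digit (that digit against all others)."""
    models = []
    for d in range(n_digits):
        y = [1 if lab == d else -1 for lab in labels]
        models.append(train_binary_svm(X, y, rng))
    return models


def predict(models, x):
    """Pick the digit whose classifier gives the largest decision value."""
    scores = [score(w, b, x) for w, b in models]
    return scores.index(max(scores))


def run_demo(n_digits=3, per_digit=20, seed=0):
    """Train on synthetic fused features and return held-out accuracy."""
    rng = random.Random(seed)
    labels = [d for d in range(n_digits) for _ in range(per_digit)]
    X_train = [make_sample(d, n_digits, rng) for d in labels]
    X_test = [make_sample(d, n_digits, rng) for d in labels]
    models = train_one_vs_rest(X_train, labels, n_digits, rng)
    hits = sum(predict(models, x) == d for x, d in zip(X_test, labels))
    return hits / len(labels)


if __name__ == "__main__":
    print("held-out accuracy:", run_demo())
```

Concatenation is only one way to merge modalities; the abstract does not say which fusion scheme the system uses, so the choice here is purely illustrative. Passing only the visual half of each vector to the same training routine would correspond to the stand-alone lip-motion setting.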

Editor information

Walter G. Kropatsch, Martin Kampel, Allan Hanbury

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Faraj, M.I., Bigun, J. (2007). Lip Biometrics for Digit Recognition. In: Kropatsch, W.G., Kampel, M., Hanbury, A. (eds) Computer Analysis of Images and Patterns. CAIP 2007. Lecture Notes in Computer Science, vol 4673. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74272-2_45

  • DOI: https://doi.org/10.1007/978-3-540-74272-2_45

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74271-5

  • Online ISBN: 978-3-540-74272-2

  • eBook Packages: Computer Science, Computer Science (R0)
