The AIT Multimodal Person Identification System for CLEAR 2007

Stergiou, Andreas; Pnevmatikakis, Aristodemos; Polymenakos, Lazaros

doi:10.1007/978-3-540-68585-2_20

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4625)

Abstract

This paper presents the person identification system developed at Athens Information Technology and its performance in the CLEAR 2007 evaluations. The system operates on the audiovisual information (speech and faces) collected over the duration of gallery and probe videos. It comprises an audio-only (speech) subsystem, a video-only (face) subsystem and an audiovisual fusion subsystem. Audio recognition is based on Gaussian mixture modeling of the principal components of composite feature vectors, which consist of Mel-Frequency Cepstral Coefficients and Perceptual Linear Prediction coefficients of speech. Video recognition combines three classification algorithms: Principal Component Analysis with a modified Mahalanobis distance, sub-class Linear Discriminant Analysis (featuring automatic sub-class generation) with cosine distance, and a Bayesian classifier based on Gaussian modeling of intrapersonal differences. A nearest neighbor classification rule is applied. A decision fusion scheme across time and classifiers returns the video identity. The audiovisual subsystem fuses the unimodal identities into the multimodal one, using a suitable confidence metric.
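
The abstract does not specify the fusion rule or the confidence metric, so the following is only a minimal sketch of how such a pipeline might be wired together, not the authors' implementation. It assumes equally weighted voting over all per-frame, per-classifier nearest-neighbor decisions, a hypothetical confidence defined as the normalized margin between the two most-voted identities, and a multimodal rule that simply keeps the more confident modality. All function names and the toy two-dimensional feature vectors are illustrative.

```python
import math
from collections import defaultdict


def cosine_distance(a, b):
    """Cosine distance, 1 - cos(angle), between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest_neighbor(probe, gallery):
    """Return (identity, distance) of the gallery vector closest to the probe.

    gallery is a list of (identity, feature_vector) pairs.
    """
    scored = ((ident, cosine_distance(probe, vec)) for ident, vec in gallery)
    return min(scored, key=lambda pair: pair[1])


def fuse_votes(decisions):
    """Fuse (identity, weight) decisions gathered across frames and classifiers.

    Assumed rule: weighted voting with positive weights. The returned
    confidence is a hypothetical metric, the margin between the best and
    second-best vote totals divided by the best total, so it lies in [0, 1].
    """
    totals = defaultdict(float)
    for ident, weight in decisions:
        totals[ident] += weight
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    best_id, best = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
    return best_id, (best - runner_up) / best


def fuse_modalities(audio, video):
    """Assumed multimodal rule: keep the (identity, confidence) pair
    of the more confident modality."""
    return audio if audio[1] >= video[1] else video


# Toy usage: two enrolled identities and three probe frames.
gallery = [("alice", [1.0, 0.0]), ("bob", [0.0, 1.0])]
frames = [[0.9, 0.1], [0.8, 0.3], [0.2, 0.9]]
decisions = [(nearest_neighbor(f, gallery)[0], 1.0) for f in frames]
video = fuse_votes(decisions)     # two votes for "alice", one for "bob"
audio = ("bob", 0.2)              # hypothetical output of the speech subsystem
result = fuse_modalities(audio, video)
```

In this toy run the video decision wins because its confidence margin exceeds the assumed audio confidence. A real system would replace the unit weights and the margin-based confidence with whatever the paper actually defines.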

Author information

Authors and Affiliations

Autonomic and Grid Computing, Athens Information Technology, Markopoulou Ave., 19002, Peania, Greece
Andreas Stergiou, Aristodemos Pnevmatikakis & Lazaros Polymenakos

Editor information

Rainer Stiefelhagen, Rachel Bowers, Jonathan Fiscus

Cite this paper

Stergiou, A., Pnevmatikakis, A., Polymenakos, L. (2008). The AIT Multimodal Person Identification System for CLEAR 2007. In: Stiefelhagen, R., Bowers, R., Fiscus, J. (eds) Multimodal Technologies for Perception of Humans. RT 2007, CLEAR 2007. Lecture Notes in Computer Science, vol 4625. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68585-2_20

DOI: https://doi.org/10.1007/978-3-540-68585-2_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68584-5
Online ISBN: 978-3-540-68585-2
eBook Packages: Computer Science, Computer Science (R0)
