A Decision Fusion System Across Time and Classifiers for Audio-Visual Person Identification

  • Andreas Stergiou
  • Aristodemos Pnevmatikakis
  • Lazaros Polymenakos
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4122)


This paper presents the person identification system developed at Athens Information Technology. It comprises an audio-only (speech) subsystem, a video-only (face) subsystem, and an audiovisual fusion subsystem. Audio recognition is based on Gaussian mixture modeling of the principal components of the Mel-Frequency Cepstral Coefficients of speech. Video recognition is based on linear subspace projection methods, with temporal fusion by weighted voting on the results. Audiovisual fusion combines the unimodal identities into a multimodal one, using a suitable confidence metric for the results of the unimodal classifiers.
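
The two fusion steps named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the voting weights or the confidence metric, so everything here is an assumption made for illustration: the function names `weighted_vote` and `fuse_modalities`, the use of the winner's share of the total weight as a confidence value, and the rule of keeping the more confident unimodal decision. It is not the authors' implementation.

```python
from collections import defaultdict


def weighted_vote(decisions):
    """Fuse (identity, weight) decisions over time into a single identity.

    Returns (winner, confidence). The confidence is the winner's share of
    the total weight; this metric is a hypothetical stand-in, not the one
    used in the paper.
    """
    totals = defaultdict(float)
    for identity, weight in decisions:
        totals[identity] += weight
    total_weight = sum(totals.values())
    if total_weight <= 0:
        raise ValueError("need at least one decision with positive weight")
    winner = max(totals, key=totals.get)
    return winner, totals[winner] / total_weight


def fuse_modalities(audio, video):
    """Keep the unimodal (identity, confidence) result with higher confidence.

    Assumed fusion rule for illustration only; ties go to the audio result.
    """
    return audio if audio[1] >= video[1] else video


# Hypothetical face decisions over time: (identity, weight)
frames = [("alice", 0.9), ("bob", 0.4), ("alice", 0.7), ("bob", 0.5)]
video_result = weighted_vote(frames)

# Assumed output of the speech subsystem: (identity, confidence)
audio_result = ("bob", 0.55)

fused_identity, fused_confidence = fuse_modalities(audio_result, video_result)
```

In this toy input the face decisions for "alice" carry more total weight than those for "bob", and the resulting video confidence exceeds the assumed audio confidence, so the fused identity follows the video subsystem. A real system would calibrate the two confidence values so that they are comparable across modalities.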


Principal Component Analysis, Face Recognition, Linear Discriminant Analysis, Gaussian Mixture Model, Smart Space





Copyright information

© Springer Berlin Heidelberg 2007

Authors and Affiliations

  • Andreas Stergiou (1)
  • Aristodemos Pnevmatikakis (1)
  • Lazaros Polymenakos (1)
  1. Athens Information Technology, Autonomic and Grid Computing, Markopoulou Ave., 19002 Peania, Greece
