Speaker Localization in CHIL Lectures: Evaluation Criteria and Results

  • Maurizio Omologo
  • Piergiorgio Svaizer
  • Alessio Brutti
  • Luca Cristoforetti
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3869)


This work addresses the problem of automatic speaker localization and tracking in a real lecture scenario. The evaluation criteria recently adopted under CHIL and NIST benchmarking are outlined. Two speaker localization systems are described, based on Generalized Cross Correlation Phase Transform (GCC-PHAT) analysis and on the Global Coherence Field (GCF). Benchmarking results obtained on a set of 13 lectures show an average RMS speaker localization error of about 30 cm.
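
As an illustration of the first technique named in the abstract, the following is a minimal sketch of GCC-PHAT time delay estimation between two microphone signals. It is not the authors' implementation: the function name `gcc_phat`, its parameters, the sampling rate, and the synthetic white-noise signal are all assumptions made for this example.

```python
import numpy as np


def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay (seconds) of `sig` relative to `ref` with GCC-PHAT.

    Hypothetical helper for illustration; a positive result means `sig`
    arrives later than `ref`.
    """
    # Zero-pad so the circular correlation behaves like a linear one.
    n = len(sig) + len(ref)
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)

    # Cross-power spectrum, normalized to unit magnitude (phase transform).
    cross = SIG * np.conj(REF)
    cross /= np.abs(cross) + 1e-12  # small constant avoids division by zero

    # Back to the lag domain: the peak position gives the delay in samples.
    cc = np.fft.irfft(cross, n=n)

    # Restrict the search to physically plausible lags if max_tau is given.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that index `max_shift` corresponds to zero lag.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(cc)) - max_shift
    return shift / fs


if __name__ == "__main__":
    fs = 16000                               # assumed sampling rate in Hz
    rng = np.random.default_rng(0)
    ref = rng.standard_normal(1024)          # synthetic source at microphone 1
    delay = 5                                # assumed true delay in samples
    sig = np.concatenate((np.zeros(delay), ref))  # delayed copy at microphone 2
    tau = gcc_phat(sig, ref, fs)
    print("estimated delay in samples:", round(tau * fs))
```

In a localization system of the kind the abstract describes, delay estimates like this one from several microphone pairs would be combined geometrically to obtain a position. That step depends on the array layout and is not sketched here.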


Coherence Measure; Microphone Array; Time Delay Estimation; Global Coherence; Active Speaker





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Maurizio Omologo (1)
  • Piergiorgio Svaizer (1)
  • Alessio Brutti (1)
  • Luca Cristoforetti (1)
  1. ITC-irst, Povo, Trento, Italy
