Speaker Segmentation for Air Traffic Control

Chapter in: Speaker Classification II

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4441)

Abstract

This contribution presents a novel speaker segmentation system designed to improve the safety of voice communication in air traffic control. In addition to using the aircraft identification tag to assign speaker turns on the shared communication channel to individual aircraft, speaker verification is investigated as an add-on attribute to further raise the security level of air traffic control. Verification is performed by training universal background models and speaker-dependent models based on the Gaussian mixture model approach. The feature extraction and normalization units are specifically optimized to cope with the narrow channel bandwidth and very short speaker turns. To enhance the robustness of the verification system, a cross-verification unit is additionally applied. The system is evaluated on the SPEECHDAT-AT and WSJ0 databases to demonstrate its performance.
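
The verification step named in the abstract can be illustrated with a minimal sketch of log-likelihood-ratio scoring between a speaker-dependent Gaussian mixture model and a universal background model (UBM). This is not the authors' implementation: the abstract does not specify the features, model sizes, normalization, or decision threshold, so the function names, the diagonal covariances, the two-dimensional toy parameters, and the threshold of 0.0 are all assumptions made for illustration. Both models are assumed to be already trained.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(x, gmm):
    """Log-likelihood of one frame under a GMM given as (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def llr_score(frames, speaker_gmm, ubm):
    """Average per-frame log-likelihood ratio: speaker model versus UBM."""
    total = sum(
        gmm_log_likelihood(f, speaker_gmm) - gmm_log_likelihood(f, ubm)
        for f in frames
    )
    return total / len(frames)

def verify(frames, speaker_gmm, ubm, threshold=0.0):
    """Accept the claimed identity if the averaged score exceeds the threshold."""
    return llr_score(frames, speaker_gmm, ubm) > threshold

# Hypothetical toy models (assumed values, not taken from the chapter).
UBM = [(0.5, [0.0, 0.0], [4.0, 4.0]), (0.5, [3.0, 3.0], [4.0, 4.0])]
SPEAKER = [(0.5, [1.0, 1.0], [1.0, 1.0]), (0.5, [1.5, 0.5], [1.0, 1.0])]

# A very short "speaker turn" of two frames close to the speaker model,
# and another two frames far away from it.
near_turn = [[1.0, 1.0], [1.4, 0.6]]
far_turn = [[6.0, 6.0], [5.5, 6.5]]

if __name__ == "__main__":
    print("near turn accepted:", verify(near_turn, SPEAKER, UBM))
    print("far turn accepted:", verify(far_turn, SPEAKER, UBM))
```

Averaging the ratio over frames keeps the score comparable across turns of different length, which matters when turns are as short as the abstract describes. A deployed system would presumably tune the threshold on held-out data instead of fixing it at zero.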



Editor information

Christian Müller

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Neffe, M., Van Pham, T., Hering, H., Kubin, G. (2007). Speaker Segmentation for Air Traffic Control. In: Müller, C. (ed.) Speaker Classification II. Lecture Notes in Computer Science, vol 4441. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74122-0_15

  • DOI: https://doi.org/10.1007/978-3-540-74122-0_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74121-3

  • Online ISBN: 978-3-540-74122-0

  • eBook Packages: Computer Science, Computer Science (R0)
