Voice Control in a Real Flight Deck Environment

  • Michal Trzos
  • Martin Dostál
  • Petra Machková
  • Jana Eitlerová
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11107)

Abstract

In this paper, we present a methodology for implementing multimodal voice-controlled systems using automatic speech recognition. A real flight deck environment poses many challenges, such as high accuracy requirements, high noise levels, non-native English-speaking users, and limited hardware and software resources. We present the design of an automatic speech recognition system based on the freely available AMI Meeting Corpus and a proprietary corpus provided by Airbus. We then describe how we trained and evaluated the speech recognition models in a simulated environment in an anechoic chamber laboratory. The tuned speech recognition models were tested in a real flight environment on two Honeywell experimental airplanes: a Dassault Falcon 900 and a Boeing 757.

Keywords

  • Automatic speech recognition
  • Multi-modal interaction
  • Navigation display

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Michal Trzos (1)
  • Martin Dostál (1)
  • Petra Machková (1)
  • Jana Eitlerová (1)
  1. Honeywell International, Aerospace Advanced Technology Europe, Brno, Czech Republic
