
Automatic Classification and Transcription of Telephone Speech in Radio Broadcast Data

  • Conference paper
Computational Processing of the Portuguese Language (PROPOR 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5190)

Abstract

Automatic transcription of telephone speech poses additional challenges compared with wideband data processing, mainly because of channel limitations and the particular characteristics of conversational telephone speech. In TV speech recognition applications, such as automatic transcription of broadcast news, telephone data is nearly insignificant (less than 1%); in most radio broadcast stations, by contrast, the share of telephone speech is significantly larger. Transcription of telephone speech therefore deserves special attention in radio broadcast applications. In this work, we describe our initial efforts to tackle this problem. First, a telephone channel classifier is proposed to automatically detect telephone segments. Then, several strategies for increasing the robustness of the automatic transcription system are investigated.
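
The abstract does not say how the telephone channel classifier works, so the following is only an illustrative sketch and not the authors' method. It assumes that telephone channels pass roughly 300 to 3400 Hz, so a frame of wideband-sampled audio with almost no energy above about 4 kHz is a plausible telephone candidate. The function names, the 4 kHz cutoff, and the 0.01 threshold are all hypothetical choices made for this example.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (adequate for short frames)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def high_band_energy_ratio(frame, sample_rate, cutoff_hz=4000.0):
    """Fraction of spectral energy at or above cutoff_hz (assumed cutoff)."""
    spectrum = power_spectrum(frame)
    total = sum(spectrum)
    if total == 0.0:
        return 0.0  # silent frame: no evidence of high-band content
    n = len(frame)
    high = sum(p for k, p in enumerate(spectrum)
               if k * sample_rate / n >= cutoff_hz)
    return high / total


def looks_like_telephone(frame, sample_rate, threshold=0.01):
    """Flag a frame as band-limited; the threshold is an assumed value."""
    return high_band_energy_ratio(frame, sample_rate) < threshold


def tone(freq_hz, sample_rate=16000, n=256):
    """Synthetic test signal: a pure sine tone."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]


# A 1 kHz tone stands in for band-limited audio; adding a 6 kHz tone
# stands in for wideband studio audio.
narrowband = tone(1000)
wideband = [a + b for a, b in zip(tone(1000), tone(6000))]
```

A practical classifier would aggregate such decisions over many frames and would more likely rely on trained models than on a fixed threshold; the sketch only illustrates the bandwidth cue that separates telephone from studio audio.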




Editor information

António Teixeira, Vera Lúcia Strube de Lima, Luís Caldas de Oliveira, Paulo Quaresma


Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Abad, A., Meinedo, H., Neto, J. (2008). Automatic Classification and Transcription of Telephone Speech in Radio Broadcast Data. In: Teixeira, A., de Lima, V.L.S., de Oliveira, L.C., Quaresma, P. (eds) Computational Processing of the Portuguese Language. PROPOR 2008. Lecture Notes in Computer Science, vol 5190. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85980-2_18


  • DOI: https://doi.org/10.1007/978-3-540-85980-2_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-85979-6

  • Online ISBN: 978-3-540-85980-2

  • eBook Packages: Computer Science, Computer Science (R0)
