Tree-Structured Named Entities Extraction from Competing Speech Transcriptions

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9103)

Abstract

When real-world applications rely on automatic speech transcription, the primary source of error is not flawed analysis within the application itself but noise in the automatic transcriptions. This study presents a simple but effective method for generating a higher-quality transcription by combining utterances from competing transcriptions. We extend a structured Named Entity (NE) recognizer that was submitted to the ETAPE Challenge. Working on French TV and radio programs, our system revises the provided transcriptions by making use of the NEs it has detected. Our results suggest that combining the transcribed utterances that optimize the F-measure, rather than those that minimize the WER, yields a better transcription for NE extraction. The results show a small but significant improvement of 0.9% SER over the baseline system on the ROVER transcription. These are the best results reported to date on this corpus.

Notes

  1. \(WER=\frac{S+D+I}{N}\), where D, I, S stand for the number of deletions, insertions, substitutions of words and N for the total number of words in the reference.

  2. More information about the challenge can be found at www.afcp-parole.org/etape/workshop.html.

  3. As a first working hypothesis, we have segmented the transcriptions based on the gold standard utterances.

  4. SCLite: www1.icsi.berkeley.edu/Speech/docs/sctk-1.2/sclite.htm.
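
The WER definition in note 1 can be illustrated with a minimal sketch. It assumes whitespace tokenization and exact string matching between words, and it uses the minimum word-level edit distance as the value of S + D + I. The function name `word_error_rate` is hypothetical; this is not the SCLite implementation, which applies its own alignment and text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (S + D + I) / N for a minimum-cost word alignment.

    Assumption: words are separated by whitespace and compared exactly.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the minimum number of edits needed to turn the
    # reference words processed so far into the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            curr[j] = min(
                prev[j] + 1,                          # deletion
                curr[j - 1] + 1,                      # insertion
                prev[j - 1] + (ref_word != hyp_word), # substitution or match
            )
        prev = curr

    # N is the number of words in the reference, as in note 1.
    return prev[-1] / len(ref)
```

With the reference "le chat dort" and the hypothesis "le chien dort", one substitution over three reference words gives a WER of 1/3. Because insertions are counted against the reference length, the value can exceed 1 when the hypothesis is much longer than the reference.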

Acknowledgments

We thank Dr. Abeed Sarker and Dr. Graciela Gonzalez for their helpful comments and remarks.

Author information

Corresponding author

Correspondence to Davy Weissenbacher.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Weissenbacher, D., Raymond, C. (2015). Tree-Structured Named Entities Extraction from Competing Speech Transcriptions. In: Biemann, C., Handschuh, S., Freitas, A., Meziane, F., Métais, E. (eds) Natural Language Processing and Information Systems. NLDB 2015. Lecture Notes in Computer Science, vol 9103. Springer, Cham. https://doi.org/10.1007/978-3-319-19581-0_22

  • DOI: https://doi.org/10.1007/978-3-319-19581-0_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-19580-3

  • Online ISBN: 978-3-319-19581-0

  • eBook Packages: Computer Science, Computer Science (R0)
