Abstract
Speech data is an extremely rich and important source of information. However, we still lack suitable methods for the semantic annotation of speech that has been transcribed by automated speech recognition (ASR) systems. For instance, semantic role labeling (SRL) of ASR data remains an unsolved problem, and the results achieved are significantly lower than on regular text data. SRL for ASR data is difficult because the transcripts lack sentence boundaries and punctuation and contain grammar errors, wrongly transcribed words, and word deletions and insertions. In this paper we propose a novel approach to SRL for ASR data based on the following idea: (1) combine evidence from different segmentations of the ASR data, (2) jointly select a good segmentation, and (3) label it with the semantics of PropBank roles. Experiments with the OntoNotes corpus show improvements over state-of-the-art SRL systems on the ASR data. As an additional contribution, we semi-automatically align the predicates found in the ASR data with the predicates in the gold standard data of OntoNotes; this is a difficult task, but the result can serve as gold standard alignments for future research.
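As a rough illustration of steps (1) and (2), the sketch below scores many candidate segments of an unpunctuated token stream and uses dynamic programming to pick the highest-scoring segmentation. It is not the method of the paper: the scorer `toy_score`, the lexicon `TOY_PREDICATES`, the function `best_segmentation`, and the `max_len` parameter are all hypothetical stand-ins. In a real system the segment score would presumably come from an SRL model, and step (3) would label the chosen segments with PropBank roles.

```python
from typing import Callable, List, Tuple

# Hypothetical toy lexicon; a real system would rely on a trained SRL model.
TOY_PREDICATES = {"bought", "said", "left"}


def toy_score(segment: List[str]) -> float:
    """Placeholder scorer: reward segments holding exactly one known predicate."""
    n_pred = sum(tok in TOY_PREDICATES for tok in segment)
    return 1.0 if n_pred == 1 else -1.0


def best_segmentation(
    tokens: List[str],
    score: Callable[[List[str]], float],
    max_len: int = 8,
) -> List[Tuple[int, int]]:
    """Return (start, end) spans of the segmentation with the highest total score."""
    n = len(tokens)
    best = [float("-inf")] * (n + 1)  # best[i]: best total score for tokens[:i]
    back = [0] * (n + 1)              # back[i]: start index of the last segment
    best[0] = 0.0
    for end in range(1, n + 1):
        # Every span ending at `end` (up to max_len tokens) contributes evidence.
        for start in range(max(0, end - max_len), end):
            cand = best[start] + score(tokens[start:end])
            if cand > best[end]:
                best[end] = cand
                back[end] = start
    # Follow the back-pointers to recover the chosen segment boundaries.
    spans = []
    end = n
    while end > 0:
        spans.append((back[end], end))
        end = back[end]
    return spans[::-1]


# ASR-style input: lowercase, no punctuation, no sentence boundaries.
tokens = "john bought a car mary said hello".split()
spans = best_segmentation(tokens, toy_score)
segments = [tokens[s:e] for s, e in spans]
```

With this toy scorer, any split that puts exactly one predicate in each segment ties for the best score, so the exact boundary is arbitrary. A model-based scorer would be needed to prefer the linguistically correct boundary.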
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Shrestha, N., Vulić, I., Moens, MF. (2015). Semantic Role Labeling of Speech Transcripts. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2015. Lecture Notes in Computer Science, vol 9042. Springer, Cham. https://doi.org/10.1007/978-3-319-18117-2_43
DOI: https://doi.org/10.1007/978-3-319-18117-2_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18116-5
Online ISBN: 978-3-319-18117-2
eBook Packages: Computer Science, Computer Science (R0)