Abstract
In this paper, we compare three distinct approaches to chunking transcribed oral data with labeling tools learned from a corpus of written texts. The goal is to obtain the best possible results with the least possible manual correction or retraining effort.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tellier, I., Dupont, Y., Eshkol, I., Wang, I. (2013). Adapt a Text-Oriented Chunker for Oral Data: How Much Manual Effort Is Necessary? In: Yin, H., et al. Intelligent Data Engineering and Automated Learning – IDEAL 2013. Lecture Notes in Computer Science, vol 8206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41278-3_28
DOI: https://doi.org/10.1007/978-3-642-41278-3_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41277-6
Online ISBN: 978-3-642-41278-3
eBook Packages: Computer Science, Computer Science (R0)