Adapt a Text-Oriented Chunker for Oral Data: How Much Manual Effort Is Necessary?

Tellier, Isabelle; Dupont, Yoann; Eshkol, Iris; Wang, Ilaine

doi:10.1007/978-3-642-41278-3_28


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8206)

Included in the following conference series:

International Conference on Intelligent Data Engineering and Automated Learning


Abstract

In this paper, we try three distinct approaches to chunking transcribed oral data with labeling tools learned from a corpus of written texts. The goal is to obtain the best possible results with the least manual correction or re-learning effort.



Author information

Authors and Affiliations

Lattice, University Paris 3 - Sorbonne Nouvelle, France
Isabelle Tellier, Yoann Dupont & Ilaine Wang
LLL, University of Orléans, France
Iris Eshkol


Editor information

Editors and Affiliations

School of Electrical and Electronic Engineering, University of Manchester, UK
Hujun Yin
University of Science and Technology of China, Hefei, China
Ke Tang
Nanjing University, Nanjing, China
Yang Gao
Ostfalia University of Applied Sciences, 38302, Wolfenbüttel, Germany
Frank Klawonn
Kyungpook National University, 702-701, Buk-Gu, Daegu, Korea
Minho Lee
Nature Inspired Computational and Applications Laboratory, School of Computer Science and Technology, University of Science and Technology of China, 230027, Hefei, China
Thomas Weise
University of Science and Technology of China, 230017, Hefei, China
Bin Li
CERCIA, School of Computer Science, University of Birmingham, B15 2TT, Edgbaston, Birmingham, UK
Xin Yao


About this paper

Cite this paper

Tellier, I., Dupont, Y., Eshkol, I., Wang, I. (2013). Adapt a Text-Oriented Chunker for Oral Data: How Much Manual Effort Is Necessary? In: Yin, H., et al. Intelligent Data Engineering and Automated Learning – IDEAL 2013. IDEAL 2013. Lecture Notes in Computer Science, vol 8206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41278-3_28


DOI: https://doi.org/10.1007/978-3-642-41278-3_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41277-6
Online ISBN: 978-3-642-41278-3
eBook Packages: Computer Science (R0)
