Segmentation of Complex Sentences

Kuboň, Vladislav; Lopatková, Markéta; Plátek, Martin; Pognan, Patrice

doi:10.1007/11846406_19

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4188)

Included in the following conference series:

International Conference on Text, Speech and Dialogue

Abstract

The paper describes a method for dividing complex sentences into segments: easily detectable, linguistically motivated units that can subsequently be combined into clauses, thereby revealing the structure of a complex sentence in terms of the mutual relationships among its clauses. The method was developed for Czech, a representative of languages with a relatively high degree of word-order freedom. The paper introduces the key terms and describes the segmentation chart, a data structure that captures the mutual relationships between individual segments and separators. It also presents a simple set of segmentation rules, which are applied to a small set of Czech sentences. The segmentation results are evaluated against a small hand-annotated corpus of Czech complex sentences.
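The idea of cutting a sentence at separators can be illustrated with a minimal Python sketch. It is not the authors' method: the separator inventory, the function name `segment`, and the flat list it returns are all assumptions made for illustration, and the result is much simpler than the segmentation chart the paper describes. The abstract does not give the paper's own rules or separator list.

```python
# Hypothetical separator inventory: punctuation plus a few Czech
# conjunctions and relative words. This is an assumption, not the
# list used in the paper.
SEPARATORS = {",", ";", ":", "a", "ale", "nebo", "že", "který", "když"}


def segment(tokens):
    """Split a tokenized sentence into alternating segments and separators.

    Returns a list of (kind, tokens) pairs, where kind is "segment" or
    "separator". Adjacent separator tokens (e.g. "," followed by "že")
    are merged into a single separator.
    """
    chart = []
    for tok in tokens:
        kind = "separator" if tok.lower() in SEPARATORS else "segment"
        if chart and chart[-1][0] == kind:
            # Same kind as the previous item: extend it.
            chart[-1][1].append(tok)
        else:
            chart.append((kind, [tok]))
    return chart


if __name__ == "__main__":
    # "He said that he would come, but he did not."
    sentence = "Řekl , že přijde , ale nepřišel .".split()
    for kind, toks in segment(sentence):
        print(kind, toks)
```

A real system would also need morphological information to tell, for example, the conjunction "a" from other uses of the same word form; the sketch ignores such ambiguity.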

This paper is a result of a project supported by grant No. 1ET100300517.

Author information

Authors and Affiliations

ÚFAL MFF UK, Prague
Vladislav Kuboň & Markéta Lopatková
KTIML MFF UK, Prague
Martin Plátek
CERTAL INALCO, Paris
Patrice Pognan

Editor information

Editors and Affiliations

Faculty of Informatics, Masaryk University, Brno, Czech Republic
Petr Sojka
Faculty of Informatics, Masaryk University, Botanická 68a, CZ-602 00, Brno, Czech Republic
Ivan Kopeček
Faculty of Informatics, Department of Computer Graphics and Design, Masaryk University, Botanická 68a, 60200, Brno, Czech Republic
Karel Pala

About this paper

Cite this paper

Kuboň, V., Lopatková, M., Plátek, M., Pognan, P. (2006). Segmentation of Complex Sentences. In: Sojka, P., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2006. Lecture Notes in Computer Science, vol 4188. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11846406_19

DOI: https://doi.org/10.1007/11846406_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-39090-9
Online ISBN: 978-3-540-39091-6
eBook Packages: Computer Science, Computer Science (R0)
