
Segmentation of Complex Sentences

  • Conference paper
Text, Speech and Dialogue (TSD 2006)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4188)

Abstract

The paper describes a method of dividing complex sentences into segments: easily detectable, linguistically motivated units that can subsequently be combined into clauses and thus reveal the structure of a complex sentence with regard to the mutual relationships of its individual clauses. The method has been developed for Czech, as a representative of languages with a relatively high degree of word-order freedom. The paper introduces important terms and describes the segmentation chart, a data structure used to describe the mutual relationships between individual segments and separators. It also presents a simple set of rules applied to the segmentation of a small set of Czech sentences. The segmentation results are evaluated against a small hand-annotated corpus of Czech complex sentences.
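The idea of cutting a sentence into segments at separators can be illustrated with a minimal Python sketch. It is not the authors' rule set: the abstract does not list the separators or the rules, so the `SEPARATORS` inventory (punctuation plus a handful of Czech conjunctions and subordinating words), the function names, and the flat list output are all assumptions made for illustration. In particular, the flat list is far simpler than the segmentation chart the paper describes, and it says nothing about how segments combine into clauses.

```python
import re

# Hypothetical separator inventory (an assumption, not taken from the paper):
# punctuation marks plus a few Czech conjunctions and subordinating words.
SEPARATORS = {",", ";", ":", ".", "!", "?", "a", "ale", "nebo", "že", "který", "když"}


def tokenize(sentence):
    """Split a sentence into words and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def segment(sentence):
    """Return a flat list of ("SEG", tokens) and ("SEP", tokens) items.

    Adjacent separator tokens (e.g. a comma followed by a conjunction)
    are merged into one separator item.
    """
    items = []
    current = []
    for tok in tokenize(sentence):
        if tok.lower() in SEPARATORS:
            if current:
                # Close the segment collected so far.
                items.append(("SEG", current))
                current = []
            if items and items[-1][0] == "SEP":
                items[-1][1].append(tok)
            else:
                items.append(("SEP", [tok]))
        else:
            current.append(tok)
    if current:
        items.append(("SEG", current))
    return items


if __name__ == "__main__":
    for kind, tokens in segment("Petr řekl, že přijde, ale nepřišel."):
        print(kind, " ".join(tokens))
```

For the example sentence ("Petr said that he would come, but he did not"), the sketch yields three segments separated by the compound separators ", že" and ", ale", followed by the final full stop. A purely lexical lookup like this will misclassify ambiguous words, which is one reason a real system would need morphological information and explicit rules.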

This paper is a result of the project supported by the grant No. 1ET100300517.



Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kuboň, V., Lopatková, M., Plátek, M., Pognan, P. (2006). Segmentation of Complex Sentences. In: Sojka, P., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2006. Lecture Notes in Computer Science, vol 4188. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11846406_19

  • DOI: https://doi.org/10.1007/11846406_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-39090-9

  • Online ISBN: 978-3-540-39091-6

  • eBook Packages: Computer Science, Computer Science (R0)
