Resolving Noun Phrase Coreference in Czech

  • Michal Novák
  • Zdeněk Žabokrtský
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7099)


In this work, we present first results on noun phrase coreference resolution on Czech data. As the data resource for our experiments, we employed an as yet unfinished and unpublished extension of the Prague Dependency Treebank 2.0, which captures noun phrase coreference and bridging relations. The incompleteness of the data shaped one of our motivations: to aid annotators with automatic pre-annotation of the data. Although we introduced several novel tree features and tried different machine learning approaches, results on a growing amount of data show that the selected feature set and learning methods are not able to exploit the data sufficiently.


coreference resolution, Czech, ranking, Prague Dependency Treebank





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Michal Novák (1)
  • Zdeněk Žabokrtský (1)
  1. Institute of Formal and Applied Linguistics, Charles University in Prague, Praha 1, Czech Republic
