Cross-lingual implicit discourse relation recognition with co-training

Frontiers of Information Technology & Electronic Engineering

Abstract

A lack of labeled corpora hinders research on implicit discourse relation recognition (DRR) for Chinese, whereas discourse corpora are available in other languages such as English. In this paper, we propose a cross-lingual implicit DRR framework that exploits an available English corpus for the Chinese DRR task. We use machine translation to generate Chinese instances from a labeled English discourse corpus, so that each instance has two independent views: a Chinese view and an English view. We then train two classifiers, one on each view, by co-training, which exploits unlabeled Chinese data to improve implicit DRR for Chinese. Experimental results demonstrate the effectiveness of our method.
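
The co-training loop outlined in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The `Instance` type, the toy `CentroidClassifier`, the `co_train` function with its `rounds` and `per_round` parameters, the two-dimensional feature vectors, and the relation labels are all hypothetical. The sketch also assumes that every unlabeled Chinese instance already has an English view (for example, obtained by machine translation), which the abstract does not spell out.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


@dataclass
class Instance:
    zh: Vector  # features extracted from the Chinese view (hypothetical)
    en: Vector  # features extracted from the English view (hypothetical)


class CentroidClassifier:
    """Toy nearest-centroid classifier over one view; a stand-in, not the paper's model."""

    def fit(self, xs: List[Vector], ys: List[str]) -> "CentroidClassifier":
        sums: Dict[str, List[float]] = {}
        counts: Dict[str, int] = {}
        for x, y in zip(xs, ys):
            acc = sums.setdefault(y, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
            counts[y] = counts.get(y, 0) + 1
        self.centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}
        return self

    def predict(self, x: Vector) -> Tuple[str, float]:
        """Return (label, confidence); confidence is the distance margin
        between the nearest and second-nearest class centroid."""
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(x, c)) ** 0.5, y)
            for y, c in self.centroids.items()
        )
        margin = dists[1][0] - dists[0][0] if len(dists) > 1 else 0.0
        return dists[0][1], margin


def co_train(labeled, labels, unlabeled, rounds=5, per_round=1):
    """Each round, the classifier of each view labels the unlabeled pool and
    moves its most confident predictions into the shared training set."""
    labeled, labels, pool = list(labeled), list(labels), list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        zh_clf = CentroidClassifier().fit([i.zh for i in labeled], labels)
        en_clf = CentroidClassifier().fit([i.en for i in labeled], labels)
        picked: Dict[int, str] = {}  # pool index -> predicted label
        for clf, view in ((zh_clf, "zh"), (en_clf, "en")):
            scored = sorted(
                ((clf.predict(getattr(inst, view)), idx) for idx, inst in enumerate(pool)),
                key=lambda t: t[0][1],
                reverse=True,
            )
            for (label, _), idx in scored[:per_round]:
                picked.setdefault(idx, label)
        # Pop from the highest index down so earlier indices stay valid.
        for idx in sorted(picked, reverse=True):
            labeled.append(pool.pop(idx))
            labels.append(picked[idx])
    zh_clf = CentroidClassifier().fit([i.zh for i in labeled], labels)
    en_clf = CentroidClassifier().fit([i.en for i in labeled], labels)
    return zh_clf, en_clf


# Toy data: two seed instances with both views, plus an unlabeled pool.
seed = [Instance((0.0, 0.0), (0.0, 1.0)), Instance((1.0, 1.0), (1.0, 0.0))]
seed_labels = ["Comparison", "Contingency"]  # illustrative relation labels
pool = [
    Instance((0.1, 0.2), (0.1, 0.9)),
    Instance((0.9, 0.8), (0.8, 0.1)),
    Instance((0.2, 0.1), (0.2, 0.8)),
    Instance((0.8, 0.9), (0.9, 0.2)),
]
zh_clf, en_clf = co_train(seed, seed_labels, pool)
print(zh_clf.predict((0.15, 0.1)))
```

The design point this illustrates is that the two classifiers never see each other's features. They interact only through the pseudo-labeled instances that each one adds to the shared training set, which is why the two views need to be reasonably independent.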

Author information

Corresponding author

Correspondence to Jin-song Su.

Additional information

Project supported by the National Natural Science Foundation of China (No. 61672440), the Natural Science Foundation of Fujian Province, China (No. 2016J05161), the Research Fund of the State Key Laboratory for Novel Software Technology in Nanjing University, China (No. KFKT2015B11), the Scientific Research Project of the National Language Committee of China (No. YB135-49), and the Fundamental Research Funds for the Central Universities, China (No. ZK1024).

About this article

Cite this article

Lu, Yj., Xu, M., Wu, Cx. et al. Cross-lingual implicit discourse relation recognition with co-training. Frontiers Inf Technol Electronic Eng 19, 651–661 (2018). https://doi.org/10.1631/FITEE.1601865

  • DOI: https://doi.org/10.1631/FITEE.1601865
