Automatic Mining of Discourse Connectives for Russian

  • Conference paper
Artificial Intelligence and Natural Language (AINL 2018)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 930)

Abstract

The identification of discourse connectives plays an important role in many discourse processing approaches. Connectives include both function words that are usually enumerated in grammars (iz-za ‘due to’, blagodarya ‘thanks to’) and non-grammaticalized expressions (X vedet k Y ‘X leads to Y’, prichina etogo ‘the cause is’). Both types signal certain relations between discourse units. However, there are no ready-made lists of connectives of the second type. We propose a method for expanding a seed list of connectives with candidate non-grammaticalized connectives for Russian, based on their vector representations. First, we compile a list of patterns for this type of connective. The patterns rest on two heuristics: the connectives are often used with anaphoric expressions that substitute for discourse units (so some patterns include special anaphoric elements), and they occur more frequently at the beginning of a sentence or after a comma. Second, we build multi-word tokens based on these patterns. Third, we build vector representations for the multi-word tokens that match the patterns. Our experiments based on distributional semantics yield a reasonable list of candidate connectives.
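
The three steps can be illustrated with a minimal sketch. It is not the authors' implementation: the anaphor list, the single boundary pattern, the toy transliterated sentences and the seed connective poetomu ‘therefore’ are all assumptions made for illustration, and plain co-occurrence counts stand in for whatever vector model the paper actually uses.

```python
import math
from collections import Counter

# Hypothetical anaphoric elements (transliterated); the paper's pattern list is not reproduced here.
ANAPHORS = {"etogo", "eto", "etomu"}

def tokenize(sentence):
    """Lowercase and split on whitespace, keeping commas as separate tokens."""
    return sentence.lower().replace(",", " , ").split()

def merge_candidates(tokens, max_len=3):
    """Merge spans that start a sentence or follow a comma and end in an anaphor."""
    out, i = [], 0
    while i < len(tokens):
        at_boundary = i == 0 or tokens[i - 1] == ","
        merged = False
        if at_boundary:
            for n in range(max_len, 1, -1):  # try the longest span first
                span = tokens[i:i + n]
                if len(span) == n and span[-1] in ANAPHORS and "," not in span:
                    out.append("_".join(span))  # one multi-word token
                    i += n
                    merged = True
                    break
        if not merged:
            out.append(tokens[i])
            i += 1
    return out

def context_vectors(corpus, window=2):
    """Count-based distributional vectors: neighbours within a fixed window."""
    vectors = {}
    for tokens in corpus:
        for i, tok in enumerate(tokens):
            ctx = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            vectors.setdefault(tok, Counter()).update(ctx)
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def expand(seed, vectors, top_n=3):
    """Rank multi-word tokens by similarity to a seed connective."""
    cands = [t for t in vectors if "_" in t and t != seed]
    return sorted(cands, key=lambda t: cosine(vectors[seed], vectors[t]), reverse=True)[:top_n]

# Invented toy sentences, for illustration only.
sentences = [
    "Poshel dozhd, poetomu match otmenili",
    "Poshel sneg, v rezultate etogo match otmenili",
    "Bylo kholodno, prichina etogo veter",
]
corpus = [merge_candidates(tokenize(s)) for s in sentences]
vectors = context_vectors(corpus)
ranked = expand("poetomu", vectors)
print(ranked)
```

On this toy input the candidate that shares the most context with the seed is ranked first. A real experiment would need a large corpus and a trained embedding model, and the ranked list would still require manual inspection.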

The study was funded by RFBR according to the research project 17-29-07033.

Author information

Corresponding author

Correspondence to Dina Pisarevskaya.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Toldova, S., Kobozeva, M., Pisarevskaya, D. (2018). Automatic Mining of Discourse Connectives for Russian. In: Ustalov, D., Filchenkov, A., Pivovarova, L., Žižka, J. (eds) Artificial Intelligence and Natural Language. AINL 2018. Communications in Computer and Information Science, vol 930. Springer, Cham. https://doi.org/10.1007/978-3-030-01204-5_8

  • DOI: https://doi.org/10.1007/978-3-030-01204-5_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-01203-8

  • Online ISBN: 978-3-030-01204-5

  • eBook Packages: Computer Science, Computer Science (R0)
