Automatic Mining of Discourse Connectives for Russian

Toldova, Svetlana; Kobozeva, Maria; Pisarevskaya, Dina

doi:10.1007/978-3-030-01204-5_8

Part of the book series: Communications in Computer and Information Science (CCIS, volume 930)

Included in the following conference series:

Conference on Artificial Intelligence and Natural Language

Abstract

The identification of discourse connectives plays an important role in many approaches to discourse processing. Connectives include function words that are usually enumerated in grammars (iz-za ‘due to’, blagodarya ‘thanks to’) as well as non-grammaticalized expressions (X vedet k Y ‘X leads to Y’, prichina etogo ‘the cause is’). Both types of connectives signal certain relations between discourse units. However, there are no ready-made lists of connectives of the second type. We propose a method for Russian that expands a seed list of connectives with candidates for non-grammaticalized connectives, based on their vector representations. First, we compile a list of patterns for this type of connective. The patterns rest on two heuristics: connectives are often used with anaphoric expressions that substitute for discourse units (thus, some patterns include special anaphoric elements), and connectives occur more frequently at the beginning of a sentence or after a comma. Second, we build multi-word tokens based on these patterns. Third, we build vector representations for the multi-word tokens that match the patterns. Our experiments based on distributional semantics yield a quite reasonable list of candidate connectives.
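The three steps in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The anaphor list, the single pattern (a word followed by an anaphoric element at the start of a sentence or after a comma), the transliterated toy corpus, and all function names are hypothetical. Simple co-occurrence counts stand in for whatever vector model the paper actually uses.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical anaphoric elements (transliterated); the paper's pattern list is not reproduced here.
ANAPHORS = {"etogo", "etomu", "eto"}

# Hypothetical toy corpus, invented for illustration only.
TOY_CORPUS = [
    "poshel dozhd, poetomu doroga mokraya",
    "poshel dozhd, vsledstvie etogo doroga mokraya",
    "my poeli, posle etogo my usnuli",
]


def merge_candidates(sentence):
    """Steps 1-2: join 'word + anaphor' at a clause start into one multi-word token."""
    tokens = re.findall(r"\w+|,", sentence.lower())
    merged = []
    clause_start = True  # true at the sentence beginning and right after a comma
    i = 0
    while i < len(tokens):
        if tokens[i] == ",":
            clause_start = True
            i += 1
            continue
        if clause_start and i + 1 < len(tokens) and tokens[i + 1] in ANAPHORS:
            merged.append(tokens[i] + "_" + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
        clause_start = False
    return merged


def context_vectors(corpus, window=2):
    """Step 3: count-based context vectors (an assumed stand-in for learned embeddings)."""
    vectors = defaultdict(Counter)
    for sentence in corpus:
        toks = merge_candidates(sentence)
        for i, tok in enumerate(toks):
            lo, hi = max(0, i - window), min(len(toks), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[tok][toks[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(vectors, seed):
    """Rank merged multi-word tokens by similarity to a seed connective."""
    candidates = [t for t in vectors if "_" in t and t != seed]
    return sorted(candidates, key=lambda t: cosine(vectors[t], vectors[seed]), reverse=True)


vectors = context_vectors(TOY_CORPUS)
print(rank_candidates(vectors, "poetomu"))
```

On this toy data, the candidate that shares its contexts with the seed connective poetomu ‘therefore’ is ranked above the one that does not. A realistic setup would need a large corpus, morphological analysis, and a trained embedding model.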

The study was funded by RFBR according to the research project 17-29-07033.

Author information

Authors and Affiliations

National Research University “Higher School of Economics”, Moscow, Russia
Svetlana Toldova
Institute for Systems Analysis FRC CSC RAS, Moscow, Russia
Maria Kobozeva & Dina Pisarevskaya

Corresponding author

Correspondence to Dina Pisarevskaya.

Editor information

Editors and Affiliations

Data and Web Science Group, University of Mannheim, Mannheim, Baden-Württemberg, Germany
Dmitry Ustalov
ITMO University, St. Petersburg, Russia
Andrey Filchenkov
University of Helsinki, Helsinki, Finland
Lidia Pivovarova
Mendel University, Brno, Czech Republic
Jan Žižka

About this paper

Cite this paper

Toldova, S., Kobozeva, M., Pisarevskaya, D. (2018). Automatic Mining of Discourse Connectives for Russian. In: Ustalov, D., Filchenkov, A., Pivovarova, L., Žižka, J. (eds) Artificial Intelligence and Natural Language. AINL 2018. Communications in Computer and Information Science, vol 930. Springer, Cham. https://doi.org/10.1007/978-3-030-01204-5_8

DOI: https://doi.org/10.1007/978-3-030-01204-5_8
Published: 27 September 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01203-8
Online ISBN: 978-3-030-01204-5
eBook Packages: Computer Science, Computer Science (R0)
