Abstract
The identification of discourse connectives plays an important role in many discourse processing approaches. Connectives include both function words that are usually enumerated in grammars (iz-za ‘due to’, blagodarya ‘thanks to’) and non-grammaticalized expressions (X vedet k Y ‘X leads to Y’, prichina etogo ‘the cause is’). Both types signal certain relations between discourse units. However, there are no ready-made lists of connectives of the second type. We propose a method for Russian that expands a seed list of connectives, based on their vector representations, with candidates for non-grammaticalized connectives. First, we compile a list of patterns for this type of connective. The patterns rest on two heuristics: such connectives are often used with anaphoric expressions that stand in for discourse units (so some patterns include special anaphoric elements), and they occur more frequently at the beginning of a sentence or after a comma. Second, we build multi-word tokens based on these patterns. Third, we build vector representations for the multi-word tokens that match the patterns. Our experiments based on distributional semantics yield a reasonable list of candidate connectives.
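The three steps described in the abstract can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration: the transliterated toy sentences, the single regular-expression pattern, the anaphor list, and the function names are hypothetical, and plain co-occurrence counts stand in for whatever vector model the authors actually trained, which the abstract does not name.

```python
import math
import re
from collections import Counter, defaultdict

# Toy transliterated corpus (hypothetical sentences, not the paper's data).
CORPUS = [
    "dozhd shel ves den , iz-za etogo match otmenili",
    "sneg shel ves den , prichina etogo tsiklon",
    "dozhd shel ves vecher , iz-za etogo kontsert otmenili",
    "sneg shel ves vecher , prichina etogo tsiklon",
    "vmesto etogo my kupili bilety",
]

# Hypothetical pattern: one word followed by an anaphoric element,
# located at the sentence beginning or directly after a comma.
ANAPHORS = ("etogo", "etomu")
PATTERN = re.compile(r"(?:^|(?<=, ))(\S+) (%s)\b" % "|".join(ANAPHORS))


def merge_multiword(sentence):
    """Step 2: join each pattern match into one multi-word token."""
    return PATTERN.sub(lambda m: m.group(1) + "_" + m.group(2), sentence)


def context_vectors(sentences, window=2):
    """Step 3 (stand-in): count co-occurrences within a symmetric window."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, token in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[token][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def expand_seeds(sentences, seeds, top_n=5):
    """Rank multi-word tokens by their best similarity to any seed."""
    merged = [merge_multiword(s) for s in sentences]
    vectors = context_vectors(merged)
    known_seeds = [s for s in seeds if s in vectors]
    scored = []
    for token in vectors:
        if "_" in token and token not in seeds:
            best = max(
                (cosine(vectors[token], vectors[s]) for s in known_seeds),
                default=0.0,
            )
            scored.append((token, best))
    return sorted(scored, key=lambda pair: -pair[1])[:top_n]


if __name__ == "__main__":
    for candidate, score in expand_seeds(CORPUS, {"iz-za_etogo"}):
        print(f"{candidate}\t{score:.3f}")
```

On this toy corpus the candidate that shares contexts with the seed (prichina_etogo) is ranked above the one that does not (vmesto_etogo). A real setting would need a large corpus, a fuller pattern inventory, and dense embeddings in place of raw counts.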
The study was funded by RFBR under research project 17-29-07033.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Toldova, S., Kobozeva, M., Pisarevskaya, D. (2018). Automatic Mining of Discourse Connectives for Russian. In: Ustalov, D., Filchenkov, A., Pivovarova, L., Žižka, J. (eds) Artificial Intelligence and Natural Language. AINL 2018. Communications in Computer and Information Science, vol 930. Springer, Cham. https://doi.org/10.1007/978-3-030-01204-5_8
DOI: https://doi.org/10.1007/978-3-030-01204-5_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01203-8
Online ISBN: 978-3-030-01204-5
eBook Packages: Computer Science, Computer Science (R0)