Intellectualization of Knowledge Acquisition of Academic Texts as an Answer to Challenges of Modern Information Society

  • Conference paper
  • In: Electronic Governance and Open Society: Challenges in Eurasia (EGOSE 2018)

Abstract

Extracting knowledge from an ever-growing flow of information is one of the main challenges of the modern information society. This paper considers the possibilities and means of intellectualizing this process for one important information source: academic texts. Here the user must find fragments relevant to the subject of interest within lengthy documents that are often written in a foreign language. We experimentally compared the effectiveness of TS algorithms on extended coherent academic texts and substantiated a procedure for the instrumental evaluation of their effectiveness. We considered the influence of the most significant characteristics of the text: the original language, the structural organization (levels of headings), and the subject area (engineering, information technology, and medicine). We show that intellectualizing knowledge acquisition from academic texts requires presenting the reader with the combined results of TS performed by different algorithms. We propose a system for the combined visualization of TS results and have developed a corresponding software solution. For extended coherent texts, the visualization system explicitly shows the semantic structure of the text, so that the user can locate and analyze only the fragments that match their current information needs, rather than the whole text, and still gain a complete picture of the subject of interest.
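
The idea of presenting the results of several TS algorithms together can be illustrated with a minimal sketch. It is not the authors' visualization system. It assumes that each TS algorithm returns a list of boundary positions between text units (for example, sentences or paragraphs), and the algorithm names, the function names, and the sample data are all hypothetical.

```python
from collections import Counter


def boundary_agreement(results, n_units):
    """For each gap between adjacent text units, count how many
    algorithms place a segment boundary there.

    results: dict mapping a (hypothetical) algorithm name to the list of
             boundary positions it produced; position i means "a new
             segment starts before unit i", with 1 <= i < n_units.
    """
    votes = Counter()
    for name, boundaries in results.items():
        for b in set(boundaries):  # ignore duplicates within one result
            if not 1 <= b < n_units:
                raise ValueError(f"{name}: boundary {b} is out of range")
            votes[b] += 1
    # One entry per possible boundary position 1 .. n_units - 1.
    return [votes[i] for i in range(1, n_units)]


def consensus_boundaries(agreement, min_votes):
    """Positions where at least min_votes algorithms agree."""
    return [i for i, v in enumerate(agreement, start=1) if v >= min_votes]


# Illustrative input only: three made-up results for a 10-unit text.
results = {"algo_a": [3, 7], "algo_b": [3, 8], "algo_c": [3, 7]}
agreement = boundary_agreement(results, n_units=10)
print(agreement)
print(consensus_boundaries(agreement, min_votes=2))
```

A per-position agreement count of this kind is one simple way to show the reader where different algorithms agree on a boundary and where they diverge, which a visualization could then render as, say, boundary markers of varying weight.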

Acknowledgement

This work was financially supported by the Government of the Russian Federation, Grant 08-08.

Author information

Corresponding author

Correspondence to Artem Lobantsev.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Vatian, A. et al. (2019). Intellectualization of Knowledge Acquisition of Academic Texts as an Answer to Challenges of Modern Information Society. In: Chugunov, A., Misnikov, Y., Roshchin, E., Trutnev, D. (eds) Electronic Governance and Open Society: Challenges in Eurasia. EGOSE 2018. Communications in Computer and Information Science, vol 947. Springer, Cham. https://doi.org/10.1007/978-3-030-13283-5_11

  • DOI: https://doi.org/10.1007/978-3-030-13283-5_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-13282-8

  • Online ISBN: 978-3-030-13283-5

  • eBook Packages: Computer Science, Computer Science (R0)
