Multi-class Text Complexity Evaluation via Deep Neural Networks

Cuzzocrea, Alfredo; Bosco, Giosué Lo; Pilato, Giovanni; Schicchi, Daniele

doi:10.1007/978-3-030-33617-2_32

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11872)

Included in the following conference series:

International Conference on Intelligent Data Engineering and Automated Learning

Abstract

Automatic Text Complexity Evaluation (ATE) is a natural language processing task that aims to assess the difficulty of a text, taking into account the many facets related to complexity. A large number of papers tackle ATE with machine learning algorithms that classify texts as either complex or simple. In this paper, we try to go beyond the methodologies presented so far by introducing a preliminary system based on a deep neural network model whose objective is to classify sentences into more than two classes. Experiments were carried out on a manually annotated corpus that was preprocessed to make it suitable for the scope of the paper. The results show that a finer-grained classification makes the ATE problem much harder to solve and expose the weaknesses of the model in accomplishing the task correctly.
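
The difference between the binary and the multi-class setting can be illustrated with a minimal Python sketch. It is not the model described in the paper: the abstract does not specify the architecture, the number of classes, or the corpus labels, so the four complexity levels, the raw class scores, and the helper names (`softmax`, `predict`, `collapse`) are all hypothetical. The sketch only shows how a score vector is turned into one of several classes, and how the same predictions can look better once the fine-grained levels are collapsed into simple versus complex.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(scores):
    """Return the index of the most probable class."""
    probs = softmax(scores)
    return max(range(len(probs)), key=probs.__getitem__)


def accuracy(gold, pred):
    """Fraction of positions where prediction and gold label agree."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def collapse(labels, threshold):
    """Map fine-grained levels to a binary label: 0 = simple, 1 = complex."""
    return [int(label >= threshold) for label in labels]


# Hypothetical toy data: 4 complexity levels (0 = simplest, 3 = hardest).
# The scores stand in for the output layer of some classifier.
gold = [0, 1, 2, 3, 1, 2]
scores = [
    [2.0, 1.0, 0.1, 0.0],
    [1.5, 1.0, 0.2, 0.0],
    [0.0, 0.3, 1.0, 2.0],
    [0.0, 0.1, 0.5, 3.0],
    [0.2, 2.0, 0.5, 0.1],
    [0.1, 1.2, 1.0, 0.3],
]

pred = [predict(s) for s in scores]
multi_acc = accuracy(gold, pred)
binary_acc = accuracy(collapse(gold, 2), collapse(pred, 2))

print("predicted levels:", pred)
print("4-class accuracy:", round(multi_acc, 3))
print("binary accuracy: ", round(binary_acc, 3))
```

On this toy data, several predictions land on a neighbouring level: they count as errors in the four-class setting but fall on the correct side of the simple/complex split, which is one intuition for why a finer-grained classification is harder.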

Author information

Authors and Affiliations

University of Calabria, Rende, Italy
Alfredo Cuzzocrea
University of Palermo, Palermo, Italy
Giosué Lo Bosco & Daniele Schicchi
ICAR-CNR, Palermo, Italy
Alfredo Cuzzocrea & Giovanni Pilato

Corresponding author

Correspondence to Alfredo Cuzzocrea.

Editor information

Editors and Affiliations

University of Manchester, Manchester, UK
Hujun Yin
Technical University of Madrid, Madrid, Spain
David Camacho
University of Birmingham, Birmingham, UK
Peter Tino
University of Huelva, Huelva, Spain
Antonio J. Tallón-Ballesteros
University of Exeter, Exeter, UK
Ronaldo Menezes
University of Manchester, Manchester, UK
Richard Allmendinger

About this paper

Cite this paper

Cuzzocrea, A., Bosco, G.L., Pilato, G., Schicchi, D. (2019). Multi-class Text Complexity Evaluation via Deep Neural Networks. In: Yin, H., Camacho, D., Tino, P., Tallón-Ballesteros, A., Menezes, R., Allmendinger, R. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2019. Lecture Notes in Computer Science, vol 11872. Springer, Cham. https://doi.org/10.1007/978-3-030-33617-2_32

DOI: https://doi.org/10.1007/978-3-030-33617-2_32
Published: 18 October 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33616-5
Online ISBN: 978-3-030-33617-2
eBook Packages: Computer Science, Computer Science (R0)
