Multi-class Text Complexity Evaluation via Deep Neural Networks

  • Alfredo Cuzzocrea (email author)
  • Giosué Lo Bosco
  • Giovanni Pilato
  • Daniele Schicchi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11872)

Abstract

Automatic Text Complexity Evaluation (ATE) is a natural language processing task that aims to assess the difficulty of a text, taking into account the many facets related to complexity. Many papers tackle ATE with machine learning algorithms that classify texts as either complex or simple. In this paper, we try to go beyond the methodologies presented so far by introducing a preliminary system based on a deep neural network model whose objective is to classify sentences into more than two classes. Experiments have been carried out on a manually annotated corpus that was preprocessed to make it suitable for the scope of the paper. The results show that a finer-grained classification makes the ATE problem much harder to solve and expose the weaknesses of the model in accomplishing the task correctly.

Keywords

Automatic Text Complexity Evaluation; Deep neural network; Text simplification

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Alfredo Cuzzocrea (1, 3), email author
  • Giosué Lo Bosco (2)
  • Giovanni Pilato (3)
  • Daniele Schicchi (2)
  1. University of Calabria, Rende, Italy
  2. University of Palermo, Palermo, Italy
  3. ICAR-CNR, Palermo, Italy
