Abstract
The chapter introduces ReaderBench, a multilingual and flexible environment that integrates text mining technologies for assessing a wide range of learners’ productions and for supporting teachers in several ways. ReaderBench offers three main text analysis functionalities: cohesion-based assessment, identification of reading strategies, and evaluation of textual complexity. All three have been empirically validated. ReaderBench may be used throughout an entire educational scenario: from the initial complexity assessment of the reading materials and the assignment of texts to learners, through the detection of reading strategies reflected in learners’ self-explanations, to comprehension evaluation that fosters the learner’s self-regulation process.
Abbreviations
- AA: Adjacent agreement
- CAF: Complexity, accuracy and fluency
- CSCL: Computer supported collaborative learning
- DRP: Degree of reading power
- EA: Exact agreement
- FFL: French as foreign language
- ICC: Intra-class correlations
- LDA: Latent Dirichlet allocation
- LMS: Learning management system
- LSA: Latent semantic analysis
- NLP: Natural language processing
- POS: Part of speech
- SVM: Support vector machine
- TASA: Touchstone Applied Science Associates, Inc
- Tf-Idf: Term frequency – inverse document frequency
- WOLF: WordNet Libre du Français
References
Agrawal, R., Batra, M.: A detailed study on text mining techniques. Int. J. Soft Comput. Eng. 2(6), 118–121 (2013)
Trausan-Matu, S., Dascalu, M., Dessus, P.: Textual complexity and discourse structure in computer-supported collaborative learning. In: Cerri, S.A., Clancey, W.J., Papadourakis, G., Panourgia, K. (eds.) ITS 2012. LNCS, vol. 7315, pp. 352–357. Springer, Heidelberg (2012)
Manning, C.D., Schütze, H.: Foundations of Statistical Natural Language Processing. MIT Press, Cambridge (1999)
Landauer, T.K., Dumais, S.T.: A solution to Plato’s problem: the latent semantic analysis theory of acquisition, induction and representation of knowledge. Psychol. Rev. 104(2), 211–240 (1997)
Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3(4–5), 993–1022 (2003)
Koedinger, K.R., Baker, R.S., Cunningham, K., Skogsholm, A., Leber, B., Stamper, J.: A data repository for the EDM community: the PSLC datashop. In: Romero, C., Ventura, S., Pechenizkiy, M., Baker, R.S. (eds.) Handbook of Educational Data Mining, pp. 43–55. CRC Press, Boca Raton (2010). (Chapman & Hall/CRC Data Mining and Knowledge Discovery Series)
Zou, M., Xu, Y., Nesbit, J.C., Winne, P.H.: Sequential pattern analysis of learning logs: methodology and applications. In: Romero, C., Ventura, S., Pechenizkiy, M., Baker, R.S. (eds.) Handbook of Educational Data Mining, pp. 107–121. CRC Press, Boca Raton (2010). (Chapman & Hall/CRC Data Mining and Knowledge Discovery Series)
Sheard, J.: Basics of statistical analysis of interactions data from web-based learning environments. In: Romero, C., Ventura, S., Pechenizkiy, M., Baker, R.S. (eds.) Handbook of Educational Data Mining, pp. 27–42. CRC Press, Boca Raton (2010). (Chapman & Hall/CRC Data Mining and Knowledge Discovery Series)
Tapiero, I.: Situation Models and Levels of Coherence. Lawrence Erlbaum Associates Inc, Mahwah (2007)
Schnotz, W.: Comparative instructional text organization. In: Mandl, H., Stein, N.L., Trabasso, T. (eds.) Learning and Comprehension of Text, pp. 53–81. Lawrence Erlbaum Associates Inc, Hillsdale (1984)
McNamara, D., Kintsch, E., Songer, N.B., Kintsch, W.: Are good texts always better? Interactions of text coherence, background knowledge, and levels of understanding in learning from text. Cogn. Instr. 14(1), 1–43 (1996)
Oakhill, J., Garnham, A.: On theories of belief bias in syllogistic reasoning. Cognition 46(1), 87–92 (1993)
O’Reilly, T., McNamara, D.S.: Reversing the reverse cohesion effect: good texts can be better for strategic, high-knowledge readers. Discourse Process. 43(2), 121–152 (2007)
Cain, K., Oakhill, J.: Reading comprehension development from 8 to 14 years: the contribution of component skills and processes. In: Wagner, R.K., Schatschneider, C., Phythian-Sence, C. (eds.) Beyond Decoding: the Behavioral and Biological Foundations of Reading Comprehension, pp. 143–175. Guilford Press, New York (2009)
Kintsch, W.: Comprehension: a Paradigm for Cognition. Cambridge University Press, Cambridge (1998)
McNamara, D.S., O’Reilly, T.: Theories of comprehension skill: knowledge and strategies versus capacity and suppression. In: Colombus, A.M. (ed.) Progress in Experimental Psychology Research, pp. 113–136. Nova Science Publishers, Hauppauge (2009)
Winne, P.H., Baker, R.S.: The potentials of educational data mining for researching metacognition, motivation and self-regulated learning. J. Educ. Data Mining 5(1), 1–8 (2013)
Eason, S.H., Goldberg, L., Cutting, L.: Reader-text interactions: how differential text and question types influence cognitive skills needed for reading comprehension. J. Educ. Psychol. 104(3), 515–528 (2012)
McNamara, D.S., Graesser, A.C., Louwerse, M.M.: Sources of text difficulty: across the ages and genres. In: Sabatini, J.P., Albro, E. (eds.) Assessing Reading in the 21st Century: Aligning and Applying Advances in the Reading and Measurement Sciences, Rowman & Littlefield Publishing, Lanham (in press)
Nelson, J., Perfetti, C., Liben, D., Liben, M.: Measures of text difficulty. Technical Report, Gates Foundation (2011)
McNamara, D.S., Louwerse, M.M., McCarthy, P.M., Graesser, A.C.: Coh-metrix: capturing linguistic features of cohesion. Discourse Process. 47(4), 292–330 (2010)
Millis, K., Magliano, J.: Assessing comprehension processes during reading. In: Sabatini, J.P., O’Reilly, T., Albro, E.R. (eds.) Reaching an Understanding, pp. 35–54. Rowman & Littlefield Publishing, Lanham (2012)
McNamara, D.S., Magliano, J.P.: Self-explanation and metacognition. In: Hacker, D.J., Dunlosky, J., Graesser, A.C. (eds.) Handbook of Metacognition in Education, pp. 60–81. Erlbaum, Mahwah (2009)
Millis, K., Magliano, J.: Assessing comprehension processes during reading. In: Sabatini, J.P., O’Reilly, T., Albro, E.R. (eds.) Reaching an Understanding, pp. 35–54. Rowman & Littlefield Publishing, Lanham (2012)
McNamara, D.S.: SERT: self-explanation reading training. Discourse Process. 38, 1–30 (2004)
Nardy, A., Bianco, M., Toffa, F., Rémond, M., Dessus, P.: Contrôle et Régulation de la Compréhension: L’acquisition de Stratégies de 8 à 11 ans. In: David, J., Royer, C. (eds.) L’apprentissage de la Lecture: Convergences, Innovations, Perspectives. Peter Lang, Bern (2003) (in press)
Hayes, A.F.: Introduction to Mediation, Moderation, and Conditional Process Analysis: A Regression-Based Approach. The Guilford Press, New York (2013)
Budanitsky, A., Hirst, G.: Evaluating WordNet-based measures of lexical semantic relatedness. Comput. Linguist. 32(1), 13–47 (2006)
Alias-i: LingPipe, http://alias-i.com/lingpipe
McCandless, M., Hatcher, E., Gospodnetic, O.: Lucene in Action (2nd ed.): Covers Apache Lucene 3.0. Manning Publications, Greenwich (2010)
Toutanova, K., Klein, D., Manning, C.D., Singer, Y.: Feature-rich part-of-speech tagging with a cyclic dependency network. In: Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, pp. 173–180. Association for Computational Linguistics, Stroudsburg (2003)
Toutanova, K., Manning, C.D.: Enriching the knowledge sources used in a maximum entropy part-of-speech tagger. In: Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora, pp. 63–70. Association for Computational Linguistics, Stroudsburg (2000)
Klein, D., Manning, C.D.: Accurate unlexicalized parsing. In: 41st Annual Meeting of the Association for Computational Linguistics, pp. 423–430. Association for Computational Linguistics, Stroudsburg (2003)
Green, S., de Marneffe, M., Bauer, J., Manning, C.D.: Multiword expression identification with tree substitution grammars: a parsing tour de force with French. In: Conference on Empirical Methods in Natural Language Processing EMNLP 2011, pp. 725–735. Association for Computational Linguistics, Stroudsburg (2011)
Snowball, http://snowball.tartarus.org/
Centre National de Ressources Textuelles et Lexicales. le Lexique Morphalou, http://www.cnrtl.fr/lexiques/morphalou/LMF-Morphalou.php
Finkel, J.R., Grenager, T., Manning, C.D.: Incorporating non-local information into information extraction systems by Gibbs sampling. In: 43rd Annual Meeting on Association for Computational Linguistics, pp. 363–370. Association for Computational Linguistics, Stroudsburg (2005)
Lee, H., Chang, A., Peirsman, Y., Chambers, N., Surdeanu, M., Jurafsky, D.: Deterministic coreference resolution based on entity-centric, precision-ranked rules. Comput. Linguist. 39(4), 1–32 (2013)
Raghunathan, K., Lee, H., Rangarajan, S., Chambers, N., Surdeanu, M., Jurafsky, D., Manning, C.D.: A multi-pass sieve for coreference resolution. In: Conference on Empirical Methods in Natural Language Processing, pp. 492–501. Association for Computational Linguistics, Stroudsburg (2010)
Miller, G.A.: WordNet: a lexical database for English. Commun. ACM 38(11), 39–41 (1995)
Sagot, B., Fišer, D.: Building a free French WordNet from multilingual resources. In: 6th International Conference on Language Resources and Evaluation, Ontolex 2008 Workshop, pp. 14–19. ELRA, Marrakech (2008)
Wu, Z., Palmer, M.: Verb semantics and lexical selection. In: 32nd Annual Meeting on Association for Computational Linguistics, pp. 133–138. Association for Computational Linguistics, Stroudsburg (1994)
Leacock, C., Chodorow, M.: Combining local context and WordNet similarity for wordsense identification. In: Fellbaum, C. (ed.) WordNet: An Electronic Lexical Database, pp. 265–283. MIT Press, Cambridge (1998)
Denhière, G., Lemaire, B., Bellissens, C., Jhean-Larose, S.: A semantic space for modeling children’s semantic memory. In: Landauer, T.K., McNamara, D.S., Dennis, S., Kintsch, W. (eds.) Handbook of Latent Semantic Analysis, pp. 143–165. Psychology Press, New York (2007)
Dascalu, M., Trausan-Matu, S., Dessus, P.: Utterances assessment in chat conversations. Res. Comput. Sci. 46, 323–334 (2010)
Lemaire, B.: Limites de la Lemmatisation pour L’extraction de Significations. In: 9es Journées Internationales d’Analyse Statistique des Données Textuelles, pp. 725–732. Presses Universitaires de Lyon, Lyon (2008)
Wiemer-Hastings, P., Zipitria, I.: Rules for syntax, vectors for semantics. In: 23rd Annual Conference of the Cognitive Science Society. Lawrence Erlbaum Associates Inc, Mahwah (2001)
Low, Y., Bickson, D., Gonzalez, J., Guestrin, C., Kyrola, A., Hellerstein, J.M.: Distributed GraphLab: a framework for machine learning and data mining in the cloud. VLDB Endowment 5(8), 716–727 (2012)
Mallet: A machine learning for language toolkit, http://mallet.cs.umass.edu/
Low, Y., Gonzalez, J., Kyrola, A., Bickson, D., Guestrin, C., Hellerstein, J.M.: GraphLab: a new parallel framework for machine learning. In: Grünwald, P., Spirtes, P. (eds.) 26th Conference on Uncertainty in Artificial Intelligence, pp. 340–349. AUAI Press, Catalina Island (2010)
Dascalu, M., Trausan-Matu, S., Dessus, P.: Cohesion-based analysis of CSCL conversations: holistic and individual perspectives. In: 10th International Conference on Computer-Supported Collaborative Learning, vol. 1, pp. 145–152. University of Wisconsin-Madison, Madison (2013)
Trausan-Matu, S., Stahl, G., Sarmiento, J.: Supporting polyphonic collaborative learning. E-service J. 6(1), 58–74 (2007). (Indiana University Press)
Rebedea, T., Dascalu, M., Trausan-Matu, S., Chiru, C.G.: Automatic feedback and support for students and tutors using CSCL chat conversations. In: 1st International K-Teams Workshop on Semantic and Collaborative Technologies for the Web, pp. 20–33. Politehnica Press, Bucharest (2011)
Trausan-Matu, S., Rebedea, T.: A polyphonic model and system for inter-animation analysis in chat conversations with multiple participants. In: Gelbukh, A. (ed.) 11th International Conference Computational Linguistics and Intelligent Text Processing. LNCS, vol. 6008, pp. 354–363. Springer, Heidelberg (2010)
Dascalu, M., Dessus, P., Trausan-Matu, S., Bianco, M., Nardy, A.: ReaderBench: an environment for analyzing text complexity and reading strategies. In: Lane, H.C., Yacef, K., Mostow, J., Pavlik, P. (eds.) 16th International Conference on Artificial Intelligence in Education. LNCS, vol. 7926, pp. 379–388. Springer, Heidelberg (2013)
Topic Sentences and Signposting. Harvard University, Writing Center, http://www.fas.harvard.edu/~wricntr/documents/TopicSentences.html
Manning, C.D., Raghavan, P., Schütze, H.: Introduction to Information Retrieval. Cambridge University Press, Cambridge (2008)
Galley, M., McKeown, K.: Improving word sense disambiguation in lexical chaining. In: 18th International Joint Conference on Artificial Intelligence, pp. 1486–1488. Morgan Kaufmann Publishers, San Francisco (2003)
Vidal, N.: Miguel de la Faim. Amitié-G.T. Rageot, Paris (1984)
Bastian, M., Heymann, S., Jacomy, M.: Gephi: An open source software for exploring and manipulating networks. In: 3rd International Conference on Weblogs and Social Media, pp. 361–362. AAAI Press, Menlo Park (2009)
Brandes, U.: A faster algorithm for betweenness centrality. J. Math. Sociol. 25(2), 163–177 (2001)
Williams, M.: Wittgenstein, Mind and Meaning: Towards a Social Conception of Mind. Routledge, New York (2002)
Mihalcea, R., Tarau, P.: TextRank: bringing order into texts. In: Conference on Empirical Methods in Natural Language Processing, pp. 404–411. Association for Computational Linguistics, Stroudsburg (2004)
McNamara, D.S., O’Reilly, T.P., Rowe, M., Boonthum, C., Levinstein, I.B.: iSTART: a web-based tutor that teaches self-explanation and metacognitive reading strategies. In: McNamara, D.S. (ed.) Reading Comprehension Strategies: Theories, Interventions, and Technologies, pp. 397–420. Lawrence Erlbaum Associates Inc, Mahwah (2007)
Dahl, R.: Matilda. Gallimard, Paris (2007)
Dascalu, M., Trausan-Matu, S., Dessus, P.: Towards an integrated approach for evaluating textual complexity for learning purposes. In: Popescu, E., Li, Q., Klamma, R., Leung, H., Specht, M. (eds.) 11th International Conference in Advances in Web-Based Learning. LNCS, vol. 7558, pp. 268–278. Springer, Heidelberg (2012)
Cortes, C., Vapnik, V.N.: Support-vector networks. Mach. Learn. 20(3), 273–297 (1995)
François, T., Miltsakaki, E.: Do NLP and machine learning improve traditional readability formulas? In: 1st Workshop on Predicting and Improving Text Readability for Target Reader Populations, pp. 49–57. Association for Computational Linguistics, Stroudsburg (2012)
Petersen, S.E., Ostendorf, M.: A machine learning approach to reading level assessment. Comput. Speech Lang. 23, 89–106 (2009)
van Dijk, T.A., Kintsch, W.: Strategies of Discourse Comprehension. Academic Press, New York (1983)
Feng, L., Jansche, M., Huenerfauth, M., Elhadad, N.: A comparison of features for automatic readability assessment. In: 23rd International Conference on Computational Linguistics, pp. 276–284. Association for Computational Linguistics, Stroudsburg (2010)
Lee, H., Peirsman, Y., Chang, A., Chambers, N., Surdeanu, M., Jurafsky, D.: Stanford’s multi-pass sieve coreference resolution system at the CoNLL-2011 Shared task. In: 15th Conference on Computational Natural Language Learning: Shared Task, pp. 28–34. Association for Computational Linguistics, Stroudsburg (2011)
Pfeffer, P.: Les Pharmacies des Éléphants. Vie et Mort d’un Géant: L’éléphant d’Afrique, Flammarion, Paris (1989)
Mandin, S.: Modèles Cognitifs Computationnels de L’activité de Résumer: Expérimentation d’un Eiah auprès D’élèves de Lycée. Laboratoire des Sciences de l’Éducation. PhD thesis. Université Grenoble (2009)
Donaway, R.L., Drummey, K.W., Mather, L.A.: A comparison of rankings produced by summarization evaluation measures. In: Workshop on Automatic Summarization, vol. 4, pp. 69–78. Association for Computational Linguistics, Stroudsburg (2000)
Graesser, A.C., Singer, M., Trabasso, T.: Constructing inferences during narrative text comprehension. Psychol. Rev. 101(3), 371–395 (1994)
Geisser, S.: Predictive Inference: An Introduction. Chapman and Hall, New York (1993)
Schulze, M.: Measuring textual complexity in student writing. In: American Association of Applied Linguistics. AAAL 2010, Atlanta (2010)
McNamara, D.S., Boonthum, C., Levinstein, I.B.: Evaluating self-explanations in iSTART: comparing word-based and LSA algorithms. In: Landauer, T.K., McNamara, D.S., Dennis, S., Kintsch, W. (eds.) Handbook of Latent Semantic Analysis, pp. 227–241. Psychology Press, New York (2007)
Graesser, A.C., McNamara, D.S., VanLehn, K.: Scaffolding deep comprehension strategies through point & query, AutoTutor, and iSTART. Educ. Psychol. 40(4), 225–234 (2005)
Nardy, A., Bianco, M., Toffa, F., Rémond, M., Dessus, P.: Contrôle et Régulation de la Compréhension: L’acquisition de Stratégies de 8 à 11 ans. In: David, J., Royer, C. (eds.) L’apprentissage de la Lecture: Convergences, Innovations, Perspectives. Peter Lang, Bern (2003) (in press)
O’Reilly, T.P., Sinclair, G.P., McNamara, D.S.: iSTART: a web-based reading strategy intervention that improves students’ science comprehension. In: Kinshuk, K., Sampson, D.G., Isaías, P. (eds.) IADIS International Conference Cognition and Exploratory Learning in Digital Age: CELDA 2004, pp. 173–180. IADIS Press, Lisbon (2004)
Graesser, A.C., McNamara, D.S., Louwerse, M.M., Cai, Z.: Coh-metrix: analysis of text on cohesion and language. Behav. Res. Meth. Instrum. Comput. 36(2), 193–202 (2004)
François, T.: Les Apports du Traitement Automatique du Langage à la Lisibilité du Français Langue Étrangère. Centre de Traitement Automatique du Langage, PhD thesis. Université Catholique de Louvain, Faculté de Philosophie, Arts et Lettres, Louvain-la-Neuve (2012)
TreeTagger—A Language Independent Part of Speech Tagger, http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/
Kukemelk, H., Mikk, J.: The prognosticating effectivity of learning a text in physics. Quant. Linguist. 14, 82–103 (1993)
Bouhineau, D., Luengo, V., Mandran, N., Toussaint, B.M., Ortega, M., Wajeman, C.: Open platform to model and capture experimental data in technology enhanced learning systems. In: Workshop on Data Analysis and Interpretation for Learning Environments, Vienna University of Economics and Business, Vienna (2013)
Acknowledgments
This research was supported by an Agence Nationale de la Recherche (ANR-10-BLAN-1907) grant, by the 264207 ERRIC–Empowering Romanian Research on Intelligent Information Technologies/FP7-REGPOT-2010-1 and the POSDRU/107/1.5/S/76909 Harnessing human capital in research through doctoral scholarships (ValueDoc) projects. We also wish to thank Sonia Mandin, who kindly provided experimental data used for the validation of sentence importance. Some parts of this paper stem from [55].
Copyright information
© 2014 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Dascalu, M., Dessus, P., Bianco, M., Trausan-Matu, S., Nardy, A. (2014). Mining Texts, Learner Productions and Strategies with ReaderBench. In: Peña-Ayala, A. (eds) Educational Data Mining. Studies in Computational Intelligence, vol 524. Springer, Cham. https://doi.org/10.1007/978-3-319-02738-8_13
DOI: https://doi.org/10.1007/978-3-319-02738-8_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-02737-1
Online ISBN: 978-3-319-02738-8
eBook Packages: Engineering, Engineering (R0)