We present a technique that examines the handwritten equations in a student's solution to an engineering problem and, from these, estimates the correctness of the work. More specifically, we demonstrate that lexical properties of the equations correlate with the grade a human grader would assign. We characterize these properties with a set of features that include the counts of various classes of symbols and of pairs and triples (bigrams and trigrams) of those classes. Support vector machine (SVM) regression models trained with these features achieved a correlation of r = .433 (p < .001) on a combined set of six exam problems. Prior work suggests that the number of long pauses in the writing as a student solves a problem correlates with correctness. We found that combining this pause feature with our lexical features produced more accurate predictions than using either type of feature alone. SVM regression models trained using an optimized subset of three lexical features and the pause feature achieved an average correlation with grade across the six problems of r = .503 (p < .001). These techniques are an important step toward creating systems that can automatically assess handwritten coursework.
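The lexical features described above can be sketched as symbol-class n-gram counts. The sketch below is a minimal illustration, not the authors' implementation: the symbol-class labels ("var", "op", "num") and the dictionary-based feature encoding are assumptions for the example, since the abstract specifies only that features count symbol classes and bigrams/trigrams of them.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count contiguous n-grams in a sequence of symbol-class labels."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def lexical_features(symbol_classes):
    """Build a feature dict for one solution: per-class counts plus
    bigram and trigram counts, mirroring the unigram/bigram/trigram
    lexical features described in the abstract."""
    feats = {}
    for n in (1, 2, 3):
        for gram, count in ngram_counts(symbol_classes, n).items():
            feats["|".join(gram)] = count
    return feats


# Hypothetical symbol-class sequence for one handwritten equation,
# e.g. "x + 3 = y - 2" reduced to class labels.
seq = ["var", "op", "num", "op", "var", "op", "num"]
feats = lexical_features(seq)
```

In a full pipeline, such feature vectors (plus a long-pause count per solution) would be fed to an SVM regression model to predict the assigned grade.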
This material is based upon work supported by the National Science Foundation under Award Numbers 0935239, 1432820, and 1612511. Livescribe, Inc. provided some materials used in the project. We thank Daniel Jeske for his suggestions about some of the statistical analysis.
Stahovich, T.F., Lin, H. & Gyllen, J. Using Lexical Properties of Handwritten Equations to Estimate the Correctness of Students’ Solutions to Engineering Problems. Int J Artif Intell Educ 29, 459–483 (2019). https://doi.org/10.1007/s40593-019-00181-3
Keywords
- Educational data mining
- Digital ink
- Problem solving
- Handwritten equations