Introduction

We look back at research reported in our 1995 paper entitled “Supporting the use of external representations in problem solving: The need for flexible learning environments” published in Volume 6 of IJAIED. Nineteen volumes of IJAIED later, we revisit the issues that motivated the work and reflect upon it in the light of subsequent developments in the field.

The Genesis of the Work

The work combined two of our interests—the emerging AI in education and intelligent tutoring systems community (AIED/ITS, e.g. Self 1988; Wenger 1987) and the cognitive effects of externalising information representations (Stenning and Oberlander 1995; Larkin and Simon 1987; Amarel 1968; Norman 1993; Day 1988; Levesque 1988; Green 1989). Combining the two fields eventually resulted in the building of a system (switchER2) capable of providing limited support to learners as they constructed external representations during ‘analytical reasoning’.

At the time several highly influential authors had raised the issue of external representations (ERs) in technology enhanced learning (as we now call it). These included Goldstein (1978), de Corte (1990), Friedler, Nachmias and Linn (1990) and Dillenbourg and Mendelsohn (1992). However, there had been few if any attempts to synthesise what was known at the time about the various roles that external representations had (or could have) in learning. Little, if any, of this work had resulted in AIED systems that focussed on helping learners to select, construct and use ERs, let alone learn to do these things better. There had been very little focus indeed on switching ERs as a reasoning strategy.

The first author’s background at the time was as a research assistant on a project studying the implications of a theory of representational systems (Stenning and Oberlander 1995). The theory characterised graphical and linguistic representational systems in terms of their expressivity and suggested that the cognitive effectiveness of graphical representations such as diagrams derived—at least in part—from their specificity or reduced ability to express abstraction and indeterminacy compared to linguistic representations. Specificity theory’s implications were tested in an evaluation of Hyperproof (Barwise and Etchemendy 1994)—an interactive desktop application for teaching proof methods in first-order logic (Stenning and Oberlander 1995). In the course of the Hyperproof evaluation a corpus of student responses to Graduate Record Exam (GRE) analytical reasoning problems was assembled. GRE response accuracy scores were used in the Hyperproof evaluation. However, we were struck by the variety and extent of the respondents’ paper-and-pencil ‘workscratching’ annotations to their GRE answers. A detailed analysis of the workscratchings and their relationship to response accuracy therefore formed the first part of the Cox and Brna (1995) paper.

The paper-and-pencil GRE workscratching analysis was a fascinating exercise: it revealed a wide range of individual differences between respondents in terms of the types of external representations they drew as they reasoned. Some students used textual or tabular representations whereas others used diagrams. Some respondents used more than one representation (perhaps switching between them), and respondents who appeared to have made bad representational choices, or who made mistakes in the execution of their ER, didn’t necessarily answer incorrectly. The analyses showed wide individual differences in ER behaviour and this became another phenomenon that we wished to study. The concept of cognitive styles (e.g. ‘visualiser-verbaliser’, Riding and Douglas 1993) was influential at the time. However, from our perspective scores on cognitive style instruments (e.g. visualiser-verbaliser questionnaires) were not informative about the kinds of diagrams students were familiar with. We decided to try a different approach to assessing our participants’ representational knowledge. We therefore developed an ER card-sort task based on a knowledge elicitation method commonly used by expert system developers. Subjects sorted a wide and varied range of ERs into categories that made conceptual sense to them. We found that better diagrammatic reasoners tended to sort graphs, charts, notations and diagrams into fewer and more cohesive categories and to name them more accurately than poorer diagrammatic reasoners.

A problem with workscratchings was that they represented only the final state of an external representation. By ‘final’ we mean that they show only the end-state of the representation creation process; they do not provide information about the time course of representation building. Students have a kind of dialogue with themselves through their external representations via the construction process and it seemed crucial to find a way of dynamically connecting ER construction events with other aspects of reasoning such as problem comprehension, representational choice and patterns of response. Process data were needed. Capturing students’ interactions with a computer-based system was an obvious way to collect them. We decided to use a computer-based environment (switchER1) to collect data and to use the results to inform the design of a subsequent intelligent version (switchER2). We termed this process ‘iterative learner-centred design’ (Brna and Cox 1998). Together with researchers such as Gilmore (1996) and Soloway et al. (1994), we became persuaded that the interaction design needs of learners differed substantially from those of ‘users’ in the HCI sense. Ease-of-use seemed to be the goal of HCI, whereas ensuring good learning outcomes was our goal. The switchER1 system was then developed; it presented GRE problems and provided a range of ER building tools for tables, text, and diagrams.

The emerging knowledge on the role of external representations in problem solving led to the belief that learners need to improve their ability to select appropriate ERs, and to improve their ER construction and ‘reading’ skills. We also wished to explore the potential of representation switching as an effective reasoning strategy. If this was to happen then we saw AIED systems as potentially having several roles; one key role was to help the learner move easily from one ER to a different one (switching ERs). We also wished to apply AI techniques to keep several forms of ER in synchrony as the student works on just one of them; to guide the student towards whichever of several kinds of ER would most likely be of use; and to help the student use an ER effectively. As an important step, empirical results were needed to determine some of the factors that helped learners to select, construct and make use of an ER.

The Key Drivers

Specificity theory (Stenning and Oberlander 1995) and the seminal paper by Larkin and Simon (1987) were major drivers. These theoretical ideas provided a basis for assigning representations to information, and they motivated the notion of empowering the learner by supporting external representation construction and facilitating judicious representation switching during problem solving. We felt that the approach was thorough and methodologically innovative and it is hard to see what could have been done better at the time.

Another driver was our strong motivation to work at the intersection of AIED systems and reasoning with external representations (particularly diagrams). A further driver was methodological—to study the process of ER production in detail by capturing student/system interaction data in real-time. We therefore adopted a technology-enhanced research approach in which the system that provided users with the interactive learning environment also served as an instrument for the collection of detailed learning process data. A video screen capture utility was used in the switchER1 study to record students’ diagram drawing behaviour. We also employed retrospective debriefing (Taylor and Dionne 2000). Screen capture videos were replayed to participants after the session and they were encouraged to verbally describe their actions and decisions. On reflection, this retrospective protocol recording procedure may well have had an additional learning benefit for our participants—perhaps akin to the ‘self-explanation’ effect (Chi and Bassok 1989).

Core Contributions

The work shed light on the course of selection, construction and use of multiple ERs. It also provided a basis for constructive discussions about future possibilities for supporting the learner using intelligent learning environments.

Perhaps a key contribution of the paper was the realisation that support for learners who reason with external representations needs to be flexible enough to accommodate wide individual differences in representational knowledge. Individuals were observed to differ in the extent to which they externalised their cognition and in their ER production skills.

Choosing, producing and using ERs is a dynamic process—a self-dialogue involving multiple processes: information translation, (re)representation, and feedback. We saw opportunities for supporting such self-dialogue processes at the level of attention-drawing, hinting and suggesting. Reasoning support interventions by switchER2 included providing feedback to the learner on errors of omission (e.g. leaving part of a problem unrepresented) and on errors of commission (e.g. mistakes in their diagram).

Practical Impact

The work described in the 1995 paper impacted subsequent work as follows:

Research with switchER1 was used to inform the design of switchER2, which was capable of detecting errors in representation building and giving feedback. It also provided a range of subtle support interventions as described above. To some extent, switchER2 informed the “Conception” classroom concept mapping tool (Conlon 2006), which is still in use in some UK schools.

The work also influenced the development of a system to support good information visualisation choices—the External Representation Selection Tutor (ERST—Grawemeyer 2006) and an interactive system designed to help young children overcome “graph-as-picture” misconceptions (Garcia Garcia and Cox 2010; Garcia Garcia 2015).

The ER corpus used in the switchER card-sort studies was subsequently used to study ‘graphical literacy’ as a predictor of software debugging skill (Romero et al. 2003) and to explore the extent to which visual cognitive models of object picture processing are useful for understanding how people process non-pictorial external representations such as diagrams (Cox 2014).

Progress Since 1995

Educational Systems

Numerous excellent educational systems have focussed on the use of ERs and visualisations in learning contexts. There has also been significant work on multiple external representations (MERs), for example the COPPERS system (Ainsworth et al. 1998), and on the role played by self-generated ERs in animation comprehension (Mason et al. 2013).

The educational field of geometry is an area in which several pedagogically well-designed interactive graphical systems can be found. In the subfield of plane geometry there are Geometer’s Sketchpad and Cabri Geometry. In the 3D geometry domain van Labeke (1998) developed the Calques 3D environment, which contains geometry domain-specific features carefully designed to help students overcome barriers to 3D visualisation when observing, constructing and exploring geometry diagrams. Systems designed for educational use such as Calques 3D can be clearly differentiated from domain-general 3D graphical design environments such as AutoCAD.

A few systems detect, analyse and respond to a learner’s representational choices or representation construction. One of them is Tarski’s World, an interactive application for students of logic (Barker-Plummer et al. 2008). Using Tarski’s World students can build graphical three-dimensional worlds and write linguistic descriptions of those worlds in the language of first-order logic. The system is capable of intelligently evaluating the truth and validity of the sentential descriptions in the graphical world and can provide feedback (in the form of a game) to lead the student to align the two representations in cases where they evaluate as incorrect.
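To make concrete what evaluating a sentence against a graphical world involves, the following minimal Python sketch checks a conjunction of atomic sentences, such as Larger(b,a) & Larger(b,e), against a small set of blocks. It is an illustration only and not a description of how Tarski’s World is implemented; the names Block, larger, is_cube and check, and the numeric size coding, are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """A hypothetical world object; sizes are coded 1 (small) to 3 (large)."""
    name: str
    shape: str
    size: int


def larger(world, x, y):
    """True if block x is strictly larger than block y."""
    return world[x].size > world[y].size


def is_cube(world, x):
    """True if block x is a cube."""
    return world[x].shape == "cube"


# Assumed predicate table: maps predicate names to evaluation functions.
PREDICATES = {"Larger": larger, "Cube": is_cube}


def check(world, sentence):
    """Evaluate a conjunction of (predicate, arguments) atoms in the world."""
    return all(PREDICATES[name](world, *args) for name, args in sentence)


world = {
    "a": Block("a", "cube", 1),
    "b": Block("b", "tetrahedron", 3),
    "e": Block("e", "cube", 2),
}

# Larger(b,a) & Larger(b,e)
sentence = [("Larger", ("b", "a")), ("Larger", ("b", "e"))]
print(check(world, sentence))
```

A sentence that evaluates as false in the world marks a mismatch between the linguistic and graphical representations, which is the point at which feedback to the student would be triggered.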

Another example is provided by the work of Noss et al. (2012). They describe MiGen, an exploratory learning system for 11–14 year olds built in the constructivist tradition that supports algebraic generalisation. A component of the system (eXpressor) provides a representation construction environment in which students can construct patterns during their reasoning about algebra word problems. One type of problem involves abstracting a general rule for the number of tiles needed to build footpaths around rectangular fishponds of any size. The student’s activity in the eXpressor environment is monitored by another system component (eGeneraliser) that automatically infers several types of ‘interaction indicator’ from the student’s ER construction activity. Information based on the detected indicators is sent to the teacher and updates a visualisation within a teachers’ student tracking tool.

A third example of an adaptive educational system is the external representation selection tutor (ERST) mentioned above (Grawemeyer 2006). ERST incorporates a Bayesian learner model for individualising the information visualisation options that it presents to users. ERST was shown to be effective in encouraging users to make good information visualisation decisions across a range of reasoning tasks.
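The general idea of a Bayesian learner model that individualises representation options can be sketched as follows. This is a generic Beta-Bernoulli example written for illustration; it is not ERST’s actual model, which is not described here, and the class name LearnerModel, its methods and the example representation names are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class LearnerModel:
    """Hypothetical per-learner model: one Beta(a, b) belief per representation."""
    counts: dict = field(default_factory=dict)

    def observe(self, representation, correct):
        """Update the belief after the learner answers using a representation."""
        a, b = self.counts.get(representation, (1, 1))  # Beta(1, 1) prior
        self.counts[representation] = (a + 1, b) if correct else (a, b + 1)

    def expected_success(self, representation):
        """Posterior mean probability of success with this representation."""
        a, b = self.counts.get(representation, (1, 1))
        return a / (a + b)

    def rank(self, options):
        """Order the offered representations, most promising first."""
        return sorted(options, key=self.expected_success, reverse=True)


model = LearnerModel()
model.observe("table", True)
model.observe("table", True)
model.observe("pie chart", False)
print(model.rank(["pie chart", "table", "bar chart"]))
```

In this sketch a representation the learner has not yet tried keeps the neutral prior, so it ranks between representations with good and poor track records.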

SimSketch (Bollen and van Joolingen 2013) represents a different approach to computationally processing learners’ external representations. SimSketch supports computational modelling by allowing students to create informal sketches of multiple agents (e.g. planets, cars, predators) and to then apply behaviors to them in order to build dynamic simulations. The environment does not directly parse the students’ representation but provides tools for turning them into runnable animations.

The large-scale project CogSketch (Forbus et al. 2011) also has a sketch-based educational component. CogSketch provides an environment for users to draw objects (glyphs), to specify relations between them, to organise sets of glyphs into higher-order sets and to label them using a large-scale concept database. CogSketch has been used in geology education to provide immediate feedback to students by comparing their sketch annotations to those of their instructors. Feedback from CogSketch is based on the use of structure-mapping methods to compute a kind of ‘edit distance’ between the student’s representation and the instructor’s.
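The notion of an ‘edit distance’ between two sketches can be illustrated with a deliberately simplified Python example. It assumes that glyph labels in the two sketches are already aligned and that each sketch is reduced to a set of relational facts; real structure-mapping has to discover the alignment itself, so this is not CogSketch’s algorithm. The function name sketch_differences and the geology facts are hypothetical.

```python
def sketch_differences(student, instructor):
    """Compare two sketches given as sets of (relation, glyph, glyph) facts.

    Assumes glyph labels are already aligned between the two sketches.
    """
    missing = instructor - student  # facts only in the instructor's sketch
    extra = student - instructor    # facts only in the student's sketch
    return {
        "distance": len(missing) + len(extra),
        "missing": missing,
        "extra": extra,
    }


instructor = {
    ("above", "sandstone", "shale"),
    ("above", "shale", "granite"),
    ("cuts", "fault", "shale"),
}
student = {
    ("above", "sandstone", "shale"),
    ("above", "granite", "shale"),
}

result = sketch_differences(student, instructor)
print(result["distance"])
```

The missing and extra facts, rather than the bare count, are what would be turned into feedback for the student.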

The lines of research that stem from our work reported in 1995 have arguably been influenced by wider changes in the community. At the structural level, the rise and rise of the web led to much work (of many kinds) being reimplemented in order to deliver systems over the web in many different ways (browsers, LMSs, MOOCs). This, together with the limited functionality of early web systems, certainly caused difficulties for those interested in researching external representations and incorporating methods for detecting learner activity. On the whole these technical problems seem to have been overcome, though there is certainly (and rightly) a growing need to ensure that ethical considerations are taken into account in the management of data collected from learners.

In particular, improvements in both web software and bandwidth have resulted in the growth of educational data mining. This interest was actually present in some early systems (especially those concerned with learner modelling or with statistical analysis of learner behaviour), but a lack of sufficient data often made it hard to do much work with what there was. The growth in the capabilities of web-based systems and interest in the notion of “Big Data” have enabled faster and more extensive growth. At the same time there have been significant improvements in the design and deployment of sensors that can be used to track learner behaviour. These developments should lead to improvements in data gathering and analytics for systems such as switchER2 and have potential for deeper research into the kinds of use of external representations that we were interested in back in the 1990s.

Theoretical Progress

We select three developments in visual cognition research since 1995 that we believe have significant implications for the design of educational systems for supporting reasoning with ERs.

Significant theoretical progress in the field has been made by Novick and her colleagues (e.g. Novick and Hurley 2001) who have elucidated the ‘applicability conditions’ under which specific properties of matrix, network or hierarchical representations can be optimally matched to information characteristics. This research is important as it provides a principled basis for the design of ER selection support systems. Novick’s work complements the earlier theoretical work on specificity (Stenning and Oberlander 1995) and information equivalence (Larkin and Simon 1987).

A second theoretical development is research on the ‘visual impedance hypothesis’ (e.g. Knauff and Johnson-Laird 2002) which suggests that not all graphical information is created equal when it comes to reasoning. Visual images seem less useful than spatial images in reasoning and can even interfere when visual detail is irrelevant. Educational data mining has been used to study this phenomenon in an educational context. Barker-Plummer et al. (2011) analysed a large corpus of logic students’ submissions to an online grading system. The exercises required students to translate English sentences such as “B is larger than both A and E” into first order logic (Larger(b,a) & Larger(b,e)). Sentences could refer to visual characteristics of objects such as size (e.g. small, large) or shape (e.g. cube, tetrahedron) or to spatial relationships (e.g. adjoins, between). Fewest errors were made on sentences containing only shape, size or space information (shape < size < space where “ < ” = ‘easier than’). Combinations of spatial information with size or with shape were associated with higher error rates. Discovering more about how the visual and spatial information components of external representations differentially affect reasoning is an important future research direction and one with significant implications for the design of educational diagrams. This is a pressing issue because in a survey of diagrams in the UK National Curriculum, Garcia Garcia and Cox (2008) found that the practice of ‘decorating’ diagrams with pictorial elements is frequently seen in those educational guidelines. Garcia Garcia & Cox caution that the practice may engender ‘graph-as-picture’ misconceptions in students.

A third area is embodied cognition, an area of cognitive science research that has emerged since 1995. Theorists such as Andy Clark (2008) discuss the potential for external ‘encodings’ to become “…so deeply integrated into online strategies of reasoning and recall as to be only artificially distinguished from proper parts of the cognitive engine itself” (p. 77). Antle (2013) discusses the implications of embodiment for child-computer interaction and Dor Abrahamson has designed and studied embodied educational systems (e.g. Abrahamson and Lindgren 2014). One such system, the Mathematical Imagery Trainer for Proportion (MIT-P), uses computer displays and position sensors to detect the location of a student’s hands. The student moves his or her hands up and down so as to keep the screen green rather than red. Through the system’s visual feedback the student gains qualitative insights, through enaction, into concepts of mathematical proportion by moving their hands in continuous space. Abrahamson terms this pedagogical principle “dynamical conservation”: the learner discovers an action pattern that reflects a mathematical law.

Garcia Garcia (2015) developed an interactive graphing system for younger students. Using touchscreen interactions students can race a car around a virtual race track with a speed/distance graph of the car’s student-driven behavior plotted in real time alongside the racetrack display. An alternative mode of interaction is also available: the graph and track are dynalinked such that the student can draw a graph and watch the car’s racetrack behavior implied by it. The results of two studies (using a battery of graph interpretation assessments) showed that graph interactions (with racetrack behavior visualizations) produced greater improvements in students’ graph interpretation skills than the converse (racing cars and observing automatic graph construction). The findings demonstrate that the educational benefits of embodied interactivity are not necessarily bi-directional: actions on one representational form rather than another can optimise learning outcomes.

The process of producing external representations by drawing diagrams on a computer or on paper entails embodied actions akin to those central to MIT-P. In our view an important agenda for future research should address what kinds of actions should be encouraged or facilitated in students who are producing ERs. The actions and movements of the body need to be seen not as a means to an end (a diagram) but as a crucial and educationally significant component of externalised cognition.

The Future

On the basis of trends identified in the previous section we identify the following areas as ones in which significant future developments will occur:

  • The further development of evidence and theory-based principles for instructing students on how to best match representational systems to forms of information and the embodiment of the principles in interactive learning environments.

  • Developments in our understanding of how the visual and spatial elements of diagrams contribute to visual cognition and use of the findings to inform the design of data visualisations for communication and education.

  • ‘Big data’ approaches to student-system interaction data accumulated by MOOCs and other large-scale e-learning systems. The application of educational data mining & learning analytics to students’ interactions with diagrams will be a key methodology for advancing our knowledge in the two areas above.

  • The exploitation of new forms of touch and haptic interactive screen technologies for supporting educationally powerful forms of interaction and enaction. Well-designed interactive graphical representations on touch screen displays have enormous potential for improving students’ graphical literacy and for preventing students from developing misconceptions such as ‘graph-as-picture’. Advances in 3D movement and position sensing technologies (e.g. Microsoft Kinect) also offer exciting scope for developing enactive systems supporting embodied cognition in the style of Abrahamson and Lindgren’s MIT-P system.

Conclusion

During the early 90s the concept of ‘viewpoint’ was prevalent. There were many, somewhat contradictory, definitions of viewpoint but it can be seen as a way of looking at a situation which is (more or less) consistent. A shift occurred in AIED during the 90s from an early rigid definition of viewpoints, which held that students must learn to adopt optimal perspectives, to one in which individual differences between learners’ viewpoints were taken seriously. There was a growth of interest in ILEs that could provide learners with multiple viewpoints (Self 1992; Moyse 1992; Laurillard 1992; Nichols 1993). Our work on ERs sought to move away from simply using pre-fabricated ERs to a position in which students might be able to select and construct their own representations. In our view there is room for taking this shift further.

Systems that intelligently parse and respond to input graphics or support ER switching remain relatively rare. Van Labeke and Ainsworth (2002) used the DEMIST system as a tool to investigate, amongst other matters, requests by learners to translate from one representation to another. Rau et al. (2013) address the issue of switching in some detail—perhaps the most determined effort to develop an AIED approach since switchER2, but state:

Unfortunately, we still know little about how best to implement multiple graphical representations in instructional materials, let alone how best to take advantage of the specific opportunities intelligent tutoring systems offer to enhance students’ learning with multiple graphical representations. (p.3)

In terms of developing the necessary understanding, there is a fair amount of work but there is (yet again) a need for a synthesis of what is currently understood about how to take advantage of multiple representations. Certainly Ainsworth (2006) has provided a well-structured and influential framework (DeFT) for learning with multiple external representations and a useful analysis of the problems associated with supporting learners when they switch representations. Mayer (2009) has also published multimedia design guidelines. De Vries (2006) has addressed how learning with multiple representations takes place within design based learning. Schnotz and Bannert (2003) among others have looked at some problems with learning with multiple external representations.

Rau et al. (2013) report that students require prompting to connect information depicted across multiple representations—they do not do so spontaneously. The issue of students’ non-spontaneity in ER use has also received considerable research attention over recent years from Uesaka & Manalo (e.g. Uesaka and Manalo 2014). Those authors have observed that students see diagrams as devices that their instructors might use as teaching aids but not necessarily as tools that they themselves might use in problem solving. The results of these studies are commensurate with our own (1995) observations of wide individual differences between students in their propensity to self-construct ERs. Further study of ‘hesitancy’ on the part of students is highly warranted and further highlights the need for flexibility and sensitivity to individual differences.

As part of their work for the McDonnell Foundation, McKendree et al. (2002) studied the use of external representations across a school curriculum and found many problems which might well be addressed by teachers providing an “emphasis on the skills of selection of, construction with, and reasoning about commonly occurring forms of representation.” The notion of teaching graphical literacy skills explicitly to students has also been proposed by Fry (1981) and Cox (1999). The idea is also supported by Schwonke et al. (2009). Another approach is to take a more situated line, as in switchER2, which encourages thought about representations in the context of current problems. This suggests that students would have to confront significant cognitive loads as well as breaks in the flow of their problem solving: challenging, but potentially very worthwhile. Another recent initiative is the University of Colorado’s beSocratic system. Students learn to interpret Cartesian plots of data (e.g. scatterplots) and are prompted to reflect on representational issues via questions such as “What does ‘best fit’ mean to you?”.

To conclude, the work in the original paper contributed by highlighting the need for AIED systems to take account of external representations. At the same time, it contributed to the growth of a specialised community of researchers from many disciplines interested in multiple external representations. Research over the last two decades has led to a situation in which there are numerous empirical studies but very few significant attempts to realise the specific insights in AIED systems.

This situation is not so surprising: the commitment to build an AIED system requires far more resources than small (or even quite large) empirical studies. And, usually, the motivation for funding such endeavours results from perceived needs at a national level. Yet these kinds of pressures tend to generate demands for systems targeted at topics that respect the ways in which disciplines are prioritised by funders, politicians, educational systems and so on. The study of external representations cuts across such interests (and curricula), and the development of systems like switchER2 is seen as subsidiary to the main educational/societal goals. And yet, the skills targeted by our work in 1995 are ones that are potentially transformative for 21st Century students.