
1 Introduction and Background

In this study, we characterise an e-Examination (e-Exam or e-exam) as a “timed, supervised, summative assessment conducted using each candidate’s own computer running a standardised operating system” [1]. We would add that the use of authentic software applications as part of the examination environment is an important element of our approach. As such, we distinguish our approach to computerised examinations from ‘online assessments’ that are limited to the test or quiz functionality of a learning content management system (Moodle, Blackboard) or specialised testing software (TCExam, QuestionMark Perception, ExamSoft) that may or may not be directly supervised by human invigilators.

This study is part of a wider project [2, 3, 4] investigating authentic approaches to supervised, high-stakes assessment of the kind typically carried out in examination halls and classrooms, suited to the Australian higher education context. In this respect, our paper represents a departure from other trials we have conducted in that we focus here on e-exam use in a pre-tertiary pathway college context.

In this paper, we explore the literature related to e-exams, including matters of student choice and acceptance of the e-exam approach. Data were collected on the students’ impressions of the process as expressed through written comments and selected-response items in pre- and post-assessment surveys.

2 Literature

Computerised examinations have gained increasing attention over the last decade. Whilst one of the first reported uses of computers for assessment was in 1965 [5], little movement away from pen-on-paper examinations has occurred in higher education or school systems around the world. Only recently has attention shifted to modernising the examination room. Examples of efforts underway in higher education and other sectors have been reviewed [1, 6]. The ‘Dublin Declaration’, developed at the International Federation for Information Processing (IFIP) Technical Committee 3 conference on education, sets a future direction for computers in assessment [7] (pp. xvii–xviii):

“To see computers used effectively in education, it is necessary to develop fair, reliable and resilient computer-based assessment methods. Assessment methods must go far beyond imitating paper-based assessment, and prioritise the pedagogical affordances of computers over administrative convenience. The use of computers in timed, supervised assessments offers the chance to transform curricula in the light of computational thinking”.

In particular, the Declaration recommends that e-Exams must be:

“authentic assessment that matches modern workplace practices and many student learning experiences” ([7], p. xviii).

We particularly find resonance with the idea of promoting authentic assessment [8] in high-stakes examinations. One way to enable such assessment is to provide a rich array of software ‘tools of the trade’ to candidates in the examination room. Doing so opens up the possibility of designing complex constructed assessment tasks to be completed under supervised conditions. This enables assessment designers to push into the Modification and Redefinition stages of the Substitution, Augmentation, Modification and Redefinition (SAMR) model [9, 10] or to target the ‘higher order thinking’ levels of Bloom’s taxonomy [11] with respect to pedagogical efficacy. Systems such as the ‘Secure Exam Environment’ and the work described in [4] place authentic assessment at the heart of the project. The e-Exam platform [4] used for this study takes a bring-your-own laptop approach and provides each candidate with the same full operating system and application suite, including an office suite, multimedia tools and optional discipline-specific applications (e.g. mathematics, computer-aided design (CAD), chemistry, accounting). In this instance, we used the fully functional word processor as the question presentation and response environment, thus providing an authentic tool of the kind typically used to produce essays and reports.

The perceptions and attitudes of users with respect to ease of use and usefulness (being fit for purpose) have been found to be important factors in people accepting new computer technology [12, 13], but not the only factors at play [14]. This particularly applies to the students as the people most directly impacted by e-exam systems, although they often do not have a strong voice when it comes to selecting software deployed in education institutions. Therefore, it is important to ensure their views are heard if we desire smooth acceptance and operation of a high-stakes e-exam system. A survey of students [15] at the University of Bradford in the United Kingdom (UK), following their use of QuestionMark Perception, included a range of topics. Other studies include the use of Examsoft in pharmacy courses in Canada [16] and an institution-wide survey [17] capturing students’ hopes and fears prior to the trialling of e-Exams at The University of Queensland in Australia. e-Exam trials followed the latter study, exploring the students’ experiences of the process [18]. These studies on student perceptions of the e-exam process informed construction of survey tools in this study.

Another characteristic is how each system architect treats the idea of technology reliability. Where an e-Exam solution relies heavily on a network during the examination, the risk of a ‘single point of failure’ affecting a whole cohort of students is increased and the need for extra redundancy measures is heightened. One study [17] reported that the fear of technology failure is a major barrier to students’ adoption of, and intention to use, e-Exams. A recently publicised failure during a national high-stakes medical board e-exam event in Australia [19, 20] highlights the critical need to ensure a robust system and to avoid ‘single point of failure’ designs. Earlier online assessment systems tended to stop working the moment the network dropped out. Advances in web technologies mean that some systems can handle or mitigate network outages of short duration (e.g. auto-save for the Moodle quiz), but extended outages still result in an unscheduled end to the session. Only a small number of e-exam systems are able to continue to operate and successfully complete the e-exam session without a network connection; these include the commercial product ‘Examsoft’ [21] and the e-Exam platform [2, 4] used in this study. Avoiding system-wide failures means that any technical issues that do occur are likely to be isolated to a single student and can therefore be managed under existing examination protocols for individual interruptions, breaks and extra time.
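To make this design concern concrete, the following sketch (in Python, as a generic illustration only; it is not the implementation of the e-Exam platform, Examsoft or the Moodle quiz auto-save) shows a ‘local-first’ autosave loop in which every answer revision is written to local disk first and uploads to a server are retried opportunistically, so that a network outage does not interrupt the candidate’s work. The function upload_to_server is a hypothetical placeholder.

import json
import time
from pathlib import Path

LOCAL_DIR = Path("autosave")          # local storage survives a network outage
LOCAL_DIR.mkdir(exist_ok=True)

def upload_to_server(path: Path) -> None:
    # Hypothetical transport stub; a real client would transmit the file
    # to the exam server here.
    raise OSError("network unavailable in this sketch")

def save_locally(answers: dict) -> Path:
    # Always persist the candidate's current answers to local disk first.
    path = LOCAL_DIR / f"answers_{int(time.time() * 1000)}.json"
    path.write_text(json.dumps(answers))
    return path

def autosave_tick(answers: dict, pending: list) -> None:
    # One autosave cycle: save locally, then retry any uploads that
    # previously failed. Failed uploads stay queued for the next cycle.
    pending.append(save_locally(answers))
    still_pending = []
    for path in pending:
        try:
            upload_to_server(path)
        except OSError:
            still_pending.append(path)
    pending[:] = still_pending

Run on a timer, this loop accumulates local copies even if the network is down for the whole session, which is what allows an e-exam to be completed and submitted after the event rather than ending abruptly.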

This review of prior work has outlined several areas of concern that will serve as a focus for our evaluation in this study. These are summarised in Table 1.

Table 1. Areas for investigation

3 Study Context

In Australia, the lead author is conducting a nationwide project investigating the scalable provision of authentic assessment in the examination room using BYO laptops [2, 4]. The study reported in this paper investigates whether prior work carried out in the higher education sector would translate successfully to the pre-university context. The study was undertaken at Monash College, Australia, within the ‘Foundation Year’ programme [22]. This programme is equivalent in level to an Australian Year 12 high school leaving certificate or the International Baccalaureate. The study was run in conjunction with the second author, who is a unit coordinator and teacher in the two units in which trials were conducted. The trials were carried out using in-class supervised written assessments. These took the form of a couple of mini-cases that included photographs, charts and data tables, each with one or more questions requiring a short text or essay-style response.

4 Method and Approach

This study examined two live trials of the e-Exam system and approach in two separate units at the College, involving 128 students. The units selected were geography (Geo) in semester 1, 2016 and globalisation (Glo) in semester 2, 2017. The process used to run each trial is represented in Table 2.

Table 2. e-Exam trial process

The formative, ungraded practice session was run during class time with all students participating. Students were then free to choose typing or handwriting for the real examination.

Selected-response survey questions, as shown in Tables 3, 4 and 5, were analysed in SPSS v24 with an alpha level of .05. Likert data pertaining to students’ opinions were treated as non-parametric [23]; another study [15] took the same approach when analysing students’ perceptions of their experience with an e-Assessment system. The Mann-Whitney U test [24] was used to test for differences between groups (typists versus hand-writers) on Likert items. When comparing paired pre-post Likert items, a Wilcoxon Signed-Ranks Test [25] was used, with the assumption of a symmetric distribution of differences met. A Chi-squared test was also used to examine whether experiencing a technical issue affected the decision to type the examination.
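For readers who wish to reproduce comparable analyses outside SPSS, the following sketch shows how the same family of non-parametric tests could be run with Python’s scipy.stats. The data, group sizes and contingency counts below are hypothetical placeholders for illustration, not the study data.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical 5-point Likert responses (1 = strongly disagree ... 5 = strongly agree)
typists = rng.integers(3, 6, size=60)
hand_writers = rng.integers(2, 5, size=55)

# Mann-Whitney U: independent-groups comparison on a Likert item
u_stat, u_p = stats.mannwhitneyu(typists, hand_writers, alternative="two-sided")

# Wilcoxon signed-rank: paired pre/post comparison for the same respondents
pre = rng.integers(2, 6, size=50)
post = np.clip(pre + rng.integers(-1, 2, size=50), 1, 5)
w_stat, w_p = stats.wilcoxon(pre, post)

# Chi-squared test of independence: practice-session problem (yes/no)
# versus chosen text production mode (type/handwrite); counts are invented
table = np.array([[10, 12],
                  [45, 50]])
chi2, chi_p, dof, _ = stats.chi2_contingency(table)

print(f"Mann-Whitney U = {u_stat:.1f}, p = {u_p:.3f}")
print(f"Wilcoxon W = {w_stat:.1f}, p = {w_p:.3f}")
print(f"Chi-squared({dof}) = {chi2:.3f}, p = {chi_p:.3f}")

The same alpha level of .05 would be applied when interpreting each p value.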

Table 3. Pre-examination survey responses by text production mode
Table 4. Post-examination survey responses regarding the e-exam system
Table 5. Post-examination survey future intention to use

It is important to note that participants were not randomly assigned to the typing or hand-writing groups, so results are descriptive of this group only. As per [15], we take the stance that statistical tests serve as a tool to summarise the body of students’ opinions rather than as representative of an objective truth.

5 Findings

The trials involved 128 pre-tertiary students; 65% were female and 35% were male. We examined the students’ opinions regarding their first encounter with the e-Exam system, comparing those who went on to type the examination with those who handwrote it, using a Mann-Whitney U test. Table 3 displays the results from Likert items (strongly agree = 5, neutral = 3, strongly disagree = 1) collected in the pre-examination survey (completed at the practice session). The strongest difference was for “I would like to use a computer for exams in the future” (U = 842.5, p < .001). Means and standard deviations are provided in the tables for clarity.

Following the examination, typists (52%) were asked to reflect on the e-Exam system itself with regard to suitability, usability and reliability (see Table 4 and Fig. 1). The majority of items received positive agreement, most with mean agreement ratings of 4 or above out of 5. The sentiment within the group was relatively uniform, as evidenced by the small standard deviations (Table 4) and boxplots in Fig. 1.

Fig. 1. Opinions of the e-exam system

Typists were asked “Did you experience any technical difficulties during this exam?”, with 17 (24%) responding yes and 53 (76%) responding no; details were gathered via a comment box and a checklist covering usability, technology and logistics. It should be noted that all those who typed successfully completed and submitted their work. A comparison of problems encountered in the pre- and post-sessions is shown in Fig. 2. A Chi-squared test indicated no statistically significant relationship between encountering a problem in the practice session and electing to type or handwrite the examination (χ2(1) = 0.003, p = 0.956). This could indicate that the practice session did its job in preventing serious problems from reaching the examination room, or that those who encountered problems considered them minor.

Fig. 2. Reported issues

A comparison between typists’ and hand-writers’ intentions to use a computer for future examinations, measured following the examination event (see Table 5), showed a significant Mann-Whitney U test result (U = 160.5, p < .001).

Finally, we examined whether students’ declared future-use intentions changed between the pre- and post-examination surveys for typists and hand-writers, using the item “I would like to use a computer for examinations in the future”. Those who typed the examination were in slightly stronger agreement following the examination (n = 61, M = 4.2, SD = 0.7) than prior (n = 55, M = 4.0, SD = 1.0); however, the difference was not significant when tested with a Wilcoxon Signed-Ranks test, Z = −1.763, p = 0.078. Median agreement was 4 both before and after. Those who handwrote the examination became significantly more negative following the examination (Z = −3.757, p < .001), with median agreement of 3 prior to and 2 following the examination. For clarity, mean agreement for hand-writers was M = 3.0 (SD = 1.2, n = 56) pre-examination and M = 2.1 (SD = 1.0, n = 53) post-examination.

6 Discussion

In the practice session, most students were able to successfully undertake the steps required for doing the e-exam using their laptop, although some did require assistance. This included starting up their laptop from the USB stick and using the software (see Table 3). The Chi-squared result shows that encountering a problem in the practice session did not affect the decision to type or handwrite. The practice session appeared to resolve most serious problems before they reached the examination itself, as seen in the reduced number of issues reported between the pre- and post-sessions (see Fig. 2). Most remaining problems related to user familiarity with the software or process (e.g. forgetting the boot key, or not realising that short-cut keys behaved as in Windows rather than Apple OS X) or minor hardware incompatibility (e.g. a laptop touchpad that was too sensitive, which a wired mouse would have resolved). However, the persistence of these issues indicates that further opportunities for practice and greater awareness of the option to bring a wired mouse were needed.

Unsurprisingly, those who went on to type the examination expressed stronger agreement that they would be able to undertake the practical steps of the e-exam process, but the differences in opinion from hand-writers were not statistically significant. However, the items reflecting confidence, “I feel confident I will be able to do these steps in a real examination” and “I now feel relaxed about using the e-Exam system for my examination”, and future intentions, “I would like to use a computer for examinations in the future”, did show a significant difference between the groups. This gap between their perceptions of the process and their expressed levels of confidence or intention could indicate that matters beyond those surveyed played a role in students’ decision making. Additional findings relating to students’ writing preferences, behaviours and proficiency, where a stronger link to their selected text production mode was found, are reported separately [26].

Following the examination event, the results in Table 4 and Fig. 1 showed that a large majority of students who typed the examination were satisfied that the assessment task was suited to computerisation, they appreciated being able to use their own computer, and that the system was easy to use, reliable and secure against cheating. Most also agreed that they would recommend the e-exam system to others. This was consistent with prior work in the university sector [18].

A moderate, but not statistically significant, divergence of opinion between hand-writers and typists emerged across most items in the pre-examination survey. Overall, students tended to reaffirm their choice to type or handwrite in terms of the future intentions they stated following the examination. This divergence can be seen in the future-intention item: the gap in mean agreement between typists and hand-writers was 1.0 prior to the examination and widened to 2.1 in the post-examination survey. It would appear that students’ opinions ‘hardened’ once the real examination was over, in that typists became more positive about their intention to type a future examination and hand-writers more negative.

Finally, the decision to allow students to self-select typing or handwriting served to lessen stress for students, but it also limited the degree of task sophistication that was possible (i.e. keeping to the lower levels of SAMR). However, this can only ever be a temporary state of affairs if we want to progress up the SAMR ladder to include re-designed, higher-order assessment tasks that assume sophisticated tools will be available. Taking advantage of the affordances of modern software means that all students must ultimately use a computer in the examination. Our work on e-Exams is therefore also about providing a strategy [4] for moving from paper-equivalent e-exams to sophisticated post-paper e-exams in which all candidates type. This phased strategy, along with associated support, will be important in helping staff and students make the transition.

7 Conclusion

We have successfully completed two trials of e-exams centred on the use of a fully featured word processor in two different units within a pre-university context. In this respect, we broadly achieved what we set out to do: the e-Exam technology and BYO-laptop-centric processes were shown to work in this context. We also saw that most students were satisfied with the approach to doing e-Exams within a classroom setting. Opinions regarding the process and technology did not differ significantly between those who typed the examination and those who elected to handwrite, although there was a general trend towards typists holding more positive opinions. Their levels of confidence did differ significantly, and this likely played a role in their choices. These results are unsurprising given the self-selecting nature of the group that went on to type the examination. However, they reinforce the need to ensure adequate support for students, who are not all equally prepared for the computerisation of high-stakes examinations.

Future work will involve comparisons with similarly run examinations in the university system and within different discipline contexts. The next phase will be to trial e-Exams using post-paper, higher-order tasks where all members of the class will type. Further technical work on the e-Exam system is progressing that will see integration with the Moodle quiz tool alongside the ability to use authentic software tools in a manner that is robust against network outages [27].