1 Introduction

A long-lasting evolutionary process is leading communications and human interactions to be more and more computer-mediated. This has also been true in the educational field for more than two decades [1].

Thus, online learning has become a very common practice for millions of students all over the world, especially in recent years, also due to the Covid-19 pandemic situation. This trend is likely to lead to substantial and permanent modifications in the current practices of learning and teaching, as well as in the inherent methodologies. Indubitably, a major factor responsible for this change is the ever-increasing appreciation for videos in every field of application, which are progressively gaining popularity among Internet users, at the expense of other media formats. The ease of both video recording and watching anywhere, anytime, and with any device certainly contributes.

With regard to education, learners and teachers are still riding the wave of Massive Open Online Courses (MOOCs). The extensive experience with such environments has revealed advantages but has also raised issues in using video lessons for learning activities [2]. Although several approaches have been proposed to address such issues, e.g., to detect and enhance engagement [3], some critical aspects remain. With the aim of improving the learning experience and, consequently, achieving more effective education and training, this paper reports the design of novel tools that can improve the way students consume video lessons.

Typical problems that can be encountered are that (i) it is hard to navigate video contents, browse them, and search for specific fragments —this issue is particularly critical when videos are long, especially when they are the footage of face-to-face lectures; (ii) the information within videos is unstructured; (iii) it is difficult to highlight the most relevant contents to get something like a quick recap; (iv) it is difficult to identify the more discursive parts, often poor in content, which may be useless and time-consuming.

To address these limitations, methods to enhance the learner experience have long been studied, starting from methods available in the existing literature and designing novel approaches. These methods exploit speech-to-text techniques, natural language processing and semantic analysis [4], techniques for information extraction and annotation of concepts and prerequisite relations [5], and methods for information visualization such as concept maps. The latter have long been studied in education and have already proved their worth and validity, demonstrating that their adoption can improve the learning experience of students, including those with special needs [6].

In this vein, this paper presents the outcome of the activities carried out in the framework of a larger project called Edurell. The project aims to (i) automatically derive concepts from within video lessons, (ii) classify the occurrence of such concepts in the video stream as concept definition or in-depth explanation, (iii) identify their relationships in terms of prerequisites, and (iv) represent the final graph as a semantic knowledge graph [7]. Hence, the whole system should be considered a hypervideo service, and it can be managed through an ad-hoc visual interface. In a broader vision, the project aims to spread the use of augmented video services within educational environments. In fact, although the effectiveness of augmented video services has already been demonstrated [8, 9], they are not widely adopted, since their development is time-consuming and requires a huge manual effort from subject matter experts. Specifically, in this paper the use of two types of Interactive Visual Tools (IVT) is proposed: (i) an augmented transcript and (ii) a dynamic concept graph (map of concepts’ flow). For each of these, some particular learning contexts were outlined, which can be considered as use cases in order to evaluate the possible enhancement of the learning experience. In the following, the use cases are referred to as First watch and Rewatch learning contexts. The former is mostly oriented toward comprehension, and the latter toward deeper understanding and learning [10].

Then two studies are presented. The first one is a heuristic evaluation with experts, carried out to identify the main usability issues. The second one is a user study carried out with higher education students, who are the target end-users of the Edurell Video Augmentation platform.

The contributions of this work can be summarized in the following points: (i) Edurell provides learners with an interface for enhancing their learning experience through the above-mentioned IVTs; (ii) two different learning contexts are addressed; (iii) the findings from the user study confirm the perceived usefulness of the developed interactive tools with enhanced hypervideo capabilities for video-based learning.

The remainder of the paper is organized as follows. In Section 2 a selection of related works is presented to support the design decisions, and in Section 3 the proposed hypervideo model is described. Then, Section 4 describes the methodology and the results of the evaluation through experts, while the user study is reported in Section 5. Conclusions follow in Section 6.

2 Related works and background

Over the last decade the use of videos in education has been quite extensive, thanks largely to the success achieved by the MOOC platforms. Prior to the Covid-19 pandemic, the digital transition promoted the rise of a blended learning approach [11, 12], but the pandemic situation forced institutions to move quickly towards a fully remote model for education. As a consequence, in those years, recordings of lectures have grown exponentially, and numerous related studies have been carried out to understand the point of view of students and their degree of engagement. Recent studies show that video-based learning fits positively into the student’s perception, also taking into account gender and digital inequality issues [13]. However, the lack of engagement is still an open issue, especially when watching a video is a passive activity with little or no chance of interaction. To overcome this limitation, numerous solutions have been tried in research studies, e.g., the use of gamification [14], the possibility of building suitable environments for collaborative annotation [15], the use of self-assessment quizzes to verify learning [16, 17], and the adoption of interactive annotation to encourage soft skills learning [18].

Fig. 1 Screenshot of the Edurell login screen

With regard to the issue mentioned in the introduction in terms of lack of support for navigation and content structuring, the direction taken by video-based learning is that of the so-called hypervideo (HV). The definition of HV has a long history [19]: as early as 2004, Zahn et al. [20] identified the HV as a “combination of digital video and hypertext, which draws largely upon audiovisual media as central parts of their structure. They consist of interconnected video scenes containing ‘dynamic’ hyperlinks that are available during the course of the video scenes and that refer to further information elements (such as texts, photos and graphics)”. Although the definition of HV has not yet been fully formalized, it is now common practice to refer to HV when there is some reference to interactivity, e.g., control features, hyperlinks, or collaboration options [8, 21, 22, 23]. Besides, the need to provide video augmentation services is closely related to another research stream that has had a fair degree of success, owing to its positive impact on learning, namely the use of knowledge or concept maps. Research in this field has extensively shown that the application of concept maps both in different scientific domains [24, 25, 26, 27] and at different levels of education (i.e., from primary school to university) [28, 29] can have a positive impact on learners, also considering students with special needs and/or specific learning disorders [6, 30]. The way concept maps are integrated into HV services is strictly related to information visualization and to the importance of content presentation for effectiveness, in the consolidated perspective that, in the educational field, the learner can be considered as a prosumer [31].
Many projects based on a data-driven approach have explored different possibilities for improving the navigation experience, e.g., data-enhanced transcript search and keyword summary, automatic display of relevant frames, visual summary representing points with high learner activity [32], non-linear consumption of videos using personalized fragment navigation [33], and exploration of e-learning contents via small screens [34]. The idea of using concept maps to support video navigation has appeared in some works since the beginning of the new millennium [35], but the novelty of this contribution lies in the possibility of automating the creation of concept graphs, hopefully going in the direction of creating an on-the-fly service for video augmentation. Details about the automatic creation of the prerequisite knowledge graph to be used as a knowledge base for providing video augmentation services are available in [7]. Moreover, in the Edurell project, the concept graph supporting the video must be regarded as both interactive and dynamic: interactive because interaction with it produces actions on the user interface, and dynamic because its view is strictly related to the contents presented within the video, which are automatically highlighted by querying the RDF knowledge graph that describes the contents within the video lesson.

Fig. 2 Screenshot of the users’ home screen of the Edurell application

3 Description of the system

The Edurell Video Augmentation system is an enhanced viewer for students watching video lessons, which offers additional functionalities in side panels, to enrich the learners’ experience. The splash screen presents a quick overview of the internal system with a relevant description of its functionalities. Specifically, following the numbered red spots in Fig. 1, Edurell allows users to:

  1. sign in/log in to the system from the LOGIN button;

  2. have an overview of the features, i.e.,

    • see the explained concepts in a graph,

    • search for specific concepts within a video,

    • exploit the video fragment navigation;

  3. get information about concepts using the Search bar;

  4. edit the user profile;

  5. navigate through the video thanks to the fragments and see the viewing progress;

  6. write down notes about the lesson;

  7. download the transcript or enlarge the relevant panel to navigate through the lesson;

  8. get some help with the conceptual map (one click takes the user to the concept’s video timestamp and shows more details);

  9. download the graph.

Moreover, the welcome page in Fig. 2 shows the browsing history for the current user and a list of other videos available within the system.

Fig. 3 The video player with the transcript panel

The novel concept of the interactive knowledge graph representation required the realization of suitable functionalities induced by such a new HV interpretation. To enable users to navigate videos according to a variety of criteria, a suitable user interface (UI) was designed to fully exploit such new capabilities. Specifically, the proposed UI merges different canvases as shown in Figs. 3 and 4 (follow the numbered red spots):

  1. the player for the video, based on the YouTube video player;

  2. a frame for the video transcript;

  3. a frame for the access to the interactive knowledge graph (map of concepts’ flow);

  4. a bar for navigating the video through indexed fragments;

  5. a suitable space for students to take notes.

Moreover, the progress bar of the video player is enhanced with markers at the video highlights and links to the concepts represented in the knowledge graph. It is worth noting that the transcript of the video should be considered as an Augmented Transcript. In fact, it is not a mere transcription of the speech within the video, but rather a hypertext where the domain concepts are highlighted (see Fig. 3) and linked to the relevant video fragments.
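As a rough illustration of the augmented transcript idea, the following Python sketch marks domain concepts in timed transcript segments and attaches to each occurrence the start time of the corresponding fragment. The `Segment` structure, the concept list, and the `[[concept|seconds]]` link notation are assumptions made for this example only and are not taken from the Edurell implementation.

```python
import re
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # start time of the segment, in seconds
    text: str     # transcribed speech for the segment


def augment(segments, concepts):
    """Return transcript lines with concepts wrapped as [[concept|start]]."""
    if not concepts:
        return [seg.text for seg in segments]
    # Longer concepts first, so multi-word concepts win over their sub-terms.
    ordered = sorted(concepts, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(c) for c in ordered) + r")\b",
        re.IGNORECASE,
    )
    lines = []
    for seg in segments:
        # Each match becomes a link to the start time of its segment.
        marked = pattern.sub(
            lambda m, t=seg.start: f"[[{m.group(0)}|{t:g}]]", seg.text
        )
        lines.append(marked)
    return lines


# Hypothetical transcript segments and concepts, for illustration only.
segments = [
    Segment(0.0, "Forensic archaeology applies field methods to legal cases."),
    Segment(12.5, "Excavation is the core field method."),
]
result = augment(segments, ["forensic archaeology", "excavation"])
```

In a real viewer the link markers would presumably be rendered as highlighted, clickable text that seeks the player to the given time; here they are kept as plain strings so that the mapping from concept to timestamp stays visible.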

Also, the knowledge graph is actually a Map of Concepts’ flow, since it is a dynamic graph that represents the prerequisite relations between concepts (see Fig. 4) and its nodes become highlighted as concepts appear in the lecture. In short, this acts as a contextual help for the concepts explained by the lecturer who recorded the video, allowing students to navigate the subject on the basis of their individual level of knowledge. To this aim, the knowledge graph panel shows the prerequisites that the student has to know in order to understand the current concept. The same panel also provides anchor points to that concept in the video.

The source code of the system is made available on GitHub (Footnote 1).

The major contribution with respect to the literature is that, since concepts evolve as the video flows (as their explanation goes deeper), they are initially presented with lower complexity, resulting in a contextual help, which shows a simple knowledge graph. Later on in the video, the same concepts may be deepened with additional notions (concepts) that will have been expressed in the meantime.

Fig. 4 The video player with the knowledge graph panel

This is reflected in the dynamic contextual help, where the graph is progressively enriched and becomes more complex (see Fig. 5). Thus, concepts do not have a static set of prerequisites during the whole video and, consequently, their contextual help evolves dynamically, in accordance with the video flow. This has relevant potential applications towards personalized contextual help when the domain knowledge graph is matched against the knowledge model of the learner.
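The time-dependent nature of the contextual help can be sketched as follows. This is a minimal Python illustration, assuming that each prerequisite relation carries the video time at which it is introduced; the `PrerequisiteEdge` type, the `contextual_help` function, and the sample concepts are hypothetical and do not describe the actual Edurell data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrerequisiteEdge:
    prerequisite: str   # concept that must be known first
    target: str         # concept that depends on it
    appears_at: float   # video time (seconds) at which the relation is stated


def contextual_help(edges, concept, current_time):
    """Prerequisites of `concept` that have been introduced by `current_time`."""
    return sorted(
        e.prerequisite
        for e in edges
        if e.target == concept and e.appears_at <= current_time
    )


# Hypothetical relations extracted from a lesson, for illustration only.
edges = [
    PrerequisiteEdge("variable", "loop", 30.0),
    PrerequisiteEdge("condition", "loop", 95.0),
    PrerequisiteEdge("loop", "algorithm", 140.0),
]

early = contextual_help(edges, "loop", 60.0)   # only the first relation is known
late = contextual_help(edges, "loop", 120.0)   # the graph has been enriched
```

Querying the same concept at two different times returns a smaller set early in the video and a larger one later, which mirrors the progressive enrichment of the graph described above.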

Fig. 5 Contextual help pop-up over the map providing hypervideo links

The suitability of these tools was evaluated by assessing the usability of the UI and the students’ experience with it in learning tasks. To this aim, (i) a heuristic evaluation with HCI experts and (ii) a user-based evaluation were performed. These two test sessions were based on similar scenarios, which will be described in the following.

4 Evaluation through experts

In this section, the expert evaluation of the Edurell platform is reported. It aimed to investigate the usability issues of the user interface and how the identified issues could be addressed.

4.1 Method

Scenarios

Three scenarios were defined, and a panel of experts was asked to face the three situations described as follows:

  1. First-time viewing (First watch): “Imagine you are a bachelor’s degree student in Archaeology, whose video lectures are included in the Edurell system. Imagine that, within your course of study, face-to-face courses, online lessons, and courses on MOOC platforms (as additional activities to get credits) are provided. Then you decide to attend the ‘Forensic Archaeology and Anthropology Course’ in autonomous mode. The Edurell platform, with its hypervideo functionalities, will provide you with a support to follow the course on your own”. Following on from this, the scenario continued by providing the expert with the description of the intended support for the given learning context, including the Transcript and the Interactive Concept Graph, and by indicating specific actions required of the expert in order to use all the functionalities intended to support the first watch.

  2. Second time viewing (Rewatch): “In the above-described scenario, imagine you have already watched all the videos of the course but you want to browse again the first lesson to improve your understanding of some concepts therein”. As above, specific actions were required of the experts to simulate this scenario, including the interaction with the Concept Graph and its related features for detailed view and concept search.

  3. Focus on content with and without hypervideo support: In the third scenario, the experts were asked to switch between the hypervideo functionalities provided by the Edurell platform while watching the videos and perform some actions aimed at making them focus on the content of the video. Then they were also asked to write down in the note panel the prerequisite concepts of a given concept. The goal was to make experts reflect on which features allow students to better understand the concepts and the relationships between them in an educational video.

Further details on the scenarios can be found in [36].

Methodological considerations

The ISO 9241-11 standard provides a general definition of usability that is widely applied in different fields [37]. This standard defines usability as “the extent to which a product can be used by specified users to achieve specified objectives with effectiveness, efficiency and satisfaction in a specific context of use” [38]. The ISO standard also provides a description of the terms in the definition, as follows:

  • User: the person who interacts with the product;

  • Objective: the expected result;

  • Effectiveness: accuracy and completeness with which users obtain specified goals;

  • Efficiency: resources spent in relation to the accuracy and completeness with which users reach the objectives;

  • Satisfaction: freedom from discomfort and positive attitudes towards the use of the product;

  • Context of use: users, activities, equipment, and the physical and social environments in which a product is used.

However, the complex nature of usability [39] and its implications in different domains make the concept fluid and addressed with different approaches. Even the mentioned standard is still under review, in order to include new lessons learned on usability since 1998 and new elements that have emerged in relation to the very concept of usability [40].

Given these premises and recalling that there can be no mathematical methods to make rigorous and accurate measurements, the typical approach to assess usability is through a series of usability inspections or usability tests that take into account different criteria. In this respect, one of the most widely used techniques is to carry out a heuristic evaluation with experts to find any usability problems. This method is based on the so-called “heuristic principles” or “usability heuristics” to evaluate usability.

Even though, as said, each specific domain should have an adequate set of usability heuristics since the more generic or traditional ones will not be able to correctly evaluate the specific characteristics of the different types of software and applications, there is a general agreement about the usefulness of adopting a preliminary heuristic evaluation to identify a priori any usability problems before performing further evaluations, including extensive usability tests with final users [41]. This is the chosen approach, following the model proposed by Nielsen and Molich [42] that involves usability experts who inspect the interface of a tool for possible usability issues. Hence, a heuristic study was performed, by selecting 5 experts in HCI and usability tests.

The experts were provided with the description of the scenarios and left free to use the web application as they preferred. Then, they were asked to follow the Nielsen heuristics to analyze and evaluate the Edurell platform. The remarks and concerns about the current release of the system collected from the experts were carefully analyzed.

4.2 Results

Table 1 summarizes the results achieved with the experts’ evaluation, based on the above-cited Nielsen usability heuristics [42]. The table reports the experts’ comments about ten usability issues and the suggestions for counteracting the highlighted issues. It is worth noting that the responses given by the experts were mostly consistent, especially for the points where they reported “No issues” and for the points for which they gave no suggestions. Moreover, the suggestions reported in the relevant column combine recommendations proposed by different experts.

Table 1 Experts’ feedback with reference to the Nielsen usability heuristics

Besides, the experts also noticed that data can be exported from the platform in JSON format; however, depending on the browser/OS combination, the file may be displayed in the browser instead of being downloaded, which resulted in a blank, empty page.

As a limitation of the study and a possible source of bias, it should be observed that the purpose of the scenarios (especially the first one) of making the experts aware of the platform features might have influenced the experts’ comments, in particular concerning item 6 about the recognition of function keys, which did not raise any issues.

5 User study

The goal of this user study was to evaluate the usability and the learning experience perceived by the intended users of the Edurell interface, i.e., higher education students. The user study was performed after addressing some of the weaknesses identified in the heuristic evaluation. Specifically, the interventions mainly concerned issues 1 (Visibility of the system status) and 10 (Help and documentation). With regard to the former, a video catalogue was added together with the user’s video history, while for the latter help messages were improved by adding mouseover events that show explanations for the Concept Graph functions. In addition, the concepts of the interactive transcript were highlighted in yellow in order to make them more visible. Further revisions were left for a redesign after the user study.

5.1 Method

This user study was divided into two parts.

  1. The first part (Part 1) was designed as a user experiment comparing the ‘Augmented Transcript’-only and ‘Concept Graph’-only tools to investigate the benefits and limitations of each tool in terms of user experience of the learning environment in the first watch learning context.

  2. The second part (Part 2) was designed to evaluate the two IVT tools in the rewatch learning context and also to evaluate the usability of the overall Edurell interface that combines the two.

The description of the participants and the setting for the user study, which are common to both parts, follows. Then, the details of the study design of each part are provided.

Participants

For the experiment, 69 university students attending an introductory computer science course at the Faculty of Foreign Languages were recruited: 51 females and 18 males, aged between 20 and 31 (AVG:23.6, SD:2.2). All of them have at least intermediate English skills, and none is a native English speaker. None of the participants declared learning difficulties or eyesight problems.

Setting for the user study

The web user interface merges different canvases containing, respectively, the video player, the video transcript, the map of concepts’ flow, a bar for navigating the video through indexed fragments, and a suitable space for students to take notes. The frames containing the transcript and the map can be toggled on and off according to the user preferences. For this experiment, participants were required to use only the video player, the video transcript and the map of concepts’ flow. To make sure that the evaluation of the IVTs was not compromised by the accuracy of the knowledge extracted automatically from the video, both the transcript and the graph were post-processed after their automatic extraction.

5.1.1 Part 1

The aim of the first part of the study was to compare the two IVT tools: augmented transcript and concept graph. The main research question in this part was:

RQ1: How is each tool perceived to support the first-watch learning experience?

Experimental design

This part of the study was designed as a between-subject experiment with the following conditions: augmented transcript, concept graph, and the video player without any visual cues as control condition. Each participant used only one of the tools.

The scenarios were analogous to the ones described in the heuristic evaluation. With specific regard to Part 1, the user study matches the ‘First-time viewing (first watch)’ scenario provided to the experts, with the difference that each user was required to work under just one of the three conditions. Another difference is that, since the students involved were from an informatics class, the videos used in the study were not about archaeology but about computer science. The videos, three short introductory video lessons, were the same for all the students in order to better compare the results, avoiding effects due to the difficulty of the content. They are three YouTube videos (Footnote 2), made available on the Edurell platform. All the videos are in English.

As the dependent variable for this study, the subjective feedback in each condition was used, collected through questionnaire responses.

Procedure

The experiment was conducted online, as a class activity, thus reproducing the usage condition of the Edurell interface for online video learning.

  • The participants were first introduced to the Edurell Video Augmentation platform and then invited to try the user interface and the IVTs. To this aim, they were guided to play one of the videos already available in the platform and to use each tool, switching between the video transcript and the concept graph. Each tool was used by participants for about 5 minutes.

  • The participants were then split into three groups, one for each experimental condition, and instructed about the tasks they had to perform. The tasks consisted of the following:

    1. Create a new account and log in to the system;

    2. Set the Edurell interface according to the requirements for each group: Group1 with both the transcript and the map toggled off, i.e., just the video player (C1), Group2 with the augmented transcript toggled on and the map off (C2), Group3 with the map of concepts’ flow toggled on and the transcript off (C3);

    3. Create a personal identification code to be used in the next steps and answer a few profiling questions, including one about learning or eyesight problems;

    4. Watch Video1 trying to understand its content using, as a support to this end, if needed, the features available in the given set-up (i.e., C2 and C3 hypervideo features, and the video player progress bar for C1);

    5. Answer a set of Comprehension Questions;

    6. Repeat the same steps for Video2;

    7. Answer a set of Subjective Questions about the user experience and the learning experience;

    8. Watch Video3 without answering any questionnaire (this activity will be used for the rewatch task in Part 2 of the user study).

5.1.2 Part 2

The aim of the second part of the study was to investigate the user experience in the rewatch learning context and the overall usability of the user interface. The main research questions in this part were:

RQ2: How is each tool perceived to support the rewatch learning experience?

RQ3: How is the usability of the Edurell Video Augmentation platform perceived?

Experimental design

Part 2 of the study was structured into two sessions, aimed at addressing RQ2 and RQ3 respectively, as detailed below.

2.1 For the first session a between-subject design was used, with the same conditions as for Part1. The scenario for this part of the user study was analogous to the one named ‘Second time viewing (rewatch)’ provided to the experts for the heuristic evaluation.

2.2 The second session involved all the participants, who were asked to evaluate the overall usability of the platform.

Procedure

The experiment was conducted online, as a class activity on subsequent days, thus reproducing the learning context of the rewatch experience.

  • Session 2.1. Participants were split into three groups and instructed about the tasks they had to perform. The tasks consisted of the following:

    1. Log in to the system using the account created in Part 1;

    2. Set the Edurell interface according to the requirements for each group: Group1 with both the transcript and the map toggled off, i.e., just the video player (C1), Group2 with the augmented transcript toggled on and the map off (C2), Group3 with the map of concepts’ flow toggled on and the transcript off (C3);

    3. Rewatch Video3 trying to understand its content using, as a support to this end, if needed, the features available in the given set-up (i.e., C2 and C3 hypervideo features, and the video player progress bar for C1). Given the rewatch context, students were told they were free to skip parts they had already understood and remembered, or to rewatch the video entirely, as they preferred, using the interactive features available in the respective conditions, with the goal of understanding and learning the subject matter of the lesson;

    4. Answer a set of Comprehension Questions;

    5. Answer the Subjective Questions about the user experience and the learning experience.

  • Session 2.2. Participants were provided with a set of tasks that required them to use all the tools described in the third scenario, using Video3. Then they were required to:

    1. Answer the questions of the System Usability Scale (SUS) Questionnaire [43];

    2. Answer the questions of the Post-Study System Usability Questionnaire (PSSUQ) [44].
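For readers unfamiliar with SUS, the following Python sketch shows the scoring rule commonly used for the ten-item questionnaire: odd-numbered items contribute the response minus 1, even-numbered items contribute 5 minus the response, and the sum is multiplied by 2.5 to obtain a score between 0 and 100. The function name and the sample responses are illustrative assumptions; the sketch is not taken from the analysis scripts used in this study.

```python
def sus_score(responses):
    """Compute the SUS score (0-100) from ten responses on a 1-5 scale."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    total = 0
    for item, r in enumerate(responses, start=1):
        # Odd items are positively worded, even items negatively worded.
        total += (r - 1) if item % 2 == 1 else (5 - r)
    return total * 2.5


# Hypothetical answers from one participant, for illustration only.
example = sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2])
```

The per-participant scores obtained in this way are usually averaged to summarize the perceived usability of a system.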

5.2 Results and analysis

In this section, the results of Part 1 and Part 2 are reported and discussed for:

  • The Subjective Questions on User Experience in first-watch and rewatch learning context scenarios;

  • The Usability tests (SUS and PSSUQ).

Results for Subjective Questions on User Experience (First-watch learning context)

In this section the opinions of participants about their experience under the three conditions C1, C2, and C3 are reported. C2 and C3 questions, about the interactive transcript and the concept graph respectively, are designed to be the same in order to compare the Edurell IVTs directly, also by means of statistical analysis. To this aim, the Mann-Whitney U test [45] for two independent samples was used, with a significance criterion (alpha) of 0.05. The questions are grouped into two categories and analyzed accordingly. One is focused on the learning experience with the specific IVT in terms of usefulness to support understanding and focused attention on the video content (named “Perceived usefulness”), while the other is focused on the perceived ease of use of the IVT and proper layout (named “UI suitability”).

Conversely, C1 questions are analyzed as qualitative control questions in relation to C2 and C3 answers for the first watch scenario. For example, where C2 and C3 questions ask, respectively, about the usefulness of the interactive transcript and the concept graph to understand the content, C1 questions ask the students if, while watching the video, they feel the need for tools such as those tried in the training phase for improved highlighting and structuring of concepts, and navigation support. The description of the Subjective Questions and their average results on a 5-point Likert scale (1 disagree - 5 agree) are reported in the Appendix.
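The comparison between C2 and C3 answers described above can be sketched in Python as follows. The sketch reverses negatively worded Likert items and computes a two-sided Mann-Whitney U test using the normal approximation with tie correction. The function names and the sample answers are hypothetical, and the source does not state whether exact or approximate p-values were used, so this should be read as one possible procedure rather than the one actually followed.

```python
import math


def reverse_item(score, low=1, high=5):
    """Reverse a Likert score for negatively worded items (1<->5, 2<->4)."""
    return low + high - score


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U, p), where U is the smaller of the two U statistics.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    r1 = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u1 = r1 - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0  # all values identical: no evidence of a difference
    z = (u - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, p


# Hypothetical answers to one question under C2 and C3, for illustration only.
c2_answers = [5, 4, 5, 4, 5, 3, 4, 5]
c3_answers = [3, 4, 2, 3, 4, 3, 5, 2]
u_stat, p_value = mann_whitney_u(c2_answers, c3_answers)
significant = p_value < 0.05  # alpha as stated in the text
```

With small samples and many ties, as is typical for Likert data, the normal approximation is only indicative; a statistics package offering an exact method may be preferable in practice.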

Fig. 6 Results for Subjective Questions grouped for perceived usefulness (Q-C2/3(1).1-.3) and UI suitability (Q-C2/3(1).4-.6), in the first watch learning context

Fig. 7 Results for Subjective Questions grouped for perceived usefulness (Q-C2/3(2).1-.5) and UI suitability (Q-C2/3(2).6-.8), in the rewatch learning context

Comparing the C2 and C3 answers (after reversing the scale of negative items), both IVTs are perceived as helpful in supporting the first watch learning experience, with all scores well above the intermediate score. As shown in Fig. 6, of the two, the augmented transcript performed better, with scores above 4 points for all the questions in the “Perceived usefulness” category (Q-C2(1).1, .2, .3 questions) and an average score 13.3% higher than the Concept graph (C2 AVG:4.23, SD:0.91; C3 AVG:3.74, SD:1.02). In terms of “UI suitability” (Q-C2/C3(1).4, .5, .6 questions), the two tools obtain almost the same average score (C2 AVG:3.84, SD:1.1; C3 AVG:3.81, SD:0.96). The largest difference (0.29) is found for the position of the IVT (Question 5): the position of the concept graph (lower right corner of the UI) is perceived as less suitable than that of the transcript (next to the video player). However, the statistical analysis shows that none of the differences between the IVTs is significant for any of the questions in the two categories, except Question 2 (Q-C2/C3(1).2), according to which the augmented transcript provides greater support (+0.77) for understanding content in a non-native language (U=124.5, p<0.05).
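The reversal of negative items and the per-category averages mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the names `reverse_item` and `summarize_category` are hypothetical, and the choice of pooling all answers of a category and using the sample standard deviation is an assumption, since the aggregation procedure is not detailed in the text.

```python
from statistics import mean, stdev

SCALE_MIN, SCALE_MAX = 1, 5  # 5-point Likert scale (1 disagree - 5 agree)


def reverse_item(score):
    """Flip a negatively worded item so that a higher score is always more favourable."""
    return SCALE_MIN + SCALE_MAX - score


def summarize_category(responses, negative_items=()):
    """Pool all answers of one question category and return (mean, sample SD).

    responses: dict mapping a question id to the list of participants' scores.
    negative_items: question ids whose scale is reversed before pooling.
    """
    pooled = []
    for question, scores in responses.items():
        if question in negative_items:
            scores = [reverse_item(s) for s in scores]
        pooled.extend(scores)
    return mean(pooled), stdev(pooled)


# Hypothetical answers; "q3" stands for a negatively worded question.
usefulness = {"q1": [5, 4, 4], "q2": [4, 5, 3], "q3": [2, 1, 2]}
avg, sd = summarize_category(usefulness, negative_items={"q3"})
print(f"AVG:{avg:.2f}, SD:{sd:.2f}")
```

On a 1-5 scale the reversal maps 1 to 5, 2 to 4, and leaves the midpoint 3 unchanged, so positively and negatively worded items can be averaged on a common footing.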

The answers of the students that used only the video players (C1 questions) confirm that they would benefit from tools that provide the augmented transcript (Q-C1(1).1 AVG:4) and that outline the relevant domain concepts in the first-watch context (Q-C1(1).1 AVG:4), while the perceived benefit from an interactive map synchronized with the video is a bit lower (Q-C1(1).1 AVG:3.5).

Results for Subjective Questions on User Experience (Rewatch learning context)

The rewatch learning context requires analyzing the IVTs against the specific requirements of that context: while the first watch is focused on viewing and comprehension, which are mostly linear activities, rewatch is focused on learning, thus involving higher interaction, exploration, and navigation within the video content for deep understanding and retention. To investigate the usefulness and suitability of the distinctive features of each IVT in this learning context, specific questions were designed for each condition. Description and results for the Subjective Questions are reported in the Appendix.

Figure 7 provides an overview of the results for C2 and C3, grouping them by “Perceived usefulness” and “UI suitability”, as in the previous chart (Fig. 6), after reversing the scale of negative items. Note, however, that unlike in the previous situation, the questions for rewatch were designed to obtain specific feedback on each of the two tools, as explained above. This is the reason for the different numbering of questions in the figure, even though the total number of questions is the same for the two IVTs in the two categories.

Considering C2, the augmented transcript is perceived as highly useful for the rewatch context, with all its questions scoring above 4.2. In particular, it is perceived as useful in the questions on deep understanding and content navigation. The overall category of “Perceived usefulness” obtains a high average score and a low standard deviation, indicating general agreement in responses (C2 AVG:4.46, SD:0.77). Results for the C3 condition are less homogeneous than those for C2, likely because the concept graph tool includes several features, which are addressed in different questions and result in different perceptions of usefulness. Note also that agreement in responses is lower, showing that some features (in particular the ‘search for concepts’ feature addressed in Question 4 (Q-C3(2).4)) were scored as useful less consistently among students. Conversely, Questions 1, 2 and 3 (Q-C3(2).1, .2, .3), concerning helpfulness for concept navigation and for retrieval of concepts’ definitions and explanations, obtain the highest scores and the lowest disagreement. As a whole, C3 “Perceived usefulness” scores AVG:4.24, with SD:0.91.

Concerning the “UI suitability” of the two IVTs, results under the C2 condition show that the augmented transcript is positively rated for all the questions (average AVG:4.35, SD:0.84), while results under the C3 condition show that ratings are not homogeneous (AVG:3.95, SD:1.02). In particular, Question 7 (Q-C3(2).7) shows a lower score and higher disagreement (AVG:3.57, SD:1.25). Notably, it concerns the ‘search for concepts’ feature, the same feature that was scored similarly in terms of usefulness.

Finally, for the group under the C1 condition, whose participants performed the rewatch task using the video player only, the questions investigated whether they felt the need for IVTs for this task. Findings show that the benefits participants expect from such tools are higher in the rewatch phase than in the first watch phase. Indeed, for rewatch the scores are higher than 4.2 for all the questions (average AVG:4.4, SD:0.85), thus expressing a higher need for tools for interactive navigation and content structuring in this learning context. The expectations found in C1 seem to be satisfied by the results obtained under the C2 and C3 conditions, which showed an actual perceived usefulness of the IVTs, as discussed in the analysis above.

Results for SUS and PSSUQ Questionnaire

Results for the SUS Questionnaire were collected with no distinction among the three groups. The overall SUS score for the Edurell platform was 82.9 (excellent, rate A).
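For readers unfamiliar with how a SUS score such as the one above is obtained, the standard scoring rule can be sketched as follows. The function name `sus_score` and the sample answers are hypothetical; the rule itself (odd items contribute the answer minus 1, even items contribute 5 minus the answer, and the sum is multiplied by 2.5) is the standard SUS scoring procedure.

```python
from statistics import mean


def sus_score(answers):
    """Return the 0-100 SUS score for one participant's ten answers (each 1-5)."""
    if len(answers) != 10:
        raise ValueError("SUS has exactly ten items")
    total = 0
    for index, answer in enumerate(answers):
        if index % 2 == 0:
            total += answer - 1  # items 1, 3, 5, 7, 9 are positively worded
        else:
            total += 5 - answer  # items 2, 4, 6, 8, 10 are negatively worded
    return total * 2.5


# Hypothetical answers from two participants; the platform score is their mean.
participants = [
    [5, 1, 4, 2, 5, 1, 4, 2, 5, 2],
    [4, 2, 4, 1, 5, 2, 4, 2, 4, 1],
]
print(mean(sus_score(p) for p in participants))
```

Because each item contributes between 0 and 4 points, the ten items sum to at most 40, and the factor 2.5 rescales the result to the 0-100 range.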

Results for the PSSUQ Questionnaire are as follows: Overall AVG: 3 on a 7-point scale (1 agree - 7 disagree). The scores for the three dimensions of the test are: System Usefulness: 2.8, Information Quality: 2.6, Interface Quality: 3.75. It was also worth investigating whether the group to which a participant belonged (conditions C1, C2, C3) had any effect on perceived usability. However, no significant difference was found for either SUS or PSSUQ results, computed using the Mann-Whitney U test for each pair of conditions: C1-C2, C2-C3, C1-C3.
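The three PSSUQ dimensions reported above are averages over groups of questionnaire items. The sketch below assumes the 16-item version of the PSSUQ, in which items 1-6, 7-12, and 13-15 form the three dimensions and all 16 items form the overall score; the version actually administered is not stated in the text, so this mapping is an assumption, and the name `pssuq_scores` is hypothetical.

```python
from statistics import mean

# 1-based, inclusive item ranges; assumes the 16-item PSSUQ.
PSSUQ_SUBSCALES = {
    "System Usefulness": (1, 6),
    "Information Quality": (7, 12),
    "Interface Quality": (13, 15),
    "Overall": (1, 16),
}


def pssuq_scores(answers):
    """Average one participant's answers (1-7, None = not applicable) per subscale."""
    if len(answers) != 16:
        raise ValueError("expected 16 answers")
    scores = {}
    for name, (first, last) in PSSUQ_SUBSCALES.items():
        valid = [a for a in answers[first - 1:last] if a is not None]
        scores[name] = mean(valid) if valid else None
    return scores


# Hypothetical answers of one participant (lower values are better on this scale).
print(pssuq_scores([2, 3, 2, 3, 3, 2, 3, 2, None, 3, 2, 3, 4, 4, 3, 3]))
```

Note that, unlike SUS, lower PSSUQ values indicate better perceived usability, which is why an Interface Quality of 3.75 is the weakest of the three dimensions.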

5.3 Discussion

Findings in the previous sections show overall results ranging between good and very good for user experience and usability with the Edurell Video Augmentation platform in the first watch and rewatch learning contexts. However, a more in-depth analysis is needed in order to answer the Research Questions for this user study.

Considering RQ1 (How is each tool perceived to support the first watch learning experience?), results show very good perceived usefulness of both the IVTs, with higher performance of the augmented transcript for all the questions concerning the perceived usefulness in this learning context. However, the difference is not statistically significant. The only case in which the difference is significant is the support provided by the transcript for video lessons whose content is in a language different from the one used by the student. In addition, it can be noticed that the benefit expected from IVTs by students using the video player only (C1 condition) is 4.1 on average, meaning that the perceived need for first viewing is high, but not of utmost importance.

Considering RQ2 (How is each tool perceived to support the rewatch learning experience?), for the group under the C1 condition, it is interesting to note that the average score of the perceived usefulness of IVTs in the rewatch context (AVG: 4.4) is higher than the usefulness perceived in the first watch context, and it is especially high for the features provided by the concept graph. This seems to suggest a higher need for this kind of support in the rewatch context. When these findings are compared with the responses to the subjective questions about the actual usefulness of these tools after using them, i.e., under the C2 and C3 conditions, a slightly higher average score (AVG: 4.46) can be observed under C2 and a slightly lower average score (AVG: 4.24) under C3.

These results would seem to confirm, on the one hand, the need for interactive tools for video-based learning and show, on the other hand, that the Edurell interactive visual tools mostly satisfy such need, even though there is still room for improvement in order to fill the gap between the expected support for rewatch (4.4) and the actual perceived usefulness (4.24).

To check the reliability of the other questions under both C2 and C3 conditions, a control question was added: “The IVT did not add any value to my experience of rewatching the video”. It obtained strong disagreement under C2 (AVG: 1.57, SD:1.03) and disagreement under C3 (AVG:2.14, SD:1.28). This is consistent with the slightly higher perceived usefulness of the augmented transcript compared to the concept graph, lending some support to the reliability of the results. Moreover, the result for UI suitability under the C3 condition (AVG: 3.95) further confirms the room for improvement for C3.

Note, however, that the averages compared in this discussion with regard to the rewatch context come from different sets of questions, as explained in Section 5.2, thus their match can be used as a general indication, while their worth is in the information brought by each specific question, widely analyzed in that Section.

Considering RQ3 (How is the usability of the Edurell Video Augmentation platform perceived?), the results from the Usability tests show an overall excellent usability according to SUS, but lower ratings when considering PSSUQ, which splits the results into three dimensions. In particular, the lowest result is obtained for the user interface. This confirms what was discussed above for the map of concepts’ flow and related tools. This outcome is consistent with the results from the Expert Evaluation, which provided suggestions for improving the user interface, emphasizing in particular the need for more user control and for system feedback about the functions available in the tool. This seems particularly critical for the Map of concepts’ flow, which includes more features and is a kind of tool students are not accustomed to.

It should also be noted that the user study did not take into account users with learning difficulties or disabilities. While both the augmented transcript and the concept graph could help in some cases, and indeed are widely adopted, depending on the disability they could be a further barrier. This is especially true for the concept graph, which requires interactions that could be problematic for people with motor limitations and for people with sight problems who rely on screen readers.

Finally, it is worth noting that, while the learning outcome itself was not the focus of this paper, since the goal was to assess the user experience and the usability of the UI, the use of Comprehension Questions in the procedure forced the students to focus their attention on the learning task. This produced a realistic usage scenario in which the actual user experience and usability could be tested more properly. In this line of reasoning, considering that efficiency is about the resources (not only time but also effort) needed by users to achieve their goals, the perception of the tools as useful for learning suggests an improvement in the learning process. This is also supported by the authors’ previous studies in the first watch scenario, which showed a significant positive effect of both the augmented transcript and the concept graph on comprehension [46].

6 Conclusions

In this paper, the Edurell Video Augmentation platform was presented together with the evaluation of the user experience and usability with HCI experts and with users in the learning contexts of first time viewing (first watch) of a video lecture and second time viewing (rewatch). The platform includes several interactive visual tools designed with the goal of improving video-based learning by providing a structure to video contents and enhanced hypervideo capabilities. The platform is part of a wider project that aims at automating the extraction of concepts and relations from video lectures, representing them as a semantic knowledge graph, and exploiting the knowledge graph to build interactive hypervideo tools for learning support.

The Expert study presented in this paper highlighted some issues concerning the usability of the Edurell Video Augmentation platform, which were partially addressed before undertaking the user study.

The User study involved the students of a university course and it was run as part of a class activity in order to obtain more reliable results in terms of learning experience. In order to avoid, or at least reduce, complacency biases, the activity was carried out anonymously.

The findings from the user study confirm the results from almost two decades of literature about the need for tools that can improve the experience of video-based learning by enhancing interaction and navigation capabilities and by breaking linear, sequential viewing. Specifically, these findings confirm that this is needed in particular in the rewatch learning context. Moreover, the findings show that the Augmented transcript and the Map of Concepts’ flow tools are perceived as useful tools to address such needs and that the overall user experience is good. However, in order to improve it, especially in the rewatch context, some usability interventions are needed.

The results achieved through this experimentation give directions for the development of the system. Specifically, future work is planned to improve the platform functionality and usability, addressing the issues highlighted by the experts in their evaluation and taking into account the results of the user study, with specific regard to the improvements needed for the Map of Concepts’ flow. In addition, the evaluation needs to be extended in order to study the effect of the considered IVTs and the whole Edurell interface on students in various domains and with different learning tasks, after addressing the issues that emerged from this user study and the expert evaluation. By automating the development of this kind of interactive hypervideo tool, the goal is to make such tools available in educational learning environments and MOOCs.

From the research side, this work highlights that future studies are still needed to investigate new methods and tools to enhance existing learning environments, especially focusing on the possibility of developing video augmentation services that break the linear-sequential structure of video lectures and are able to meet the specific needs of learners. The current platform offers dynamic and interactive tools that help learners customize their learning experience and track their progress, but it does not offer adaptive and personalized tools. In this respect, a possible plan for future work is to design and make available adaptive and personalized features to support users in different learning tasks.