Abstract
The number of Institutional Repositories (IRs) as part of universities’ Digital Libraries (DLs) has been growing in the past few years. However, most IRs are not widely used by the intended end users. To increase users’ acceptance, evaluating the IR interface is essential. In this research, the main focus is to evaluate the usability of one type of IR interface, following the method of Nielsen’s heuristics, to uncover usability problems for development purposes. To produce a reliable list of usability problems with the heuristic evaluation approach, we examine the impact of experts and novices on the reliability of the results. From the individual heuristic analyses (by both experts and novices), we distilled 66 usability problems, which are classified by their severity. The results show that both experts and non-experts can uncover usability problems, and we analyze the differences between these types of assessors. Experts tend to reveal more serious problems while novices uncover less severe problems. Interestingly, the best evaluator is a novice who found 21 % of the total number of problems. The ability to find difficult and easy problems is recorded for both types of evaluators. Therefore, we cannot rely on one evaluator even if the evaluator is an expert. In addition, the frequency with which each heuristic is violated is used, together with the severity level, to assign priority to the uncovered usability problems. The results of the heuristic evaluation will benefit the university by improving the user interface and encouraging users to use the library services.
1 Introduction
The user interface of Open Access (OA) repositories has an effect on their users’ performance and satisfaction. To add to the ongoing development of these types of repositories, usability evaluations need to be carried out on the user interface. There are two foci of this research: to evaluate the usability of the interface of Institutional Repositories, as part of universities’ digital libraries, using Nielsen’s heuristics to uncover usability problems, and to examine the differences between user-interface experts and non-experts in uncovering problems with the interface.
1.1 What Is Usability?
In 1998, the term “user friendly” had become vague and subjectively defined, which led to the use of the term “usability” instead [1]. The International Standards Organization (ISO) [26] defines usability as “the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use.”
Nielsen [2] suggests that usability cannot be measured along a single dimension; he associates it with five attributes: learnability, memorability, efficiency, error recovery, and satisfaction. Hix and Hartson [3] suggest that usability relies on first impression, initial performance, long-term performance, and user satisfaction. Booth [4] and Brink et al. [5] share similar viewpoints, defining usability in terms of effectiveness, efficiency, ease of learning, a low error rate, and pleasantness. Nielsen’s and ISO’s usability definitions are the most widely used [6, 27, 28].
1.2 Usability Evaluations
Evaluation is considered a basic step in the iterative design process. There are a variety of approaches to evaluating usability, including formal usability inspection by Kahn and Prail [7], the cognitive walkthrough by Wharton et al. [8], heuristic evaluation by Nielsen [2, 9, 10], Contextual Task Analysis [11], and paper prototyping by Lancaster [12].
1.3 What Are Institutional Repositories?
Institutional repositories are popular among universities worldwide [13]. Because an IR is a channel through which a university structures its contribution to the global community, the university has a responsibility to reassess both its culture and its policy, and their relationship to one another [14].
Over the past fifteen to twenty years, research libraries have been used to create, store, manage, and preserve scholarly documents in digital form and to make these documents available online via digital Institutional Repositories [15]. IRs host various types of documents [15]. An example of an IR platform is DSpace [16]. In 2000, the Hewlett-Packard Company (HP) and MIT Libraries were authorized to build DSpace cooperatively as an Institutional Repository for hosting the intellectual output of “multi-disciplinary” organizations in digital formats [17].
1.4 Nielsen’s List of Heuristics
The set of heuristics was constructed from a number of usability aspects and interface guidelines [18]. The heuristics are visibility of the system status, match between system and the real world, user control and freedom, consistency and standards, error prevention, recognition rather than recall, flexibility and efficiency of use, aesthetic and minimalist design, help users recover from errors, and help and documentation [9].
2 Related Work
Ping, Ramaiah, and Foo [19] tested the user interface of the Gateway to Electronic Media Services (GEMS) system at the Nanyang Technological University. The researchers’ goal was to apply Nielsen’s heuristics to find strengths and weaknesses of the system. They determined that the heuristic evaluation helped to uncover major problems, such as users being unable to obtain the desired search results. The researchers concluded that the uncovered problems show that the GEMS system needs further development.
Qing and Ruhua [20] point out that the usability evaluation of Discipline Repositories (DRs) offers digital library developers a critical understanding of four areas: understanding the target users’ needs, finding design problems, creating a focus for development, and establishing the acceptability of such an interactive educational technology. The three DRs evaluated were arXiv, PMC (PubMed Central), and E-LIS. The three DRs differ in their subject domains and their design structures. The findings show that DRs inherit some of the already successful features from DLs. The three repositories provide limited ways, in their advanced search tools, to display and refine the search results.
Hovater et al. [21] examined the Virtual Data Center (VDC) interface, which is classified as an open access web-based digital library. VDC collects and manages research in the social science field. The researchers conducted a usability evaluation followed by user testing. They found minor and major problems that included “lack of documentation, unfamiliar language, and inefficient search functionality”.
Zimmerman and Paschal [15] examined the digital collection of Colorado State University by having participants complete tasks that focused only on the search functions of the website. The talk-aloud approach was used to observe participants. The researchers found that two-fifths of users had problems downloading documents, which would discourage them from using the service. The findings suggest that the interface should be evaluated periodically to ensure the usability of its features.
Zhang et al. [22] evaluated three operational digital libraries: the ACM DL, the IEEE Computer Society DL, and the IEEE Xplore DL. Heery and Anderson [23] conducted a review to form a report on Digital Repositories for repository software developers. They note that engaging users is vital when developing open access repositories.
3 Heuristic Evaluation Study
The heuristic evaluation study was conducted on a DSpace repository that extends the university library services and enables users to browse the university’s collections and academic scholarly output. Our focus on evaluating Institutional Repositories (IRs) is motivated by the need to attend to the usability of the interface, given that usability evaluation of IRs is still fairly new. The research objectives of evaluating the university repository interface include:
- To determine the usability problems of the University Repository Interface.
- To provide solutions and guidelines regarding the uncovered problems.
- To provide the development team in the University with the suggested solutions to be used in the iterative design process for development purposes.
Two key aspects are investigated: do the expertise and the number of evaluators affect the reliability of the results from applying the heuristic evaluation to the user interface? To answer the first of these questions, we consider the following hypotheses:
- Severe problems will be uncovered by experts while minor problems will be uncovered by novices.
- Difficult problems can only be uncovered by experts, and easy problems can be uncovered by both experts and novices.
- The best evaluator will be an expert.
- As Nielsen and Mack [24] reported for the traditional heuristic evaluation, experts will tend to produce better results than novices.
- The average number of problems uncovered by experts and novices will differ. Experts are expected to find more problems than novices.
To answer the second question, whether the number of evaluators affects the reliability of the results, we consider the following hypotheses:
- A small set of evaluators (experts) can find about 75 % of the problems in the user interface, as Nielsen and Mack [24] suggest.
- More of the serious problems will be uncovered by the group (experts or non-experts) with the most members.
3.1 Participants
To produce a reliable list of usability problems, having multiple evaluators is better than having only one because different people uncover different problems from different perspectives. A total of 16 participants, all university students, were recruited and divided into three groups: 10 regular experts, 4 amateurs, and 2 novices.
3.2 Tasks
The tasks were designed around the most important elements of the interface, which were identified from the results of a previous study on user personas [29]. Each task description covers the following:
- The goal of the task;
- The type of the task (regular, important, or critical);
- The actual steps that a typical user would follow to perform the task;
- The possible problems that users might face while performing the task;
- The time an expert needs to reach the goal;
- The scenario.
3.3 Methodology
We started by conducting a tutorial lecture about the heuristics and how evaluators should apply them to the interface dialogs during the evaluation session. Because examples are usually better than lecturing alone, the researcher explained each heuristic’s main concept and gave examples. This was meant to help evaluators carry out the evaluations without problems when referring to the heuristics. Evaluators who had not performed a heuristic evaluation before were required to attend the lecture to increase their knowledge of the heuristics and the overall method. Evaluators with experience in heuristic evaluation did not need to review the heuristics, but they did need to be trained in using the interface. The objective of the lecture was therefore to increase evaluators’ knowledge of how to apply the heuristics.
The study lasted for 120 min. Participants started with the training session followed by the evaluation session. Then the severity rating was assigned for each uncovered usability problem. Finally, the solutions session was conducted to discuss problems and propose guidelines for the uncovered problems.
4 Results and Discussion
4.1 Problems Report
A report that describes each uncovered problem was delivered to the developers of the University DSpace for development purposes. We believe that uncovering these problems would benefit any university that utilizes a DSpace Repository as part of the digital library to maintain its scholarly output.
4.2 Number of Problems
For each problem and evaluator, data were coded as 1 for detected and 0 for not detected. Table 1 shows that the average number of problems found by experts was 6.8, while the average number of problems found by amateurs and novices was 3.5 and 2.5, respectively. There is no significant difference between the means (F(2,13) = 3.205, p < .075), with an effect size of η2 = .330. Some would say that the effect is marginal. The lack of significance, combined with the reasonable effect size, is likely due to the small sample sizes.
As would be expected, the largest differences were found between Experts and Novices. However, further analyses indicated that Experts were not different from Amateurs (F(1,12) = 3.639, p < .081, η2 = .233), that Experts were not different from Novices (F(1,10) = 3.141, p < .107, η2 = .239) and that Amateurs were not different from Novices (F(1,4) = 1.333, p < .970, η2 = .195).
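For a one-way comparison, the effect size can be recovered from the F statistic and its degrees of freedom through the identity η2 = (F × df_effect) / (F × df_effect + df_error). The sketch below is only an illustrative check, not the analysis script used in the study, and the function name `eta_squared` is our own. It applies the identity to the omnibus test and to the two expert comparisons reported above.

```python
def eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Effect size implied by a one-way ANOVA F statistic.

    Rewrites eta^2 = SS_effect / (SS_effect + SS_error) in terms of F,
    using F = (SS_effect / df_effect) / (SS_error / df_error).
    """
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F statistics and degrees of freedom as reported in the text.
reported = {
    "all three groups": (3.205, 2, 13),
    "experts vs. amateurs": (3.639, 1, 12),
    "experts vs. novices": (3.141, 1, 10),
}

for label, (f_value, df_effect, df_error) in reported.items():
    print(f"{label}: eta^2 = {eta_squared(f_value, df_effect, df_error):.3f}")
```

Rounded to three decimals, these reproduce the reported effect sizes of .330, .233, and .239.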
Not surprisingly, the best evaluator was an expert (evaluator ID 10), who found 21 % of all problems (note that the total number of problems is the final number after applying the aggregation process, not including the “non-issues”). However, the best amateur found only 7.6 % of the total and the best novice found only 4.5 % of the total. The worst expert, amateur, and novice each found just 3 % of the total.
4.3 The Severity of Uncovered Problems
Of the 66 problems, 17 were classified as Catastrophic (Level 4), 17 as Major (Level 3), 21 as Minor (Level 2) and 11 as Cosmetic (Level 1). Minor problems were the most common, but this difference was not significant using a chi-square analysis (χ2(3) = 3.09, p < .377). The lack of more severe (major or catastrophic) problems is likely attributable to the fact that the DSpace website has been in use for a number of years. It is likely that the majority of major and catastrophic problems have been uncovered and fixed.
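The chi-square statistic above can be reproduced from the four severity counts. The sketch below assumes that the null hypothesis is an equal number of problems at each severity level (the text does not state the expected frequencies, but this assumption yields the reported statistic). The function names are illustrative, and the tail probability uses the closed form that holds only for three degrees of freedom.

```python
import math


def chi_square_gof(observed):
    """Goodness-of-fit statistic against equal expected frequencies."""
    expected = sum(observed) / len(observed)
    return sum((count - expected) ** 2 / expected for count in observed)


def chi_square_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Catastrophic, Major, Minor, Cosmetic counts from the text.
severity_counts = [17, 17, 21, 11]

statistic = chi_square_gof(severity_counts)
p_value = chi_square_sf_df3(statistic)
print(f"chi2(3) = {statistic:.2f}, p = {p_value:.3f}")
```

With an expected count of 16.5 per level, the statistic is 51 / 16.5, or about 3.09, in line with the value reported above.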
4.4 The Severity by Expertise Interaction
Nielsen [18] suggested that usability specialists are better at uncovering problems than novices. To examine that, we compared the types of usability problems that were uncovered by both experts and novices. Each level of severity (Catastrophic, Major, Minor, Cosmetic, not including Non-Issues) was considered in isolation. The full analysis is a mixed ANOVA with one between-subjects factor (Groups) and one within-subjects factor (Severity). This analysis indicated that there were no differences for Groups (F(2,13) = 3.205, p < .075, η2 = .330, as noted above), no differences for Severity (F(3,39) = 1.375, p < .698, η2 = .051) and no interaction (F(3,39) = 0.521, p < .265, η2 = .039). However, one must again be mindful of the small sample sizes. The means are provided in Table 2 and Fig. 1.
Because we were more concerned about the severity of the problems found by each group of evaluators, specific tests for each level of severity were computed. For Catastrophic problems (Level 4), the number of problems detected by Experts was higher than the number of problems detected by Amateurs and Novices, but the difference was not significant. Further analyses revealed that Experts were not different from Amateurs (F(1,12) = 3.377, p < .091, η2 = .220), that Experts were not different from Novices (F(1,10) = 0.714, p < .418, η2 = .067) and that Amateurs were not different from Novices (F(1,4) = 1.333, p < .312, η2 = .250). The same results held for Major (Level 3), Minor (Level 2) and Cosmetic (Level 1) problems. For Major problems, Experts were not different from Amateurs (F(1,12) = 2.455, p < .143, η2 = .170), Experts were not different from Novices (F(1,10) = 4.276, p < .127, η2 = .217), and Amateurs were not different from Novices (F(1,4) = 0.333, p < .506, η2 = .118). For Minor problems, Experts were not different from Amateurs (F(1,12) = 0.489, p < .498, η2 = .039), Experts were not different from Novices (F(1,10) = 0.542, p < .478, η2 = .051) and Amateurs were not different from Novices (F(1,4) = 0.038, p < .855, η2 = .009). Finally, for Cosmetic problems, Experts were not different from Amateurs (F(1,12) = 0.023, p < .822, η2 = .002), Experts were not different from Novices (F(1,10) = 0.437, p < .524, η2 = .042), and Amateurs were not different from Novices (F(1,4) = 1.091, p < .355, η2 = .214).
4.5 Does One Need Experts, Amateurs, and Novices?
Even though the differences were not significant, Experts consistently found more problems than Amateurs, and Amateurs consistently (excepting catastrophic) found more problems than Novices.
Clearly, it would seem that experts will find “most” of the problems, and experts will find more of the serious problems. However, the simple presentation of Table 3 is confounded by the fact that there were more Experts (n = 10) than Amateurs (n = 4) or Novices (n = 2). That is, having more people implies that more problems can be found. As such, the analysis presented in Table 7 is a better measure of the capabilities of a single evaluator. However, these data do provide the opportunity to estimate the number of evaluators in each category that would be required to find all problems. That is, using simple linear extrapolation (i.e., a ratio), as shown in Table 4, one could conclude that it would require 17 Novices, or 24 Amateurs, or 12 Experts to find all the Catastrophic problems.
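The ratio behind this kind of linear extrapolation can be sketched as follows. The function name and the example count are hypothetical: the per-group counts of Table 3 are not reproduced here, so the call below uses an assumed group of two evaluators who together found 2 of the 17 Catastrophic problems. The estimate also assumes that every additional evaluator contributes new problems at the same rate, which ignores overlap between evaluators.

```python
def evaluators_needed(group_size: int, problems_found: int, total_problems: int) -> float:
    """Linear (ratio) estimate of how many evaluators would find all problems.

    Assumes each evaluator adds new problems at the group's average rate,
    with no overlap between evaluators.
    """
    if problems_found <= 0:
        raise ValueError("the group must have found at least one problem")
    return group_size * total_problems / problems_found


# Hypothetical example: 2 evaluators jointly found 2 of 17 problems.
estimate = evaluators_needed(group_size=2, problems_found=2, total_problems=17)
print(estimate)  # 17.0
```

Because overlap between evaluators is ignored, an estimate of this kind is best read as a rough lower bound on the number of evaluators needed.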
Implications: This is consistent with the notions of Nielsen [25]. The severity of problems uncovered by experts is higher than the severity of problems uncovered by the novices. Hence, one could conclude that a small set of expert evaluators is needed to find severe usability problems.
4.6 Difficulty of Uncovering Problems
The performance of evaluators can be rated according to the difficulty of uncovering problems in the DSpace interface. We mean that an Easy problem is one that is found by many evaluators, whereas a Hard problem is one that is found by a few evaluators, or even just one evaluator.
One can also rate the ability of each evaluator to find usability problems from Good to Poor. An evaluator who found many problems would have high ability whereas an evaluator who found few problems would have low ability. These two factors were investigated.
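Both ratings can be read off a detection matrix coded 1 for detected and 0 for not detected. The sketch below uses a small made-up matrix (four problems by four evaluators) and hypothetical function names; it is not the study's 66 by 16 data set.

```python
# Rows are problems, columns are evaluators; 1 = detected, 0 = not detected.
# The values are invented for illustration only.
detections = [
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 1, 0, 0],
]


def finders_per_problem(matrix):
    """Number of evaluators who found each problem (low count = hard problem)."""
    return [sum(row) for row in matrix]


def problems_per_evaluator(matrix):
    """Number of problems each evaluator found (high count = high ability)."""
    return [sum(column) for column in zip(*matrix)]


print(finders_per_problem(detections))     # [4, 1, 3, 1]
print(problems_per_evaluator(detections))  # [2, 4, 1, 2]
```

In this toy matrix the second and fourth problems are the hard ones (found by a single evaluator), and the second evaluator has the highest ability.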
Some might think that only experts can uncover difficult problems and that both experts and novices can uncover easy problems. This raises three questions: do experts, who are presumed to have a high ability to uncover problems, find only difficult problems? Do novices uncover only easy problems? Most importantly, can novices, who are presumed to have lower ability, find difficult problems? To address these questions, Fig. 2 summarizes the ability of evaluators to uncover problems. The blue diamonds represent the Novices, the red squares represent the Amateurs and the green triangles represent the Experts. Each row represents one of the 66 problems, and each column represents one of the 16 evaluators.
We can see from Fig. 2 that the types of evaluators are fairly interspersed. In this, one must be mindful of the fact that there are ties (e.g., three evaluators found 2 problems, two found 3, 4 and 5, three found 6, and one found 7, 9, 10 and 16). However, in the top rows, one can see that both Amateurs and Experts found the hardest problems, and all groups found the easiest (lowest rows) problems. Generally, the Experts cluster to the upper right while the Novices and Amateurs cluster to the lower left.
4.7 The Violated Heuristics and Type of Problems
It was essential to investigate the number of times each heuristic was violated. Figure 2 provides the same information graphically.
Figure 3 shows the recommended priority levels for violated heuristics, starting with problems associated with heuristics 4, 8, 3, 5, and 7, respectively (Fig. 4).
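A frequency-based ranking of this kind can be sketched with a simple tally. The heuristic numbering follows the order in which the heuristics are listed in Sect. 1.4; the list of violations is hypothetical and was chosen only so that the resulting order matches the one stated above. The actual counts from the study are not reproduced here.

```python
from collections import Counter

# Heuristics numbered in the order listed in Sect. 1.4.
HEURISTICS = {
    1: "Visibility of the system status",
    2: "Match between system and the real world",
    3: "User control and freedom",
    4: "Consistency and standards",
    5: "Error prevention",
    6: "Recognition rather than recall",
    7: "Flexibility and efficiency of use",
    8: "Aesthetic and minimalist design",
    9: "Help users recover from errors",
    10: "Help and documentation",
}

# Hypothetical data: the heuristic violated by each reported problem.
violations = [4, 8, 4, 3, 4, 8, 5, 3, 8, 4, 7, 5, 3, 8, 4]

# Most frequently violated heuristics come first.
ranking = Counter(violations).most_common()

for priority, (number, count) in enumerate(ranking, start=1):
    print(f"Priority {priority}: heuristic {number} ({HEURISTICS[number]}), {count} violations")
```

A fuller prioritization would also weigh the severity level of each problem, as the paper describes; this sketch ranks by frequency alone.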
4.8 Duplicate Problems with Different Severity Ratings
In some cases, two or more evaluators found the same problem but assigned different severity ratings to it. For the purpose of analysis, we treated such duplicates as new problems under each problem category, with a clear indication that these problems are duplicates.
5 Conclusion
Two main contributions were derived from the heuristic evaluation study. First, we added to the literature by incorporating the results from a previous study on user personas to focus on important elements of the interface and to study users’ needs. Second, we extended the traditional heuristic evaluation by separating the sessions and adding a new session, the proposed solutions session. The results show that applying the heuristic evaluation to DSpace uncovered a large number of usability problems that, if fixed, will improve the service. The findings suggest a list of usability problems classified by their severity ratings. Two key aspects were investigated: do the expertise and the number of evaluators affect the reliability of the results from applying the heuristic evaluation to the University DSpace user interface? To relate the initial hypotheses to the findings, we examined the evaluators’ performance according to three factors: the number of problems found by each evaluator, the severity of the uncovered problems, and the difficulty of uncovering them. The best evaluator among the group of evaluators (both experts and novices) is an amateur who found 21 % of the total number of problems. The best expert found 13 %. This contradicts the initial hypothesis that the best evaluator will be an expert. From this, we conclude that a single evaluator cannot find all the usability problems even if this evaluator is an expert, which agrees with Nielsen’s suggestion (1994) that it is advisable to have more than one evaluator inspect the interface. Nielsen found that one evaluator can find 35 % of the usability problems in a user interface, whereas in our study the best evaluator uncovered 21 % of the total number of problems. We conclude that the majority of the problems found by experts were serious (catastrophic and major).
Finally, we believe that applying the heuristic evaluation methodology to Institutional Repositories that are part of Digital Libraries and based on DSpace would uncover usability problems and, if these are fixed, increase the libraries’ usability.
References
1. Bevan, N., Kirakowski, J., Maissel, J.: What is usability? In: Proceedings of the 4th International Conference on HCI, September 1991
2. Nielsen, J.: Usability Engineering. Academic Press, Boston (1993)
3. Hix, D., Hartson, H.R.: Developing User Interfaces: Ensuring Usability Through Product & Process. Wiley, New York (1993)
4. Booth, P.A.: An Introduction to Human-Computer Interaction. Psychology Press, Hove (1989)
5. Brink, T., Gergle, D., Wood, S.D.: Design Web Sites That Work: Usability for the Web. Morgan Kaufmann, San Francisco (2002)
6. Jeng, J.: What is usability in the context of the digital library and how can it be measured? Inf. Technol. Libr. 24(2), 47–56 (2005)
7. Kahn, M.J., Prail, A.: Formal usability inspections. In: Nielsen, J., Mack, R.L. (eds.) Usability Inspection Methods, pp. 141–172. Wiley, New York (1994)
8. Wharton, C., Rieman, J., Lewis, C., Polson, P.: The cognitive walkthrough method: a practitioner’s guide. In: Nielsen, J., Mack, R.L. (eds.) Usability Inspection Methods, pp. 105–140. Wiley, New York (1994)
9. Nielsen, J.: How to conduct a heuristic evaluation (1994). http://www.useit.com/papers/heuristic/heuristic_evaluation.html
10. Nielsen, J.: The Usability Engineering Lifecycle. Academic Press, Boston (1993)
11. Usability Methods: Contextual Task Analysis: Usability First (Accessed 2 April 2013). http://www.usabilityfirst.com/usability-methods/contextual-task-analysis
12. Lancaster, A.: Paper prototyping: the fast and easy way to design and refine user interfaces. IEEE Trans. Prof. Commun. 47(4), 335–336 (2004)
13. Bailey Jr., C.W.: Institutional repositories, tout de suite (2008)
14. Lynch, C.A.: Institutional repositories: essential infrastructure for scholarship in the digital age. ARL no. 226, pp. 1–7, February 2003. http://www.arl.org/resources/pubs/br/br226/br226ir.shtml
15. Zimmerman, D., Paschal, D.B.: An exploratory usability evaluation of Colorado State University Libraries digital collections and the Western Waters Digital Library web sites. J. Acad. Librarianship 35(3), 227–240 (2009)
16. DSpace (2012). http://dspace.org
17. Smith, M., Barton, M., Bass, M., Branschofsky, M., McClellan, G., Stuve, D., Walker, J.H.: DSpace: an open source dynamic digital repository (2003). http://www.dlib.org/dlib/january03/smith/01smith.html
18. Nielsen, J., Molich, R.: Heuristic evaluation of user interfaces. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 249–256. ACM (1990). http://doi.acm.org/10.1145/97243.97281
19. Ping, L.K., Ramaiah, C.K., Foo, S.: Heuristic-based user interface evaluation at Nanyang Technological University in Singapore. Program 38(1), 42–59 (2004)
20. Qing, F., Ruhua, H.: Evaluating the usability of discipline repositories. In: 2008 IEEE International Symposium on IT in Medicine and Education, ITME 2008, pp. 385–390. IEEE, December 2008
21. Hovater, J., Krot, M., Kiskis, D.L., Holland, I., Altman, M.: Usability testing of the virtual data center. Ann Arbor 1001, 48109–2122 (2002)
22. Zhang, X., Liu, J., Li, Y., Zhang, Y.: How usable are operational digital libraries: a usability evaluation of system interactions. In: Proceedings of the 1st ACM SIGCHI Symposium on Engineering Interactive Computing Systems, pp. 177–186. ACM, July 2009
23. Heery, R., Anderson, S.: Digital Repositories Review (2005). http://www.jisc.ac.uk/uploaded_documents/digitalrepositories
24. Nielsen, J., Mack, R.L. (eds.): Usability Inspection Methods, pp. 203–233. Wiley, New York (1994)
25. Nielsen, J., Hackos, J.T.: Usability Engineering, vol. 125184069. Academic Press, Boston (1993)
26. Nielsen, J.: Enhancing the explanatory power of usability heuristics. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems: Celebrating Interdependence, pp. 152–158. ACM, April 1994
27. Jeng, J.: Usability assessment of academic digital libraries: effectiveness, efficiency, satisfaction, and learnability. LIBRI 55(2–3), 96–121 (2005)
28. International Standards Organization ISO 9001: 1994 (E): Quality systems: Model for quality assurance in design, development, production, installation and servicing. ISO, Geneva (1994)
29. Aljohani, M., Blustein, J.: Personas help understand users’ needs, goals and desires in an online institutional repository. Int. Sci. 9(1) (2014)
Acknowledgment
This research was supported and funded by the Saudi Cultural Bureau in Ottawa-Saudi Royal Embassy. Special thanks to the supervisor, Dr. J. Blustein, for the valuable comments.
© 2015 Springer International Publishing Switzerland
Aljohani, M., Blustein, J. (2015). Heuristic Evaluation of University Institutional Repositories Based on DSpace. In: Marcus, A. (ed.) Design, User Experience, and Usability: Interactive Experience Design. DUXU 2015. Lecture Notes in Computer Science, vol 9188. Springer, Cham. https://doi.org/10.1007/978-3-319-20889-3_12
DOI: https://doi.org/10.1007/978-3-319-20889-3_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-20888-6
Online ISBN: 978-3-319-20889-3