E-learning interventions are comparable to user's manual in a randomized trial of training strategies for the AGREE II
Practice guidelines (PGs) are systematically developed statements intended to assist in patient and practitioner decisions. The AGREE II is the revised tool for PG development, reporting, and evaluation, comprised of 23 items, two global rating scores, and a new User's Manual. In this study, we sought to develop, execute, and evaluate the impact of two internet interventions designed to accelerate the capacity of stakeholders to use the AGREE II.
Participants were randomized to one of three training conditions. 'Tutorial'--participants proceeded through the online tutorial with a virtual coach and reviewed a PDF copy of the AGREE II. 'Tutorial + Practice Exercise'--in addition to the Tutorial, participants also appraised a 'practice' PG. For the practice PG appraisal, participants received feedback on how their scores compared to expert norms and formative feedback if scores fell outside the predefined range. 'AGREE II User's Manual PDF (control condition)'--participants reviewed a PDF copy of the AGREE II only. All participants evaluated a test PG using the AGREE II. Outcomes of interest were learners' performance, satisfaction, self-efficacy, mental effort, time-on-task, and perceptions of AGREE II.
No differences emerged between training conditions on any of the outcome measures.
We believe these results can be explained by better than anticipated performance of the AGREE II PDF materials (control condition) or the participants' level of health methodology and PG experience rather than the failure of the online training interventions. Some data suggest the online tools may be useful for trainees new to this field; however, this requires further study.
Keywords: Training Condition; Training Intervention; Practice Exercise; Expertise Reversal Effect; Prior Knowledge Learner
Evidence-based practice guidelines (PGs) are systematically developed statements aimed at assisting clinicians and patients to make decisions about appropriate healthcare for specific clinical circumstances and to inform decisions made by policy makers [2, 3, 4]. While PGs have been shown to have a moderate impact on behavior, their potential for benefit is only as good as the PGs themselves [6, 7, 8]. The AGREE II, a revised version of the original tool, is an instrument designed to direct the development, reporting, and evaluation of PGs [10, 11, 12, 13]. The AGREE II consists of 23 items grouped into six quality domains, two overall assessment items, and extensive supporting documentation to facilitate its appropriate application (i.e., User's Manual).
International adoption of the original AGREE Instrument and interest in the revised version have been significant and attest to the potential value of this tool. The AGREE II was designed for many different types of users and for users with varied expertise. Given the breadth and heterogeneity of the AGREE II's stakeholder group, efforts to promote and facilitate its application are complex. The internet is a key medium to reach a vast, varied, and global audience. However, passive internet dissemination alone, even with a primed and interested audience, will not fully optimize its application and use. Our interest was to explore educational interventions and to leverage technical platforms to accelerate an effective application process.
E-learning (internet-based training) provides a potentially effective, standardized, and cost-efficient model for training in the use of AGREE II. A recent meta-analysis and systematic review showed large effect sizes for internet-based instruction (clinical and methodological content areas) on outcomes with health-profession learners [15, 16]. Improved learning outcomes seemed to be associated with designs that included interactivity, practice exercises, repetition, and feedback. Thus, e-learning appeared to be a promising solution for our context. While the evidence base underpinning the efficacy and design principles of e-learning training materials is well established [17, 18, 19, 20, 21, 22, 23], there remain questions regarding the optimal application and combination of these principles for particular interventions. In this study, we wanted to design and test two e-learning interventions, a tutorial alone versus a tutorial plus an interactive practice exercise, against a more traditional learning form to determine their impact on outcomes related to the AGREE II.
Our primary research question is: compared to reading the User's Manual alone, does the addition of an online tutorial program, with or without a practice exercise with feedback, improve learners' performance and increase learners' satisfaction and self-efficacy with the AGREE II? Based on the results of systematic reviews [15, 16], we hypothesized that the training platform that included the tutorial plus the practice exercise with feedback would be superior to the User's Manual alone. For exploratory purposes, we also examined whether differences existed across the outcome measures between the two e-learning intervention groups.
This study was funded by the Canadian Institutes of Health Research and received ethics approval from the Hamilton Health Sciences/Faculty of Health Sciences Research Ethics Board (REB #09-398; McMaster University, Hamilton, Ontario, Canada). Key evidence-based principles in the science of technical training, multimedia learning, and cognitive psychology were used to develop the two training platforms [17, 18, 19, 20, 21, 22, 23].
Study design and intervention
Tutorial
Participants received access to a password-protected website where they were presented with a seven-minute multimedia tutorial presentation with an overview of the AGREE II conducted by a 'virtual coach.' Following the tutorial, the participants were granted access to a PDF copy of the AGREE II and were instructed to review the User's Manual before proceeding to the test PG.
Tutorial + practice exercise
Participants received access to a password-protected website where they received the same tutorial presentation described above and access to the AGREE II User's Manual. They were then presented with the practice exercise that required participants to read a sample or 'practice' PG and appraise it using the AGREE II. Upon entering each AGREE II score, participants were provided immediate feedback on how their score compared to the mean of four experts. If their score fell outside a predefined range, participants received two-stage formative feedback to guide the appraisal process. At the conclusion of their review, participants received a summary of their performance in appraising the practice PG compared to expert norms. Participants then proceeded to read and appraise the test PG.
AGREE II User's Manual PDF (control condition)
Participants assigned to the control condition received PDF copies of the AGREE II User's Manual for review before proceeding to the test PG. The User's Manual is a 56-page document. It provides an overview of the AGREE enterprise and general instructions on how to use the tool. Then, for each of the 23 core items, it presents a definition of the concept and examples, advice on where the information can be found within a PG document, and the specific criteria and considerations for scoring. It concludes with the two global rating measures.
Participants and process
Based on the sample size calculation reported in the previously published detailed protocol, we required 20 participants per group to have at least 80% power to detect a performance advantage of as little as ± 0.79 standard deviations for either of the intervention groups compared to the passive learning group. Methodologists, clinicians, policy makers, and trainees were sought from guideline programs, professional directories, and the Guidelines International Network (G-I-N) community. Because our previous research showed virtually no differences in AGREE II performance as a function of type of users, we did not account for this factor in our study design [11, 12, 13].
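As a rough check on the sample size statement above, the smallest standardized difference detectable with a given group size can be approximated from normal quantiles. The sketch below is illustrative only: the one-sided alpha of 0.05, the normal approximation, and the function name `min_detectable_effect` are assumptions, not details taken from the published protocol.

```python
from math import sqrt
from statistics import NormalDist


def min_detectable_effect(n_per_group, power=0.80, alpha=0.05, two_sided=False):
    """Approximate smallest standardized mean difference (in SD units)
    detectable when comparing two equal-sized groups.

    Normal approximation: d = (z_alpha + z_power) * sqrt(2 / n).
    The alpha level and sidedness are assumptions for illustration.
    """
    z = NormalDist()
    tail = alpha / 2 if two_sided else alpha
    z_alpha = z.inv_cdf(1 - tail)   # critical value for the test
    z_power = z.inv_cdf(power)      # quantile matching the desired power
    return (z_alpha + z_power) * sqrt(2 / n_per_group)


if __name__ == "__main__":
    print(round(min_detectable_effect(20), 2))                  # one-sided test
    print(round(min_detectable_effect(20, two_sided=True), 2))  # two-sided test
```

Under these assumptions, 20 participants per group corresponds to a detectable difference of about 0.79 standard deviations for a one-sided test and about 0.89 for a two-sided test.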
A total of 107 interested individuals registered with the Scientific Office. After receiving a letter of invitation and screening for their eligibility, 87 participants were randomized to one of the three training conditions using a computer-generated randomization sequence (1:1:1 ratio). Individuals were eligible for study participation if they had no or limited experience and exposure to the original AGREE Instrument or the AGREE II. To assess this, participants were asked to first complete an online eligibility questionnaire. Here, they were asked about the type(s) of previous experience they had with the original AGREE and AGREE II (as a tool to inform guideline development, guideline reporting, guideline evaluation, and other) and the extent of this experience (never, 1 to 5 guidelines, 6 to 10 guidelines, 11 to 15 guidelines, 16 to 20 guidelines, 20+ guidelines). They were also asked if they had participated in any AGREE-related research study previously (yes, no, uncertain). Participants who answered they had not participated in an AGREE-related research study and who had little to no AGREE or AGREE II experience (defined as never using either instrument or using it on a maximum of 1 to 5 guidelines) were eligible to participate.
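The eligibility rule described above can be expressed as a small screening function. This is a minimal sketch, not the study's actual screening code: the function name, the argument layout (one extent-of-experience answer per type of use), and the handling of an "uncertain" answer as ineligible are assumptions made for illustration.

```python
# Extent-of-experience categories offered on the eligibility questionnaire.
EXPERIENCE_LEVELS = (
    "never",
    "1 to 5 guidelines",
    "6 to 10 guidelines",
    "11 to 15 guidelines",
    "16 to 20 guidelines",
    "20+ guidelines",
)

# "Little to no experience" is defined as never, or at most 1 to 5 guidelines.
ELIGIBLE_LEVELS = {"never", "1 to 5 guidelines"}


def is_eligible(prior_agree_study, agree_experience, agree_ii_experience):
    """Return True if a respondent meets the screening criteria.

    prior_agree_study: "yes", "no", or "uncertain".
    agree_experience / agree_ii_experience: one answer per type of use
    (development, reporting, evaluation, other), each from EXPERIENCE_LEVELS.
    Assumption: only an explicit "no" to prior study participation is accepted.
    """
    answers = list(agree_experience) + list(agree_ii_experience)
    unknown = [a for a in answers if a not in EXPERIENCE_LEVELS]
    if unknown:
        raise ValueError(f"Unrecognized experience level(s): {unknown}")
    if prior_agree_study != "no":
        return False
    return all(a in ELIGIBLE_LEVELS for a in answers)


if __name__ == "__main__":
    print(is_eligible("no", ["never", "1 to 5 guidelines"], ["never"]))
    print(is_eligible("no", ["6 to 10 guidelines"], ["never"]))
```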
These individuals were then randomized to group and received access to an individualized password-protected web-based study platform. Participants completed their specific training intervention, evaluated one of ten test PGs using the AGREE II, and completed a series of post-test Learner's Scales and a demographics survey. Participants were blinded to the study conditions, our research questions, and hypothesis.
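A computer-generated 1:1:1 allocation sequence of the kind mentioned above could be produced as follows. The source states only that the sequence was computer generated with a 1:1:1 ratio; the use of permuted blocks of three, the seed value, and the function name are assumptions for this sketch.

```python
import random
from collections import Counter

CONDITIONS = (
    "Tutorial",
    "Tutorial + Practice Exercise",
    "User's Manual PDF (control)",
)


def allocation_sequence(n_participants, seed=2011):
    """Build a 1:1:1 allocation list using permuted blocks of three.

    Each block contains every condition once, in random order, so group
    sizes never differ by more than one at any point in the sequence.
    The seed is arbitrary and fixed only to make the sketch reproducible.
    """
    rng = random.Random(seed)
    sequence = []
    while len(sequence) < n_participants:
        block = list(CONDITIONS)
        rng.shuffle(block)
        sequence.extend(block)
    return sequence[:n_participants]


if __name__ == "__main__":
    seq = allocation_sequence(87)
    print(Counter(seq))
```

With 87 randomized participants, this scheme assigns 29 participants to each condition.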
Materials and instruments
Eleven PGs were selected for this study: one served as the practice PG for participants randomized to the Tutorial + Practice Exercise group and, to facilitate generalizability of results, the remaining ten were selected for the test PGs. Participants were randomized to one of the ten test PGs. Practice guideline was not a factor of analytic interest. Eligibility criteria for the 11 PGs are described in detail in the previously published protocol and required that PGs be English-language documents published from 2002 onward; fall within the clinical areas of cancer, cardiovascular, or critical care; be 50 pages or less; and represent a range of quality.
AGREE II performance
The AGREE II consists of survey items and a User's Manual [11, 12, 13]. Twenty-three items are grouped into six domains of PG quality: scope and purpose, stakeholder involvement, rigour of development, clarity of presentation, applicability, and editorial independence. Items are answered using a 7-point response scale ('strongly disagree' to 'strongly agree'). Standardized domain scores are calculated, enabling construction of a performance score profile that permits direct comparisons across the domains or items. The AGREE II survey items conclude with two global measures answered using a 7-point scale: one item targeting the PG's overall quality and the second targeting the appraiser's intention to use the PG. The User's Manual provides explicit direction for each of the 23 items and the two overall items, as noted above. Participant performance served as the primary outcome.
In addition to the primary outcome of performance on the test PG, a series of secondary measures, known as the Learner's scale, were also collected. This scale comprised a Learner Satisfaction scale (i.e., satisfaction with learning opportunity), a Self-Efficacy scale (i.e., belief one can succeed), a Mental Effort scale (i.e., cognitive effort to complete a task), and Time-on-Task. With the exception of Time-on-Task, which was a self-report measure, a 7-point response scale was used to answer the remaining items. The questions included in the Learner's scale were inspired by previous work done in this field [17, 18, 19, 20, 21, 22, 23]. Specific reliability and validity testing of the items and subscales was not undertaken.
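The standardized domain score mentioned above is, as we understand the convention in the AGREE II User's Manual, the obtained total expressed as a percentage of the range between the minimum and maximum possible totals for that domain. The sketch below illustrates that calculation; the function name and input layout are hypothetical.

```python
def standardized_domain_score(ratings_by_appraiser, scale_min=1, scale_max=7):
    """Scale a domain's item ratings to a 0-100% score.

    ratings_by_appraiser: one list of item ratings per appraiser, all for
    the same domain (e.g., three items for Scope and Purpose).
    Formula: (obtained - min possible) / (max possible - min possible) * 100.
    """
    n_appraisers = len(ratings_by_appraiser)
    n_items = len(ratings_by_appraiser[0])
    if any(len(r) != n_items for r in ratings_by_appraiser):
        raise ValueError("Every appraiser must rate the same number of items")

    obtained = sum(sum(r) for r in ratings_by_appraiser)
    min_possible = scale_min * n_items * n_appraisers
    max_possible = scale_max * n_items * n_appraisers
    return 100 * (obtained - min_possible) / (max_possible - min_possible)


if __name__ == "__main__":
    # One participant rating a three-item domain.
    print(round(standardized_domain_score([[5, 6, 6]]), 1))
    # Four appraisers rating the same three-item domain.
    four = [[5, 6, 6], [6, 6, 7], [2, 4, 3], [3, 3, 2]]
    print(round(standardized_domain_score(four), 1))
```

Because the score is rescaled by the number of items and appraisers, domains of different lengths and profiles built from one participant or from several experts can be placed on the same 0 to 100 scale.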
AGREE II perceptions
Participants were asked to rate the usefulness of the AGREE II (for development, reporting, and evaluation) and the User's Manual using a 7-point scale.
Demographics and AGREE II Experience scale
Participants were asked about their backgrounds including experience with the PG enterprise, the original AGREE instrument and the AGREE II.
Outcomes and analyses
Two performance measures served as the primary outcomes. First, the Performance - Distance Function measures the difference between participants' domain scores and expert norms. Expert norms were derived by members of the AGREE Next Steps research team who appraised the test PGs used in this study. Four expert appraisers rated each guideline. Mean standardized scores were used to construct the expert performance score profiles. Thus, the measure of distance (i.e., difference in scores between participants and experts) for each AGREE II domain was calculated by squaring the difference between the participants' profile domain ratings and the experts' profile domain ratings. A series of one-way analysis of variance tests were subsequently calculated to examine differences in distance function as a function of training intervention.
Second, performance was measured by examining the proportion of participants who met minimum performance competencies with the AGREE II tool. A Pass/Fail algorithm designed for another study was used here to calculate the performance level for participants randomized to the condition with the practice PG.
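The distance function and the subsequent group comparison described above can be sketched as follows. This is an illustration under stated assumptions rather than the analysis code used in the study: the function names and dictionary layout are hypothetical, and only the F statistic is computed (a p-value would additionally require the F distribution, for example from a statistics library).

```python
from statistics import mean


def squared_distance(participant_profile, expert_profile):
    """Per-domain squared difference between a participant's standardized
    domain scores and the expert mean scores for the same guideline."""
    return {
        domain: (participant_profile[domain] - expert_score) ** 2
        for domain, expert_score in expert_profile.items()
    }


def one_way_anova_f(groups):
    """F statistic for a one-way analysis of variance.

    groups: one list of distance values per training condition.
    Assumes some within-group variability (otherwise F is undefined).
    """
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


if __name__ == "__main__":
    expert = {"scope_and_purpose": 75.0, "rigour_of_development": 60.0}
    participant = {"scope_and_purpose": 80.0, "rigour_of_development": 50.0}
    print(squared_distance(participant, expert))

    # Made-up distance values for three conditions, for illustration only.
    print(one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]]))
```

Squaring the difference treats over-rating and under-rating relative to the experts alike, so smaller values indicate closer agreement with expert norms regardless of direction.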
The Learner's scale served as the core secondary measure. To this end, a series of multivariate one-way analysis of variance tests were conducted to examine differences in participants' satisfaction, self-efficacy, and mental effort as a function of training intervention. A series of analysis of variance tests were conducted to examine differences in participants' self-reported Time-on-Task and in participants' reported perceptions of the AGREE II.
There were no changes to any of the outcomes once the trial commenced.
Table 1. Participant characteristics (row labels)
Age (years): 18 to 24; 25 to 34; 35 to 44; 45 to 54; 55 to 64
Profession: Allied Health (e.g., PT, OT, RT); Other (not specified)
% with health research methods training
Use of AGREE as a tool to inform PG development: 1 to 5 times
Use of AGREE as a tool to inform PG reporting: 1 to 5 times
Use of AGREE as a tool to evaluate PG: 1 to 5 times
Use of AGREE II as a tool to inform PG development: 1 to 5 times
Use of AGREE II as a tool to inform PG reporting: 1 to 5 times
Use of AGREE II as a tool to evaluate PG: 1 to 5 times
Letters of invitation were sent to 107 individuals, of whom 87 were eligible to participate (12 were excluded based on past experience with the AGREE Instrument and eight were non-respondents to the letter of invitation). Sixty participants completed the study (response rate = 69%), 20 per condition. The majority of participants were female, between the ages of 25 and 65, and with some level of health methods training.
Performance - distance function (Table 2)
Distance function (mean (standard deviation))*
Domain 1. Scope and Purpose
Domain 2. Stakeholder Involvement
Domain 3. Rigour of Development
Domain 4. Clarity of Presentation
Domain 5. Applicability
Domain 6. Editorial Independence
There were no significant differences in any of the domain distance functions between the three training groups (p > 0.05 for all comparisons).
Performance - pass/fail criteria
86% of the individuals in the Tutorial + Practice Exercise training intervention arm passed the online training with the practice PG.
Training satisfaction and self-efficacy (Table 3)
Training Satisfaction and Self-Efficacy Ratings (1 to 7 scale; means and (standard deviations)).
Training Satisfaction and Self-Efficacy
Training Satisfaction (MANOVA, p > 0.05)
The training exercise was conveyed at the appropriate level
The training exercise was a valuable learning experience
The training exercise was a positive experience
The training exercise was completed in a reasonable amount of time
The training exercise has increased my understanding of the content of the AGREE II
The training exercise has increased my confidence to assess the quality of PGs using the AGREE II
I was able to navigate the training exercise with ease
The information in the training exercise was logically grouped together
The training exercise achieved its stated objectives.
The training exercise was relevant to my practice/goals and my learning needs.
Overall, I was satisfied with my AGREE II training experience
Self-Efficacy (MANOVA, p > 0.05)
I am confident in my ability to use the AGREE II to assess PGs
I am comfortable with the structure of the AGREE II
I am comfortable with the content of the AGREE II
I am confident in applying my AGREE II skills
Participants reported high levels of training satisfaction (means 6.0+) and self-efficacy (means 5.4+). There were no significant differences in any measure as a function of training condition (p > 0.05 for all comparisons). The Tutorial, Tutorial + Practice Exercise, and review of the PDF training options were recommended by 80%, 60%, and 60% of participants, respectively (p > 0.05 for all comparisons).
Mental effort (Table 4)
Mental Effort Ratings (1 to 7 scale; means (standard deviations)).
Mental Effort (MANOVA, p > 0.05)
Mental effort tutorial: The AGREE II Overview Tutorial was mentally demanding
Mental effort tutorial: The pace of the AGREE II Overview Tutorial was hurried/rushed
At the end of the AGREE II Overview Tutorial, I was discouraged
Reviewing the AGREE II was mentally demanding
At the end of reviewing the AGREE II, I was discouraged
The interactive practice exercise was mentally demanding
At the end of the interactive practice exercise, I was discouraged
Rating and assessing the practice guideline with the AGREE II was mentally demanding
Rating and assessing the practice guideline with the AGREE II was very hard work
At the end of rating and assessing the practice guideline with the AGREE II, I was discouraged
The multivariate analysis of variance failed to show a difference in participants' reporting of mental effort as a function of training condition. With the exception of one measure (the AGREE II was mentally demanding), the univariate analyses of variance also failed to show significant differences.
Time-on-task (Table 5)
Time-on-Task (minutes; means (standard deviations)).
User's rating of how long it took to overview PDF copy of AGREE II
User's rating of how long it took to do interactive practice exercise
User's rating of how long it took to read and rate PG
There were no significant differences as a function of training condition in the time spent by participants reviewing either the PDF version of the AGREE II or in the time taken to complete the test PG (p > 0.05 for all comparisons).
AGREE II perceptions (Table 6)
AGREE II Perceptions (1 to 7 scale; means and (standard deviations)).
AGREE II Perception
I believe the AGREE II will be a useful tool to inform practice guideline development
I believe the AGREE II will be a useful tool to inform practice guideline reporting
I believe the AGREE II will be a useful tool to evaluate practice guidelines
I believe the User's Manual enhanced my skill in use of applying the AGREE II
Participants reported favourable perceptions about the AGREE II as a tool to facilitate the development, reporting, and evaluation of PGs; they also reported favourable perceptions about the AGREE II User's Manual in enhancing skills with its application. No significant differences were found for any outcome as a function of training intervention condition.
In this study, we tested two internet-based electronic training interventions against a traditional training method using a PDF version of the User's Manual to determine their effects on various measures related to performance on and attitudes toward the AGREE II. The goal was to identify the best strategy to facilitate the AGREE II's appropriate and effective uptake by its stakeholders. In contrast to our hypotheses, participants randomized to the training condition that included the Tutorial + Practice Exercise did not demonstrate superior performance with the AGREE II, greater satisfaction with the training experience, higher levels of self-efficacy, or more positive attitudes toward the tool than did participants randomized to the other two conditions.
One potential explanation is that our randomization did not work properly, leaving differences between groups in participants' experience with health research methodology and/or the AGREE or the AGREE II. Our demographic data (see Table 1) suggest participants allocated to the control condition may have been more likely than participants allocated to the other conditions to have had minimal exposure, rather than no exposure, to the tools. The inclusion of direct pretest measures to more accurately capture guideline performance before training exposure and to ensure baseline characteristics of the participants do not vary on this factor may be warranted in future studies.
A second potential explanation for our findings is that our interventions did not work. This explanation, however, is not well supported. First, each intervention arm aligned with design characteristics found in other studies and systematic reviews to be effective training features, such as immediate feedback, interactivity, and repetition [15, 16]. Second, although the data are subjective, they do show that participants liked all of our interventions; for example, satisfaction measures and self-efficacy measures are extremely high, well above the mid-point of the 7-point response scale. One may therefore conclude that our control condition (i.e., review of the PDF version of the AGREE II only) was very effective, and that there is a ceiling effect on performance measures and other outcomes.
Exploring these conclusions further, a significant component in the revision of the AGREE II was the reworking of the User's Manual and its written training resource component. As described, the document provides descriptions, examples, and explicit direction for how to evaluate a PG report using AGREE II. The comprehensive nature of the PDF version of the AGREE II User's Manual may be quite sufficient for many potential users. In fact, previous research, as was found in this study, demonstrates high support for the User's Manual by participants.
While this study failed to demonstrate superiority of the online electronic training interventions, we do not believe they should be abandoned altogether. While we were successful in screening participants so that they had little-to-no experience with the AGREE II or the original version of the tool, virtually all participants had some experience in health methods (e.g., systematic review, critical appraisal) and many had experience with the PG enterprise (see Table 1). This selection bias may represent a limitation to the study that also compromises the interpretability of the findings. Specifically, it may be that the online training interventions would be of benefit to the truly novice participant: individuals with no experience with the AGREE II, PGs in general, or health research methodology--for example, trainees and students in the field of health services research. There are some previous data to support this. In the separate project that developed the pass-fail algorithm used in this study, most of the participants were trainees early in their post-graduate careers with considerably less experience in health methods or PGs. In contrast to the pass rate of 86% reported in this study, the initial pass rate for those participants was 73%, suggesting the training may be better suited for novice users. Future research studies recruiting these types of participants are warranted.
Indeed, educational research supports the notion of adapting instructional methods based on individual differences in prior knowledge. In general, the literature suggests that good instructional design techniques may be of more importance for low prior knowledge than for high prior knowledge learners [19, 22]. Redundant content should usually be eliminated for more experienced learners. It is possible that the more knowledgeable learners in our study experienced unnecessary extra cognitive load from the additional e-learning instructional interventions, when the control materials of the User's Manual were sufficient. There may even be expertise reversal effects, where a given instructional method that works well for novice learners is less effective or even detrimental for individuals with more expertise. In this study, it is possible that either the ceiling effect or detrimental effects of redundancy may have led to no difference from the control condition. Further investigation is required to assess whether efficient instruction on the AGREE II for more advanced learners will require different methods than training designed for entry-level learners.
In summary, our study did not demonstrate that our two online AGREE II electronic training interventions improved outcomes over the control condition. We believe this can be explained in part by the better than expected performance of the control condition (i.e., the current standard of the PDF AGREE II, namely the User's Manual) and in part by the level of experience among the participants with health methods and PGs. Future research may demonstrate that the two online training interventions are best suited to, and effective tools for, very novice users who are new to the area of PGs and the AGREE II Enterprise. The training interventions are available through the AGREE Enterprise Web site.
The authors wish to acknowledge the contributions of the members of the AGREE A3 Team who have participated in the AGREE A3 Project. The authors wish to acknowledge the contributions of Chad Large and Steve McNiven-Scott of the Division of e-Learning Innovation at McMaster University for their contributions to the development of the web-based platform used for the study interventions. This study is funded by the Canadian Institutes of Health Research and has received ethics approval from the Hamilton Health Sciences/Faculty of Health Sciences Research Ethics Board (REB #09-398; McMaster University, Hamilton, Ontario, Canada).
- 1. Committee to Advise the Public Health Service on Clinical Practice Guidelines, Institute of Medicine, Field MJ, Lohr KN, (Eds): Clinical practice guidelines: directions for a new program. 1990, Washington: National Academy Press
- 4. Browman GP, Brouwers M, Fervers B, Sawka C: Population-based cancer control and the role of guidelines-towards a 'systems' approach. Cancer Control. Edited by: Elwood JM, Sutcliffe SB. 2010, Oxford: Oxford University Press
- 6. Grimshaw JM, Thomas RE, MacLennan G, Fraser C, Ramsay CR, Vale L, Whitty P, Eccles MP, Matowe L, Shirran L, Wensing M, Dijkstra R, Donaldson C: Effectiveness and efficiency of guideline dissemination and implementation strategies. Health Technol Assess. 2004, 8 (6): iii-iv. 1-72. [Review]
- 9. Streiner DL, Norman GR: Health Measurement Scales. A practical guide to their development and use. 2003, Oxford: Oxford University Press, 3
- 10. Cluzeau F, Burgers J, Brouwers M, Grol R, Makela M, Littlejohns P, Grimshaw J, Hunt C, for the AGREE Collaboration: Development and validation of an international appraisal instrument for assessing the quality of clinical practice guidelines: the AGREE project. Qual Safe Health Care. 2003, 12: 18-23.
- 11. Brouwers M, Kho ME, Browman GP, Burgers JS, Cluzeau F, Feder G, Fervers B, Graham ID, Grimshaw J, Hanna S, Littlejohns P, Makarski J, Zitzelsberger L, for the AGREE Next Steps Consortium: AGREE II: Advancing guideline development, reporting and evaluation in healthcare. Can Med Assoc J. 2010, 182: E839-E842. 10.1503/cmaj.090449.
- 12. Brouwers MC, Kho ME, Browman GP, Burgers J, Cluzeau F, Feder G, Fervers B, Graham ID, Hanna SE, Makarski J, on behalf of the AGREE Next Steps Consortium: Performance, Usefulness and Areas for Improvement: Development Steps Towards the AGREE II - Part 1. Can Med Assoc J. 2010, 182: 1045-1052. 10.1503/cmaj.091714.
- 13. Brouwers MC, Kho ME, Browman GP, Burgers J, Cluzeau F, Feder G, Fervers B, Graham ID, Hanna SE, Makarski J, on behalf of the AGREE Next Steps Consortium: Validity Assessment of Items and Tools To Support Application: Development Steps Towards the AGREE II - Part 2. Can Med Assoc J. 2010, 182: E472-E478. 10.1503/cmaj.091716.
- 17. Dick W, Carey L, Carey JO: The Systematic Design of Instruction. 2005, Boston: Pearson
- 18. Clark RC: Developing Technical Training. 2008, San Francisco: John Wiley & Sons
- 19. Clark RC, Nguyen F, Sweller J: Efficiency in Learning. 2006, San Francisco: John Wiley & Sons
- 20. Clark RC, Mayer RE: E-Learning and the Science of Instruction. 2007, San Francisco: Pfeiffer
- 21. Clark RC: Building Expertise. 2003, Silver Spring: International Society for Performance Improvement
- 26. AGREE Enterprise Website. [http://www.agreetrust.org/resource-centre/training/]
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.