Introduction

The clinical use of lung ultrasound (LUS) in emergency departments, critical care units, and respiratory departments has increased substantially. LUS has excellent diagnostic accuracy for many of the most common causes of acute respiratory failure (e.g., cardiogenic pulmonary edema, pneumonia, pleural effusion, and pneumothorax) and increases the proportion of patients receiving a correct diagnosis and treatment [1,2,3,4,5,6]. Furthermore, LUS is a rapid, bedside, non-invasive, radiation-free diagnostic tool, which the clinician can use as an integrated part of the initial clinical assessment as well as for monitoring purposes. However, the value of LUS depends on competent operators performing the examination.

Several societies, e.g., the European Federation of Societies for Ultrasound in Medicine and Biology, the British Thoracic Society, and the European Association of Cardiovascular Imaging, have published clear guidelines describing the logbook, the number of supervised examinations required, and the basic knowledge curriculum that must be completed before performing unsupervised lung ultrasound examinations [7,8,9]. However, no clear evidence-based guidelines or recommendations exist on the training needed to obtain adequate skills for performing an LUS examination.

Like other procedures and treatments, LUS education and certification should be based on the best available evidence, supported by validity evidence gathered in learning or clinical studies. The aims of this systematic review were to provide an overview of the published literature on learning studies in clinical LUS, and to explore and collect evidence for future recommendations on lung ultrasound education and competency assessment.

Materials and methods

The systematic review was performed according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [10]. A systematic literature search was conducted in PubMed, Embase, and the Cochrane Library in collaboration with a research librarian from the Medical Research Library at Odense University Hospital, Denmark. The search terms were: lung OR lungs OR pulmonal OR pulmonary OR thoracic OR thorax OR thoracal OR mediastinal OR mediastinum; ultrasound OR ultrasonic OR ultrasonography OR ultrasonics OR sonography OR sonographic; medical education OR education OR learning OR training OR clinical competences OR curriculum, including MeSH terms. The search was completed on March 7, 2017. The inclusion criterion was: learning or education studies in lung or thoracic ultrasound. No exclusion criteria were applied with regard to language, animal studies, etc.
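To illustrate how the three concept groups above combine, a minimal sketch in Python is shown below. This is an illustrative reconstruction assuming a PubMed-style boolean syntax; the exact query string and MeSH expansions used by the research librarian are not reported in full here.

```python
# Illustrative reconstruction of the boolean search string from the three
# term groups reported above (assumed PubMed-style syntax; the actual
# query and MeSH expansions are not fully reported in the text).

anatomy = ["lung", "lungs", "pulmonal", "pulmonary", "thoracic",
           "thorax", "thoracal", "mediastinal", "mediastinum"]
modality = ["ultrasound", "ultrasonic", "ultrasonography",
            "ultrasonics", "sonography", "sonographic"]
education = ["medical education", "education", "learning", "training",
             "clinical competences", "curriculum"]

def or_block(terms):
    """Join synonyms with OR, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"

# Synonyms are OR'ed within each concept, and the concepts are AND'ed:
# (anatomy) AND (modality) AND (education).
query = " AND ".join(or_block(group) for group in (anatomy, modality, education))
print(query)
```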

After removing duplicates, all titles and abstracts were screened by two authors (PP and KRM). All articles that potentially met the broad inclusion criterion, as well as indeterminate articles, were assessed by full-text reading. Abstracts concerning the following were excluded: ultrasound education in organ systems or anatomical structures other than the lungs or thorax, cost–benefit analyses, case reports, author responses, letters to the editor, and comments. Diagnostic accuracy studies were excluded from this review, except those that also included a learning study or had objectives or outcomes assessing training or the development of competencies in LUS. The same two authors subsequently read all eligible articles, and each article was discussed until consensus was reached. In case of disagreement, a third reviewer (CBL) was consulted. A hand search was conducted on the references of the included full articles. The level of evidence was categorized using the Oxford Centre for Evidence-Based Medicine (OCEBM) Levels of Evidence [11]. Bias in each included article was discussed and scored according to the Cochrane Collaboration risk of bias tool [12].

Results

Search strategy

The initial search yielded 7796 publications. After removal of duplicates, author responses, and conference abstracts, 4656 publications remained. Of these, 4622 were excluded during title and abstract screening, leaving 34 publications for full-text assessment. Most of the excluded studies did not meet the inclusion criterion at all and covered completely different topics, aims, and objectives than education or assessment in LUS or thoracic ultrasound. Because of the wide search strategy, the number of publications not relevant to this systematic review was large. Figure 1 presents the eligibility process and exclusion of articles. Reasons for full-text exclusion (n = 18) were: diagnostic accuracy studies (n = 6), testing the effectiveness and use of different models/phantoms or hands-on facilities for LUS (n = 7), describing implementation, use, and feasibility of LUS (n = 3), a train-the-trainer course (n = 1), and assessment of respiratory therapists’ theoretical and clinical skills in LUS (n = 1), leaving 16 included studies. The reference lists of the included papers were screened without leading to the inclusion of further studies. Study design, participants, learning strategy, hands-on facilities, and assessment are described below. Additional information is shown in Tables 1 and 2.

Fig. 1

Flowchart of the search strategy and selection process, based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA)

Table 1 Publications on education in lung ultrasound: study characteristics
Table 2 Publications on education in lung ultrasound: study statistics and conclusions

Study design

In total, there were 12 pre- and post-test studies that used improvement in written test scores to evaluate the educational intervention [13,14,15,16,17,18,19,20,21,22,23,24]. Five of the pre- and post-test studies had a follow-up time ranging from 1 week to 6 months, on average 13 ± 4.83 weeks [14, 16, 18, 20, 25], and one recorded the number of scans performed from baseline to follow-up [20]. Three descriptive studies were identified [25,26,27] and one randomized controlled trial [28]. Five of the studies (31%) were courses in general critical care ultrasound or basic ultrasound skills, in which thoracic or lung ultrasound was a specific and independently evaluated topic [17, 19,20,21, 24].

Participants

Most study participants were ultrasound novices, particularly in clinical LUS, and ranged from medical students to respiratory therapists, emergency department residents, and anesthesiologists. Three studies also included other healthcare professionals such as prehospital providers, nurses, and veterinarians [18, 22, 24]. Two studies excluded participants with previous ultrasound certification or attendance at a formal critical care ultrasound course within the preceding 12 months [20, 28], and two studies only included participants with no ultrasound experience [21, 24].

Learning strategy

The learning strategies in the included studies were heterogeneous in the time spent on lectures, the theoretical presentation, and the method used for assessment. The most commonly used educational tool was the didactic lecture (n = 12, 75%), with session lengths varying from 30 min [26] to 2.5 h [15]. Abbasi et al. presented a single-topic course (detection of pneumothorax with LUS) with a 30 min didactic lecture; this was the only single-topic course that used a didactic lecture as the educational tool [26]. The remaining studies used classroom-based learning covering a more comprehensive introduction to the full LUS examination, primarily with 15–30 min of education on each of the main topics. Some studies gave a clear overview and description of the topics included in the didactic lectures, whereas other studies only stated the overall general topics (Table 1).

Four studies described courses lasting from one full day to 3 days with alternating theoretical and hands-on sessions [14, 19, 20, 24]. Four studies incorporated live ultrasound examinations by instructors into the theoretical sessions to combine theoretical and practical understanding [19, 20, 24, 26]; otherwise, images and video clips were frequently used in the lectures.

Web-based learning or online presentations were used in 7 (44%) studies [16, 19, 21, 23, 25, 27, 28]. Four of these had only online presentations or web-based learning modules, without didactic lectures or hands-on sessions [16, 25, 27, 28]. Cuca et al. studied a web-based learning program evaluated by nine experts of the international lung ultrasound consensus committee [16]; it used the same written tests, topics, and curriculum as the study by Breitkreutz et al. [15], and Cuca et al. compared the results of the two studies. Krishnan et al. [25] presented a 5 min online presentation on the use of ultrasound as a diagnostic tool to confirm pneumothorax. Gargani et al. had a 26 min online presentation with a primary focus on B-line presentation and interpretation, and offered the possibility of real-time demonstrations or meetings with instructors on Skype. Subsequently, participants were to upload seven LUS examinations for evaluation. Once the instructors had approved the seven videos, the participants could proceed to the second part of the training, comprising a set of 44 videos focused on counting B-lines [27]. In the randomized trial by Edrich et al., one study group received a web-based learning program with no hands-on session, another group had a 45 min classroom-based lecture and 20 min of hands-on training, whereas the control group had no lectures at all. The participants were evaluated with a pretest, a post-test, and a 4 week retention test [28].

Hands-on training facilities

Twelve of the sixteen studies included hands-on sessions in the educational program [13,14,15, 17, 19,20,21,22,23,24, 26, 28]. Simulators were used in three studies [19, 20, 26], and healthy live models in eight studies [14, 15, 19,20,21, 24, 26, 28]. In five studies, emergency department patients or patients with respiratory failure in other departments were scanned as part of the training program [15, 17, 23, 26, 27]; in addition, three studies obtained LUS video clips from hospitalized patients and used them in the assessment [13, 18, 25]. Porcine models were used in two studies [14, 22]. Four studies combined the use of different models, patients, and/or simulators [14, 15, 19, 20, 26].

Assessment

Thirteen studies used written examinations to assess the theoretical knowledge obtained in the educational programs [13,14,15,16,17,18,19,20,21,22,23,24,25]. All used multiple-choice item formats covering true/false, one-best-answer, single-correct-answer, and multiple-response questions, and all included images and/or video clips in the questions. None of the studies described gathering validity evidence for either the pre- and post-tests or the practical skill assessment tools. One study had the multiple-choice questions (MCQs) peer-reviewed by the instructors ahead of the study [20], but the vast majority of the assessment checklists, written tests, and curricula were described as being based on the international consensus recommendations for point-of-care lung ultrasound by Volpicelli et al. [29].

Eleven studies assessed participants’ practical skills [14, 15, 17, 19,20,21,22,23,24, 26, 28]. The most common method for evaluating practical skills was observer checklists, but these varied greatly. Participants in See et al. [23] scanned 12 zones with an instructor at the bedside who was allowed to comment or help if needed; the videos were stored, and the participants then interpreted the clips in front of the instructor. Connolly et al. [19] assessed the participants’ practical skills by having them scan four windows; the videos were stored and rated by blinded instructors. Breitkreutz et al. [15] had 16 predefined sonoanatomical structures that participants had to present, after which they were rated on a standardized sheet. Hulett et al. and Dinh et al. used checklists of 46 and 84 items, respectively, evaluated with regard to image acquisition and interpretation [17, 20]; furthermore, Dinh et al. presented four cases with 20 case questions each [20]. Heiberg et al. [21] tested the students’ practical skills online with correct/incorrect scoring and evaluated image quality and interpretation offline. Greenstein et al. used 20 standardized examination tasks and 20 video-based examinations [24], whereas Oveland et al. had participants scan porcine models with confirmation or validation of pneumothorax, oral feedback from an instructor, and then another scan session [14].

The level of evidence of the included studies is presented in Table 2 according to the OCEBM guidelines, and the assessment of risk of bias in Table 3. No study achieved the highest level of evidence; one study scored level 2, and the remaining studies scored level 4. The risk of bias was assessed as high in the majority of the studies (Table 3).

Table 3 Scores of the Cochrane Collaboration risk of bias assessment tool [12]

Discussion

The vast majority of the currently published LUS learning studies are one-group pre- and post-test studies with a low level of evidence. This study design can only tell us that trainees learned something from the specific intervention; it does not provide any evidence on how to build a curriculum [30]. The studies are heterogeneous in their choice of educational program, teaching methods, participant assessment, and study outcome. In addition to conventional classroom-based didactic lectures, web-based learning was often chosen as an alternative or additional method and was used in 7 of the 16 included studies [16, 19, 21, 23, 25, 27, 28], but only one study measured the effect of the two educational methods and compared the results of the two groups in a randomized controlled trial [28].

Web-based learning strategies have several demonstrated advantages. Ruiz et al. describe increased accessibility and flexibility as important advantages. Web-based learning standardizes course content and delivery independently of teacher presentation and variation; students are in control of their learning sequence and pace, and web-based learning can be designed to include outcome assessment [31, 32]. Furthermore, different types of multimedia, such as graphics, videos, animations, and text, can be incorporated to enhance learning. A meta-analysis by Cook et al. [33] showed that medical web-based learning was significantly superior to no intervention, and that participants could achieve results similar to those of traditional learning methods, such as classroom-based learning, in numerous diagnostic and therapeutic content areas. Edrich et al. [28] found a corresponding improvement. Since web-based education has outcomes similar to classroom-based lectures, it would be natural to include other parameters, such as maintenance of theoretical and practical skills with follow-up assessments, time efficiency, and user satisfaction surveys. The meta-analysis, like this systematic review, suffers from considerable heterogeneity in study participants, learning methods, and outcome measures.

Web-based learning in general point-of-care ultrasound has been evaluated favorably in several studies [34,35,36]. In Kang et al. [36], the outcome measures were not only improvement in test score, but also the hours spent on organizing the course and the course costs; in both respects, web-based learning was more cost-effective. None of the studies included in this systematic review incorporated a cost–benefit analysis, but one concluded that an ultrasound symposium requires a massive setup and substantial financial resources because of the number of ultrasound machines, phantoms, volunteers, instructors, and rooms needed. When building a theoretical curriculum in medical education, the teacher-to-student ratio can be low without significantly affecting learning. Training practical skills, however, requires a closer relationship and interaction between instructor and trainee, and the optimal trainee-to-instructor ratio is as close to 1:1 as possible. Oveland et al. [14] also discussed cost–benefit issues and concluded that porcine models as simulators, and animal laboratory training in general, may be an option when combined with ethical considerations, but entail time, venue, and cost dilemmas.

The practical skill assessments of course participants in the included studies diverged in the number of checkpoints and topics. Even though the included studies used various checklists to keep the assessment as objective and standardized as possible, only two studies had blinded reviewers scoring the stored images or ultrasound sequences afterwards [19, 28], and no validity evidence was provided for any of the checklists.

LUS imaging and examinations differ from other point-of-care ultrasound examinations, because image interpretation and pathological recognition are based on sonographic artifacts rather than direct imaging of pathology, such as thickening of the gallbladder wall, pericholecystic fluid, and sludge as signs of acute cholecystitis. Therefore, there is a great need for a standardized and validated tool for assessing the understanding of LUS, image acquisition, and image interpretation, as well as the capability to correlate the patterns and interpretations with lung pathology and physiology.

In general, when a new assessment tool is introduced, validity evidence should be gathered to ensure reliability and to allow meaningful interpretation of the results. Today, one of the most described and recognized frameworks for validity testing is that of Messick [37]. Five distinct sources of validity evidence have been described: content, response process, internal structure, relationships to other variables, and consequences [38]. Some types of assessment demand a stronger emphasis on one or more sources of evidence, depending on the curriculum, the consequences, and the properties of the inferences. All sources should be researched with the highest level of evidence possible, but in this setting, an assessment tool should emphasize content-related evidence, with some evidence of response process, internal structure, and consequences.

A new study has constructed and gathered validity evidence for an instrument to assess LUS competence by obtaining international consensus among experts from multiple specialties [39]. The objective structured assessment of lung ultrasound skills (LUS-OSAUS) could form the foundation of further and more homogeneous studies in the future.

Theoretical assessment was the preferred method for measuring the theoretical knowledge obtained before and after a course, but the single-group pretest post-test design suffers from minimal internal and external validity. When evaluating medical education with this setup, it would be surprising if an increased post-test score were not found. This setup has been discussed and criticized for decades and is today considered obsolete [30, 40, 41]. A single-topic curriculum like that presented by Krishnan et al., where participants watched a 5 min online presentation on the detection of pneumothorax with LUS and were assessed theoretically with 20 videos, shows that even a very short theoretical session leads to increased knowledge and pattern recognition. However, it provides no guarantee that the trainees can obtain the ultrasound images themselves or connect the patterns to relevant differential diagnoses in a clinical setting.

One study reported that its theoretical test was validated, but did not describe how this was done [18]. Another had the questions peer-reviewed by authors of the study [20]. Written tests, in general, have been shown to be motivating, to facilitate the learning process, and to be cost-effective [42]. Disadvantages of using the same theoretical test as pretest, post-test, and follow-up test are recall bias, or “learning the test” [43, 44]. The majority of the studies tried to eliminate this bias by changing the order of the questions as well as the order of the answers. None of the participants in the included studies were blinded; since the participants knew that they were being evaluated, they may have been more motivated to enhance their performance in the tests.

There were large differences in the use of healthy live models, patients with respiratory failure or lung diseases, phantoms/simulators, and porcine models for the hands-on training. The overall conclusion was that all models could contribute to increased hands-on competencies. In summary, the different models contributed to different aspects of the learning process: healthy live models were well suited for getting comfortable with the ultrasound devices, learning the advantages and disadvantages of various transducers, improving image optimization, and learning hand–eye coordination. With porcine models, it was possible to create pneumothoraces or pleural effusions, allowing trainees to train their visual understanding of these diagnoses, but, as discussed, animal laboratory models have several other limitations. Dinh et al. [20] discussed the use of patients in an educational setting and found it difficult to incorporate and standardize live pathology, given the logistical challenges of recruiting patients with specific diseases and sonographic patterns. See et al. [23] reported that only a minority of the trainees scanned patients with pneumothorax because of its low prevalence. In addition, it is crucial not to delay diagnostics or initial treatment when using admitted patients in a learning study. Two studies used simulators for learning pathological patterns; both found simulators useful and stated that, with the use of simulators, students engage in both acquiring images and interpreting the abnormal findings while assimilating muscle memory with cognitive learning [20].

We acknowledge that this literature review was constrained by the quantity and quality of the available evidence. Three databases, judged relevant for the topic, were searched, but a broader search strategy could potentially have revealed more studies eligible for this systematic review, and we did not include unpublished data. However, all reference lists of the publications eligible for full-text reading were searched without additional findings. A minor part of the excluded publications covered education in lung ultrasound in the context of ultrasound of other organ systems, e.g., abdominal ultrasound or eFAST (extended focused assessment with sonography for trauma). Various alternative or expanded protocols for lung ultrasound or combined ultrasound have been developed and anchored in different specialties, and evaluating the education of these different protocols was beyond the aim of this study. Therefore, studies were only included if the educational outcome was reported for lung ultrasound separately.

The included studies failed to contribute a compelling body of evidence to support education in LUS, and a meta-analysis could not be conducted because of the differences in assessment tools and the lack of comparability.

Standardized recommendations for education and certification in LUS cannot be established on the basis of the published studies because of the heterogeneity in study design, the low level of evidence, and the high risk of bias in the included literature. All courses showed progress in both theoretical and practical skills regardless of the educational method used. If recommendations were to be drawn from the studies included in this systematic review and the existing medical education literature, a three-step mastery-learning approach would be ideal. First, trainees should obtain theoretical knowledge through either classroom-based education or web-based lectures, with a curriculum based on expert opinion and a validated post-test with a pass–fail standard to ensure sufficient theoretical knowledge. Second, they should undergo focused hands-on sessions on simulators, pigs, or healthy subjects until competency is demonstrated in the training environment, using a performance test with solid evidence of validity. Third, they should perform supervised scans of real patients with feedback from a trained instructor, who preferably uses an assessment tool to decide when the trainee is ready for independent practice. Virtual-reality simulators could play an important role in the training of LUS, especially for pathological cases, and could also provide standardized and objective assessments of competence. To our knowledge, no studies have developed valid simulator-based tests of competence in LUS, even though simulators are commonly used in other specialties and have demonstrated great potential for reproducible and objective assessment and for effects on skills and behavior [45,46,47].

In conclusion, more uniform, competency-based training programs and assessment tools are needed to ensure a higher standard of education and assessment in LUS. Furthermore, simulation training could potentially contribute to the hands-on training in a calm environment, making it possible to train high-risk cases without putting patients at risk.