Introduction

To ensure the key goals of mammography are achieved, quality standards should be adopted. Ideally, these should be wide in scope and address the various aspects with impact on the mammography imaging process (e.g. technical, clinical and training).

A systematic approach for assessing critical performance indicators can be achieved through the implementation of a quality assurance (QA) program. QA provides a framework for constant improvement through a feedback mechanism. It allows the identification of deviations from optimum performance of mammographic equipment, suboptimal clinical practice and training needs [1–3].

An effective QA program should be practical to implement in a clinical setting. Adequate test equipment is necessary, as well as a standard methodology that makes it possible to obtain the relevant objective and subjective metrics of quality. An effective QA program should also be implementable at a low or moderate cost [4].

The testing of equipment should address the various critical stages of the imaging chain (acquisition, processing and display) and be implemented in a multidisciplinary team approach by trained staff (radiographer, medical physicist, radiologist) [3, 5].

In the past 20 years, several guidance documents have been developed nationally and internationally to promote quality in mammography. The scope of the guidance documents varies: some focus on technical aspects [4, 6–10], whereas others also include clinical aspects (e.g. epidemiology, interventional, pathology, surgery) [7, 11]. Developments in digital mammography over the last 10 years have led to changes in QA programmes and to the recommendation of new tests and procedures for quality control [12].

This study aimed to identify, analyse and compare selected protocols currently available for QA in mammography, and to discuss their contribution to harmonise practices in mammography worldwide.

This review aims to provide useful guidance to countries aiming to implement (or further develop) a QA program in mammography.

Methods

An extensive search was performed to identify guidance documents and protocols for QA in mammography. Sources used included scientific databases, organisations of national healthcare systems (hospitals, regulatory bodies, etc.), international agencies (e.g. International Atomic Energy Agency [IAEA], International Commission on Radiological Protection [ICRP]), professional colleges (e.g. American College of Radiology [ACR], Royal College of Radiologists [RCR]) and scientific associations (e.g. Institution of Physics and Engineering in Medicine [IPEM], American Association of Physicists in Medicine [AAPM]). The search returned various documents published in English, French, Portuguese, Spanish, German, Italian, Swedish and Dutch. Only documents published in English or French were considered, to ensure comparability, as the team was not proficient in the other languages.

The guidance documents identified were reviewed and compared for structure, editorial details, target staff profiles, technologies addressed and type of guidance (technical and clinical). Comparative tables are presented summarising the most relevant findings.

Results

Guidance documents for QA and quality control (QC) in mammography

Fourteen guidance documents for QA and QC in mammography published between 1991 and 2011 were identified (Table 1). Two are recommended by European bodies (European Reference Organisation for Quality Assured Breast Screening and Diagnostic Services [EUREF] and European Commission [EC]), three are internationally proposed by the IAEA and ten have national or regional scope (United States of America [USA], Canada, Australia, United Kingdom [UK], Ireland, Nordic) by governmental bodies, professional and/or scientific organisations.

Table 1 Guidance documents for quality assurance and quality control in mammography

Guidance documents for QA and QC in mammography—scope and professional groups targeted

Four documents address both conventional and digital mammography. All documents are primarily focused on providing technical guidance. Three documents include both technical and clinical guidance.

Thirteen documents are targeted at medical physicists and nine also include guidance for radiographers and radiologists. One protocol is specifically targeted at radiographers (Table 2).

Table 2 Guidance documents for QA and QC in mammography: target staff profiles, technologies and guidance type (clinical and/or technical)

The EC protocol, Australian and Irish protocols are broader in scope and include guidance to epidemiologists, nurses, oncologists and surgeons.

Performance testing of mammographic systems and breast dose assessment

Most documents (exceptions are the European Protocol [EP] and IAEA-D protocols) recommend performance testing of the three main stages of the mammography imaging chain (Tables 3, 4, 5, 6 and 7):

  1. Image acquisition (the stage with more intensive testing)

  2. Image processing (following the manufacturers’ recommendations)

  3. Image display (includes monitor and printer testing)

Table 3 Recommended tests for QC for image detection and acquisition in mammographic systems (include testing the x-ray generation and image receptor)
Table 4 Recommended test for QC: image processing stage in mammographic systems
Table 5 Recommended test for QC: image display stage in mammographic systems
Table 6 Recommended tests for dosimetry in mammography: technical aspects, dose estimation conversion factors and reference dose per projection
Table 7 Overview of recommended tests for image quality assessment in mammography (for further detail on the methodology please refer to the original guidance documents)

Testing the image acquisition system

X-ray production system

All documents recommend testing the generator and X-ray source, the Automatic Exposure Control (AEC) and the breast compression systems. Recommended tests include (1) alignment of X-ray field/light field/image receptor area, (2) repeatability and accuracy of tube output exposure, (3) half-value layer (HVL), (4) AEC response versus breast thickness and tube voltage compensation and (5) alignment of the compression plate.

Breast dose

Table 6 reviews the guidance for dosimetry testing. All guidance documents provide recommendations for assessment of breast dose and two (i.e. EP; IAEA-D) are dedicated to this topic and include detailed methodology.

The mean glandular dose (MGD) is the recommended parameter for assessing the risk of radiation-induced cancer in mammography. Proposed methodologies for MGD assessment (reviewed in Table 6) include:

  • Measurements on patients using a thermoluminescent dosimeter (TLD) (a minimum of ten patients is recommended)

  • Dose estimation from clinical exposure data (10–60 patients recommended)

  • Dose estimation using test objects/phantoms: the entrance surface air kerma without backscatter (ESAK) should be measured and multiplied by a conversion factor, which compensates for X-ray beam quality, breast thickness and composition (percentage glandularity)

The ESAK is required to calculate MGD and can be measured with a calibrated ionisation chamber (IC), semiconductor dosimeter or TLD material (Table 6). If measurements include the effect of backscatter (e.g. TLDs), an appropriate correction factor should be applied [6]. The recommended phantoms to perform dosimetry testing vary between the protocols (Table 6).
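
The phantom-based estimate described above reduces to a single multiplication, which can be sketched as follows. This is a minimal illustration and not the method of any specific protocol: the function name, the `backscatter_factor` parameter and all numerical values are hypothetical placeholders, and the conversion factor must be taken from the tables of the protocol actually in use.

```python
def mean_glandular_dose(esak_mgy, conversion_factor, backscatter_factor=1.0):
    """Estimate the MGD (mGy) from a measured entrance surface air kerma.

    esak_mgy: dosimeter reading at the entrance surface (mGy).
    conversion_factor: protocol-specific factor (mGy/mGy) accounting for
        beam quality, breast thickness and glandularity (assumed input).
    backscatter_factor: 1.0 if the reading excludes backscatter; a value
        above 1.0 (hypothetical) if the dosimeter also records backscatter.
    """
    if esak_mgy < 0 or conversion_factor <= 0 or backscatter_factor < 1.0:
        raise ValueError("inputs outside their physically meaningful range")
    # Remove the backscatter contribution before applying the conversion.
    kerma_without_backscatter = esak_mgy / backscatter_factor
    return kerma_without_backscatter * conversion_factor


# Hypothetical example values, for illustration only.
mgd = mean_glandular_dose(esak_mgy=8.0, conversion_factor=0.2)
print(f"Estimated MGD: {mgd:.2f} mGy")
```

Keeping the backscatter correction as an explicit argument makes it visible in the QC record whether the dosimeter reading was corrected.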

Also, the various protocols propose different methodologies to measure the required data for MGD calculation (e.g. the ACR, Canadian and UK/IPEM propose measurements to be performed at 40 mm from the chest wall edge, whereas the EP recommends 60 mm).

Since the conversion factors used to estimate the MGD from the incident air kerma depend on the X-ray beam quality, it is necessary to keep track of the target/filter (T/F) combination and tube voltage used in the experimental procedure, as well as the half-value layer (HVL) of the X-ray beam.
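
As an illustration of how the HVL can be derived from attenuation measurements, the sketch below applies the common logarithmic interpolation between the two filter thicknesses that bracket half of the unfiltered air kerma. It assumes locally exponential attenuation between the two readings; the function name and the example readings are hypothetical, and the measurement geometry prescribed by the chosen protocol still applies.

```python
import math


def half_value_layer(k0, t_a, k_a, t_b, k_b):
    """Interpolate the HVL, in the unit of t_a and t_b (e.g. mm Al).

    k0: air kerma with no added filter.
    (t_a, k_a): filter thickness and air kerma at or just above k0 / 2.
    (t_b, k_b): filter thickness and air kerma at or just below k0 / 2.
    """
    brackets_half = k_a >= k0 / 2 >= k_b > 0 and k_a > k_b
    if not (brackets_half and t_b > t_a >= 0):
        raise ValueError("readings must bracket half of the unfiltered air kerma")
    # Exponential attenuation between the readings implies a log-linear
    # interpolation for the thickness at which the air kerma equals k0 / 2.
    return t_a + (t_b - t_a) * math.log(2 * k_a / k0) / math.log(k_a / k_b)


# Hypothetical readings (mGy) with 0.3 mm and 0.4 mm of aluminium added.
hvl = half_value_layer(k0=10.0, t_a=0.3, k_a=5.5, t_b=0.4, k_b=4.5)
print(f"HVL: {hvl:.3f} mm Al")
```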

The EC protocol proposes conversion factors by Dance et al. (1990) and Dance et al. (2000), whereas the ACR uses factors by Dance et al. (1990), Wu et al. (1991) and Sobol et al. (1997) [9]; the Canadian protocol uses Stanton et al. (1984) and Wu et al. (1991) [13], and the Nordic protocol proposes the conversion factors of Rosenstein et al. (1985) [14].

Image receptor

The most frequently recommended tests for digital mammography include (1) the system’s response function, (2) image noise, (3) missed tissue at chest wall edge, (4) signal homogeneity and (5) image artefacts (Table 3).

Some protocols propose specific tests for CR systems, namely (1) inter-plate sensitivity variations, (2) image artefacts, (3) evaluation of the influence of secondary sources of radiation and (4) fading of the latent image signal. Guidance is also included for testing the scanning mechanism of the CR plate and the efficiency of the erasure cycle. Specific tests for SFM are proposed in the older protocols (EC protocol, UK/IPEM, Canada, IAEA-SF) (Table 3).

Quality of the acquired image

Table 7 summarises eight groups of tests for assessment of IQ recommended in the guidance documents reviewed. The tests address technical and clinical IQ criteria using test objects and phantoms.

Phantoms and test objects

The recommended phantoms to produce the images for low contrast IQ assessment vary between the protocols. CDMAM is frequently recommended in Europe (EC protocol, UK/IPEM, UK/NHSBSP and Ireland) whereas the ACR phantom is the standard in use in the US and Canada.

IAEA does not recommend a particular phantom but highlights the importance of using a phantom that contains structures able to mimic those typically found in the breast.

For high-contrast IQ assessment the MTF is the key recommended parameter. The MTF bar pattern method is more straightforward to implement than the calculation of the MTF using the edge phantom.

Image processing

Image quality is affected by the processing stage. For SF systems the guidance reviewed recommends testing the performance of the chemical processor (e.g. time, temperature, base and fog levels). The EC guidelines highlight the importance of testing image processing. For digital mammography systems, the manufacturer’s guidance should be followed because image-processing algorithms are manufacturer-specific.

Artefacts

Artefact analysis is an important test recommended in all guidelines reviewed. For SFM it focuses on artefacts resulting from the chemical processing or from the degradation of the screen-film detector characteristics. In digital systems, artefact analysis focuses on problems originating in the image acquisition system and, for CR systems, during plate handling and processing. Testing also includes the assessment of artefacts originating from printing devices (e.g. laser printers). A clinical evaluation protocol (type testing) is available on the EUREF website (www.euref.org), and repeat/reject analysis is recommended in the IAEA-DM protocol.

Image display

QA guidelines for testing image display systems (Table 5) refer to the AAPM report Task Group 18 [15] for testing electronic monitors and printers. The testing of light boxes is included in the QC guidance for SFM systems [11, 12, 16].

Test frequency and reference (or limiting) values

All guidance documents provide recommendations on the frequency of the tests (Table 1). A number of tests are recommended at acceptance only. Others should be performed periodically (yearly, 6-monthly, monthly, weekly or daily). Intermediate testing should be performed when necessary (e.g. following major equipment repair).

The guidance documents also provide reference values and pass/fail criteria. These originate from manufacturer recommendations, expert knowledge, survey QC data, baseline values and national policies (e.g. existing dose reference levels). A critical aspect is to ascertain whether the measured value (including uncertainties) is substantially lower than the reference/limiting value. As an example, UK/IPEM guidance recommends that measured values for the relevant performance indicators do not exceed one-third of the range proposed for the limiting or remedial values.
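
One way to make the role of measurement uncertainty explicit when comparing a result with a limiting value is sketched below. The three-way decision rule, the function name and the example numbers are assumptions made for illustration; they are not taken from any of the protocols reviewed, each of which defines its own acceptability criteria.

```python
def classify_result(measured, uncertainty, limit):
    """Compare a measured value with an upper limiting value.

    Returns "pass" when the whole uncertainty interval lies at or below
    the limit, "fail" when it lies entirely above the limit, and
    "indeterminate" when the interval straddles the limit.
    """
    if uncertainty < 0:
        raise ValueError("uncertainty must be non-negative")
    if measured + uncertainty <= limit:
        return "pass"
    if measured - uncertainty > limit:
        return "fail"
    return "indeterminate"


# Hypothetical dose results (mGy) checked against a hypothetical 2.5 mGy limit.
for value in (2.0, 2.4, 3.0):
    print(value, classify_result(value, uncertainty=0.2, limit=2.5))
```

An "indeterminate" outcome would typically prompt a repeat measurement or a review of the uncertainty budget before any conclusion is drawn.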

Discussion

The study showed that in the last 20 years comprehensive guidance documents have been developed worldwide to support the implementation of QA in mammography.

Target technology

The IAEA-DM protocol (edited 2011) is the most up-to-date guidance and is dedicated to digital mammography. The UK/IPEM, EC, IAEA-SF and ACR protocols are well-established documents originally developed for SFM that have been adopted in many countries worldwide. The EC guidelines were updated and an addendum on digital mammography was included [1, 17]. At the date of submission of this paper, an updated version of the ACR protocol is known to be in progress to include guidance specifically targeted at digital mammography. Also, as per information available on the EUREF website, a revised edition (5th) of the EC Guidelines is in development [18].

As new techniques in digital mammography are becoming widespread, it is expected that revised versions of the existing protocols will be produced, including guidance for testing the capabilities of state of the art technology (e.g. tomosynthesis, dual-energy contrast-enhanced digital subtraction mammography).

Professional targets

The EC and Irish protocols are wider in scope and may be useful to a broader range of healthcare professionals. Other protocols focus on dosimetry and IQ assessment and are targeted at medical physicists, radiographers and breast radiologists. Hendrick et al. [5] showed that the profile of staff performing QA testing differs between countries. Often, radiographers are in charge of the most frequent tests (daily, weekly), whereas medical physicists perform in-depth technical performance assessment (e.g. collimation, X-ray tube output, and AEC testing). In Japan, radiographers perform all QC testing, whereas in Finland, Iceland and Hungary the service engineers tend to be in charge of the QC tasks. As highlighted in the IAEA-DM protocol a critical aspect is that QC testing is delegated to staff holding appropriate expertise and training [4].

QA testing of mammographic systems and breast dose assessment

Image detection and acquisition system

All protocols reviewed recommend testing the X-ray source (tube voltage and HVL) and the AEC system. AEC testing is one of the most important procedures due to its direct impact on IQ and breast dose [19]. It should consider the effects of variations in object/attenuator thickness and radiation beam quality. Hendrick et al. [5] compared QC practices in 22 countries (affiliated with the International Breast Cancer Screening Network) and concluded that this test was performed in all countries.
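
A thickness-compensation check of the AEC can be summarised as the deviation of a chosen performance metric at each attenuator thickness from its value at a reference thickness. The sketch below is only illustrative: the metric (for example an optical density or a signal-to-noise ratio), the reference thickness and the readings are hypothetical, and the applicable tolerance depends on the protocol followed.

```python
def aec_deviations(readings, reference_thickness_mm):
    """Return the percentage deviation of each reading from the reference.

    readings: dict mapping attenuator thickness (mm) to the measured
        metric under AEC control.
    reference_thickness_mm: thickness whose reading serves as baseline.
    """
    if reference_thickness_mm not in readings:
        raise KeyError("no reading at the reference thickness")
    reference = readings[reference_thickness_mm]
    if reference == 0:
        raise ValueError("reference reading must be non-zero")
    return {
        thickness: 100.0 * (value - reference) / reference
        for thickness, value in sorted(readings.items())
    }


# Hypothetical metric values at 20, 45 and 70 mm of attenuator.
deviations = aec_deviations({20: 95.0, 45: 100.0, 70: 110.0}, 45)
for thickness, deviation in deviations.items():
    print(f"{thickness} mm: {deviation:+.1f} %")
```

The same calculation can be repeated for each beam quality selected by the AEC, so that thickness and beam-quality effects are reviewed together.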

Breast dose

The recommended methodologies for breast dose estimation vary (Table 6). Measurements using test objects and breast phantoms are frequently recommended and more practical to implement than measurements based on TLD techniques.

Dose assessment with a standard test object/phantom facilitates the comparison of different mammographic techniques and the investigation of the impact of technical settings on breast dose [20, 21]. Clinical dose assessment (using clinical exposure data) provides valuable information on the clinical practice and takes into account the influence of breast thickness and composition on dose [6].
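
For a clinical dose survey, the per-exposure MGD estimates can be grouped by compressed breast thickness so that the influence of thickness is visible in the summary. The following sketch assumes that each record already holds a thickness and an MGD estimate; the 10 mm band width, the function name and the sample records are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def mean_mgd_by_thickness(records, band_width_mm=10):
    """Average the MGD within bands of compressed breast thickness.

    records: iterable of (compressed_thickness_mm, mgd_mgy) pairs.
    Returns a dict mapping the lower edge of each band (mm) to the
    mean MGD (mGy) of the exposures that fall within that band.
    """
    bands = defaultdict(list)
    for thickness_mm, mgd_mgy in records:
        band_start = int(thickness_mm // band_width_mm) * band_width_mm
        bands[band_start].append(mgd_mgy)
    return {band: mean(values) for band, values in sorted(bands.items())}


# Hypothetical survey records: (thickness in mm, MGD in mGy).
survey = [(42, 1.2), (48, 1.4), (55, 1.8), (61, 2.1), (67, 2.3)]
for band, value in mean_mgd_by_thickness(survey).items():
    print(f"{band}-{band + 10} mm: {value:.2f} mGy")
```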

Variations in dosimetry techniques in mammography may prevent a robust comparison of breast dose in mammography between countries and between radiology departments [22–24].

Dance et al. [16] also highlighted that national protocols adopt different phantoms, optical densities, measurement points and conversion factors, which make it difficult to compare the doses estimated with different protocols.

Hemdal et al. [23] measured the impact of variations in experimental technique (e.g. positioning of the dosimeter, compression plate in or out of the beam) on MGD values and found noticeable variations.

When the European protocol was used, the value of the MGD increased by 5 ± 2 % (total variation 0–9 %) at clinical settings and by 9 ± 3 % (4–17 %) compared with the use of the Nordic protocol [21]. The same authors also compared measurements with different dosimeters (ionisation chambers vs solid-state detectors) [23]. They concluded that HVL measurements can be performed accurately with a sensitive solid-state detector and a collimated radiation field, correcting for energy dependence.

This review showed variations in the conversion factors used in the estimation of breast dose (to account for X-ray spectrum characteristics and breast composition) amongst the guidance documents.

Zoetelief and Jansen [25] compared protocols for dosimetry in mammography and concluded that the use of different radiation transport codes and different spectra could cause differences in the conversion factor g by up to about 7 %. They also showed that inclusion of the compression plate in the beam results in a 4.5 ± 1.5 % smaller g value for the same HVL. Also, when breast thickness increases from 2 cm to 8 cm, the g value decreases by a factor of 4.

Tsai et al. [15] showed that the MGD calculated using Dance’s method is 9–21 % higher than that using Wu’s method. Jamal et al. [24] also compared MGD per film considering eight different studies using different protocols and conversion factors and found MGD values with noticeable variations for the same breast thickness.
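
The size of such discrepancies can be expressed as a percentage difference between two MGD estimates obtained from the same incident air kerma. In the sketch below the air kerma and both conversion factors are hypothetical numbers chosen only to show the arithmetic; they are not values from Dance, Wu or any other published table.

```python
def percent_difference(value, reference):
    """Return how much `value` exceeds `reference`, in percent."""
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return 100.0 * (value - reference) / reference


# Hypothetical inputs: one air kerma reading and two conversion factors.
air_kerma_mgy = 8.0
factor_method_a = 0.21  # placeholder for one set of conversion factors
factor_method_b = 0.18  # placeholder for another set

mgd_a = air_kerma_mgy * factor_method_a
mgd_b = air_kerma_mgy * factor_method_b
print(f"Method A vs method B: {percent_difference(mgd_a, mgd_b):+.1f} %")
```

Because the air kerma cancels out, the percentage difference depends only on the ratio of the two conversion factors, which is why the choice of factor set matters when doses are compared between protocols.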

The MGD critically depends on the X-ray spectrum generated by the T/F combination and tube voltage used. Modern digital mammography systems offer innovative T/F combinations (e.g. W/Ag, W/Al) and new conversion factors have been developed [24, 26, 27]. The protocols reviewed do not yet include the most recent published data.

Quality of the acquired image

All guidance reviewed recommends performing low-contrast threshold detection testing, breast lesion visualisation (e.g. simulated in phantoms) and artefact analysis. Compression force, image noise and spatial resolution testing are also recommended with variations in the proposed methods and test materials.

The EC protocol recommends assessment of image quality of digital mammographic systems using images produced with a specific low-contrast-detail test object (CDMAM) [28, 29], which is a costly tool not readily available in all imaging departments. The UK/IPEM and ACR protocols recommend alternative test objects to CDMAM, namely TOR (MAM) and the ACR accreditation phantom, respectively. The choice of a suitable IQ phantom should take into consideration the technology to be tested (screen-film or digital). Huda et al. [30] examined the effectiveness of the ACR phantom to assess image quality in digital mammography and concluded that it is unsatisfactory due to an inappropriate range and sensitivity to characterise simulated breast lesions.

Variations in the recommended test objects give rise to differences in reference/tolerance values (Table 7). The number and type of recommended IQ tests varied (between 1 and 9), as did the recommended methodologies. Methods found in the guidance for rating IQ include absolute or relative scales, e.g. a five-step scale from 1 (worst) to 5 (best); a two-step scale with 1 (criterion fulfilled) and 0 (criterion not fulfilled); and the four-step PGMI scale (perfect, good, moderate and inadequate).

The guidance documents reviewed do not include recommendations on observer training for IQ assessment. This could be useful to reduce inter-observer variability in the assessment of IQ.

Also, breast compression force is influenced by breast thickness and composition. However, no recommendations are provided to promote the optimisation of compression force according to individual characteristics of the breast (compressibility, composition and thickness) [31, 32]. Maximum values for compression in mammography are recommended [7, 11, 33, 34].

The composition of breast tissue is an important issue because increased breast density is known as a risk factor for developing breast cancer [35]. Nevertheless, in the reviewed QA guidance for IQ assessment breast density was not used as a standard.

In 2011, an addendum to the EC protocol, containing guidance for clinical evaluation of mammographic images, was published promoting harmonisation in image quality analysis. Clinical IQ assessment conducted by experienced radiologists is important because it takes into account the effects of image processing which may directly affect the visibility of relevant features and the subsequent diagnostic outcome [36].

Image display/presentation and processing

All protocols including guidance for digital mammography recommend testing monitor displays and printers (Table 2). No recommendations are provided regarding the format for delivering mammography examinations/images to the patient and practices vary—some healthcare institutions deliver the examination in hardcopy (paper or film), whereas others provide digital images on CD.

Despite the potentially critical impact of image processing on the quality of the final image, the testing of image processing tools is still at an early stage (compared with the testing of hardware). Most protocols for testing digital mammography systems recommend testing based on raw image data and do not include recommendations for testing the post-processing algorithms used in clinical images. Establishing testing protocols for post-processing tools in digital mammography is a challenging task, as processing tends to be manufacturer-specific and manufacturers are frequently reluctant to reveal details of the post-processing algorithms incorporated in their systems. A recent publication [37] briefly addresses the challenges of testing post-processing in tomosynthesis. Work is in progress, in collaboration with the manufacturers of digital breast tomosynthesis imaging systems, to identify a method for technical evaluation incorporating the clinically used tomosynthesis reconstruction technique. Testing post-processing tools in DM is a topic that requires further research.

Conclusion

In this study the published guidance for QA in mammography was reviewed. The recommended performance tests for image acquisition, processing and display systems were discussed and compared. Noticeable variations exist in the proposed methods, test objects and phantoms. Also, reference values and acceptability criteria vary between protocols, which raises the question of whether a mammography system could comply with the test procedure and acceptability criteria of one protocol yet fail under those of another.

Harmonisation and best practices in mammography would benefit from more detailed guidance on the experimental methods for QC testing and recommendations of more affordable test equipment and materials that could be acquired by the majority of X-ray departments.

When a recommended protocol cannot be implemented in full, a selection of tests may be adequate. Selection criteria should take into consideration the resources and expertise available and the relevance of the tests to local practices. It is worth highlighting the value of testing the AEC system, which is a simple procedure to implement and provides valuable information on the overall performance of the mammography system.

A key factor to promote the success of a QA program for mammography is teamwork and the collaboration of all key staff (e.g. radiographers, radiologists, medical physicists and healthcare managers). Training and continuous feedback mechanisms are essential to improve the testing procedures and strengthen the outcomes of the program.

Also, the use of professional networks and special interest groups to exchange experiences with colleagues worldwide can be of great value in the initial phases of implementation of a mammography QA program.