Central laboratory monitoring of concentration, homogeneity and stability of gDNA
To assign a nominal DNA copy number concentration to the gDNA material and to define the homogeneity of the gDNA units, initial monitoring was carried out in laboratory 1. Here, five gDNA units (units H1–H5) were analysed in duplicate on a QX100 system immediately after the gDNA extraction from a single WVM unit (unit W10). The stability of the test batch was checked during the inter-laboratory study by analysing three gDNA units (units G1–G3) that had been stored for 4 months at −20 °C. The mean DNA copy number concentration of the gDNA test material (combined DNA concentrations from units H1–H5 and G1–G3) (±expanded standard error; k = 2.78) was estimated at 979 (±59) cp/μL (ESM, Fig. S2), and the gDNA units H1–H5 were homogeneous in terms of the DNA copy number concentration (p > 0.41; ANOVA, 95% confidence level). By comparing units G1–G3 with units H1–H5, the stability of the gDNA units after the 4 months of storage at −20 °C was also confirmed (p > 0.9; ANOVA, 95% confidence level) (ESM, Fig. S2).
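The reported form "mean (±expanded standard error)" can be illustrated with a minimal sketch. It assumes that the expanded standard error is the coverage factor k multiplied by the standard error of the mean; the unit values below are hypothetical and are not the study data, and the function name is illustrative only.

```python
import math
import statistics

def mean_with_expanded_se(values, k):
    """Return (mean, k * standard error of the mean) for a list of values."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = statistics.fmean(values)
    # Standard error of the mean from the sample standard deviation
    se = statistics.stdev(values) / math.sqrt(n)
    return mean, k * se

# Hypothetical per-unit concentrations in cp/uL (assumption, not study data)
unit_means = [950.0, 1010.0, 980.0, 1000.0, 960.0]
mean, expanded_se = mean_with_expanded_se(unit_means, k=2.78)
print(f"{mean:.0f} (+/-{expanded_se:.0f}) cp/uL")
```

The coverage factor is passed explicitly, so the same helper can be used with any k chosen for a given number of units.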
The two different HCMV test materials comprised the locally extracted gDNA from the purchased WVM units (e.g. Fig. 1A, units W1–W9), and the centrally prepared gDNA units (e.g. Fig. 1B, units G1–G9) that were distributed to the participating laboratories. These were each quantified in three different laboratories using two or three different dPCR platforms (Table 1). In each laboratory, three WVM units and three gDNA units were tested. From each WVM unit and gDNA unit, six aliquots were locally prepared, three aliquots for subsequent analysis on one dPCR platform (laboratories 1–4) and three aliquots for analysis on another dPCR platform (laboratories 1, 2). For each dPCR instrument, three aliquots derived from each of the three WVM units and three gDNA units were each tested in duplicate in one of three consecutive experiments (e.g. Fig. 1A, aliquots 1–3 from unit W5 were tested in experiments 1–3), to determine the intermediate precision and the inter-unit variability within each dPCR instrument and laboratory. Additionally, for each dPCR instrument, the mean DNA copy number concentration and the corresponding expanded measurement uncertainty were estimated. For each of the three WVM units and three gDNA units, three of the four participating laboratories reported complete data for the three units tested in three experiments on one or two dPCR platforms (ESM, Tables S1–S6). The exception here was laboratory 1, which reported only two complete sets of data due to technical problems with the Biomark system (ESM, Table S1). With all of the instruments, the NTCs and negative extraction controls were negative, except for the QuantStudio 3D system, where one false positive partition was noted in two NTCs. This can occur due to cross-contamination between samples or because of non-specific binding of primers.
However, due to the high DNA copy number concentrations used in this inter-laboratory study, the occurrence of a single false positive partition in the NTCs did not bias the subsequent interpretation of the data.
Intra-experiment variability, intermediate precision and agreement between experiments
For each dPCR instrument, three experiments were performed, each with one of three aliquots derived from each of three WVM units and three gDNA units, analysed in duplicate. Grubbs' tests were used to identify outliers, which were excluded from further analysis (ESM, Table S12). With each of these HCMV test materials, low CVs related to intra-experiment variability were observed among the different dPCR instruments, as the great majority of the duplicates (e.g. Fig. 1A, two replicates of aliquot 1 from unit W5) had CVs <10%, with some CVs between 10 and 40%, mostly for the Biomark system. The low CVs related to the intra-experiment variability of the QX100 system and the Biomark 37K array are in agreement with other reports using HCMV DNA and bacterial DNA [26, 27]. The higher CVs observed with the Biomark 37K array compared to the other two dPCR platforms might have been due to the >25-fold smaller number of analysed partitions, and/or to pipetting errors related to the smaller sample volumes [23, 27].
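The effect of the number of analysed partitions on precision can be sketched with the standard Poisson correction used in dPCR, λ = −ln(1 − p), where p is the fraction of positive partitions. The relative standard deviation below is the usual delta-method approximation for binomial sampling of partitions. The partition counts (770 and 20,000) are assumed, illustrative values chosen to differ by roughly 26-fold, and the function names are hypothetical.

```python
import math

def copies_per_partition(positive, total):
    """Poisson-corrected mean copies per partition: lambda = -ln(1 - p)."""
    if not 0 < positive < total:
        raise ValueError("need 0 < positive < total")
    return -math.log(1.0 - positive / total)

def relative_sd(positive, total):
    """Approximate relative SD of lambda due to partition sampling alone."""
    p = positive / total
    lam = copies_per_partition(positive, total)
    # Delta method: Var(lambda) ~ p / (total * (1 - p))
    return math.sqrt(p / (total * (1.0 - p))) / lam

# Same positive fraction (50%), different assumed partition counts
for label, total in [("fewer partitions", 770), ("more partitions", 20000)]:
    positive = total // 2
    print(label, f"{100 * relative_sd(positive, total):.1f}%")
```

At a fixed positive fraction, the sampling-related relative SD scales with the inverse square root of the partition count, so a roughly 26-fold smaller number of partitions gives an approximately fivefold larger contribution from this source. Pipetting and other error sources are not modelled here.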
With all of the dPCR instruments, low CVs related to intermediate precision were noted (CVs below 25%). Moreover, with each HCMV test material tested with each dPCR instrument, there were no statistically significant differences in the mean DNA copy number concentration between the three consecutive experiments when all six measurements per experiment (three units in duplicate) were taken into account (Fig. 2, ESM, Table S13). The low CVs related to intermediate precision are in agreement with previous reports from individual laboratory data for HCMV DNA and different bacterial DNA template types [26, 28]. This finding provides additional support for the indications that dPCR is suitable as a reference measurement procedure, as it provides very precise quantification of viral DNA within and between experiments.
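As an illustration of the two precision measures discussed here, the following minimal sketch computes a CV for each duplicate (intra-experiment variability) and a CV across the experiment means (one simple reading of intermediate precision). The exact calculation used in the study is not given in this section, so this approach is an assumption, and the duplicate values are hypothetical.

```python
import statistics

def cv_percent(values):
    """Coefficient of variation in percent (sample SD divided by mean)."""
    return 100.0 * statistics.stdev(values) / statistics.fmean(values)

# Hypothetical duplicate measurements (cp/uL) of one unit in three experiments
duplicates = {
    "experiment 1": (1000.0, 1040.0),
    "experiment 2": (950.0, 970.0),
    "experiment 3": (1020.0, 1060.0),
}

# Intra-experiment variability: CV of each duplicate pair
intra_cv = {name: cv_percent(pair) for name, pair in duplicates.items()}

# Intermediate precision (simplified): CV of the three experiment means
experiment_means = [statistics.fmean(pair) for pair in duplicates.values()]
intermediate_cv = cv_percent(experiment_means)

for name, cv in intra_cv.items():
    print(f"{name}: duplicate CV = {cv:.1f}%")
print(f"intermediate precision CV = {intermediate_cv:.1f}%")
```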
With the gDNA test material, each of the four participating laboratories measured three different gDNA units that were centrally prepared in laboratory 1. In contrast, for the WVM test material, the DNA from three WVM units was locally extracted and analysed in each of the three laboratories. With all of the dPCR instruments, the centrally prepared gDNA units showed only minor inter-unit variability, as the differences in mean DNA copy number concentration between these gDNA units were below 14% and were mostly not statistically significant (Table 2; ESM, Fig. S3). On the other hand, for the WVM units, where the gDNA was extracted locally, there was higher inter-unit variability compared to the centrally prepared gDNA units, with the highest difference between the WVM units with the Biomark 37K array from laboratory 2, with 55% higher mean DNA copy number concentration obtained from unit W4 compared to unit W6 (Table 2; ESM, Fig. S4). The low CVs related to inter-unit variability of the centrally prepared gDNA units confirm their homogeneity when analysed in each of the participating laboratories. This is in agreement with other reports where simple DNA templates that did not require DNA extraction were used (e.g. gDNA, plasmid DNA) [28, 29]. Despite the statistically significant differences between the centrally prepared gDNA units tested on the QuantStudio 3D system, there were low CVs related to intra-experiment variability (CVs for duplicates, <10%) and intermediate precision (CVs for three experiments, <1%), which explains the statistical significance of the <12% difference between these gDNA units. With the WVM material tested in laboratories 1 and 2, most of the differences between the three analysed WVM units were constant, as they were observed with both dPCR platforms.
With the low filling-volume related uncertainty and high stability of the WVM units that are claimed by the manufacturer, it is reasonable to assume that the inter-unit variability was introduced during the DNA extraction, as the DNA was locally extracted from each individual WVM unit. This is in agreement with previous studies where up to 50% difference was noted between DNA-extraction replicates quantified using the same qPCR assay [30, 31]. With PCR-based DNA quantification, estimation of DNA copy number concentration can be influenced by variable DNA recovery and/or insufficient removal of PCR inhibitors during DNA extractions. However, as dPCR platforms are considered to be relatively robust to potential inhibitory substances that might have remained during the DNA extraction [11, 12], it is likely that the differences in the estimated mean DNA copy number concentration between the WVM units were mostly caused by variable DNA recovery upon extraction. This is in agreement with previously reported data with HCMV, where intermediate variability was noted between extraction replicates analysed by dPCR within a single laboratory [20, 31]. With the QX100 system, the differences between the WVM units had higher statistical significance in comparison to the Biomark 37K array. This finding suggests that the QX100 system provides more precise discrimination between the WVM units. This is due to the lower CVs related to the intra-experiment variability of the QX100 system compared with the Biomark 37K array, as previously discussed.
Measurement uncertainties of each dPCR instrument
With each HCMV test material and dPCR instrument, the mean DNA copy number concentrations and corresponding expanded measurement uncertainties were calculated by taking into account the three WVM units or gDNA units, each of which was divided into three aliquots that were each measured in one of three experiments. For every dPCR instrument, small expanded measurement uncertainties (<18%) were obtained for the centrally prepared gDNA test material, whereas with a more complex material (i.e. WVM) that requires local DNA extraction, higher expanded measurement uncertainties (<28%) were noted (Fig. 2, Table 3). This is in agreement with a previous inter-laboratory study on bacterial DNA. Additionally, in two other assessments using simple and well-defined DNA templates [29, 33], lower expanded measurement uncertainties (<6%) were observed for the QX200 system, the Biomark 12.765 arrays and other dPCR platforms compared to this study. Hence, dPCR offers a very precise estimation of the DNA copy number concentration. However, the final measurement uncertainty is dependent on the complexity of the DNA material, with the local DNA extraction resulting in additional uncertainty components, i.e., leading to higher measurement uncertainty. In the present study, within laboratories 1 and 2, smaller measurement uncertainties were seen for the QX100 system compared to the Biomark 37K array (Fig. 2, Table 3), which is in agreement with another report where a QX100 system and a Biomark 12.765 array were compared. This might arise from the higher CVs related to the intra-experiment variability that was noted on the Biomark 37K system. As no such differences between the QX100 and the Biomark 37K arrays were observed in the inter-laboratory study on bacterial DNA, this might be attributed to the study setup and pipetting errors.
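How an additional component such as local DNA extraction increases the expanded measurement uncertainty can be shown with a minimal sketch. It assumes independent relative standard uncertainties that are combined in quadrature and expanded with a coverage factor k = 2; the uncertainty budget actually used in the study is not described in this section, and the component names and values below are hypothetical.

```python
import math

def expanded_uncertainty(relative_components, k=2.0):
    """Combine independent relative standard uncertainties and expand by k."""
    combined = math.sqrt(sum(u ** 2 for u in relative_components))
    return k * combined

# Hypothetical relative standard uncertainties (fractions, not study data)
gdna_budget = {"between experiments": 0.03, "between units": 0.04}
# Same budget plus an assumed extraction-related component for WVM
wvm_budget = {**gdna_budget, "DNA extraction": 0.08}

u_gdna = expanded_uncertainty(gdna_budget.values())
u_wvm = expanded_uncertainty(wvm_budget.values())
print(f"gDNA: {100 * u_gdna:.1f}%  WVM: {100 * u_wvm:.1f}%")
```

Because the components add as squares, a single large contribution (here the assumed extraction term) dominates the combined value.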
In laboratories 1 and 2, two different dPCR platforms were used. For both HCMV test materials analysed on the Biomark 37K array in laboratory 1, the mean DNA copy number concentration was approximately 8% higher than that measured on the QX100 system (Table 3). The opposite was noted in laboratory 2, where both of the HCMV test materials showed 17% lower mean DNA copy number concentrations when measured on the Biomark 37K array compared with those measured using the QX100 system. The high intra-laboratory agreement between the QX100 system and the Biomark 37K array has already been observed in two other studies [23, 27]. Additionally, low discrepancies were observed between the other dPCR platforms [14, 20, 33].
Within laboratories 1 and 2, the differences between the QX100 system and the Biomark 37K array were very consistent, as the differences in the DNA copy number concentration between both of these platforms were similar, regardless of the HCMV test material used. Similar consistency between these two platforms has already been noted for three different types of bacterial DNA.
However, there was disagreement noted between laboratories 1 and 2, where for the QX100 system, higher (laboratory 2) and lower (laboratory 1) mean DNA copy number concentrations were measured compared to the Biomark 37K array. Although in both laboratories the differences between those two platforms were smaller than the expanded measurement uncertainty of each platform, this pattern was noted with both test materials. Similar inconsistent data have been reported previously for plasmid DNA, which suggests that such discrepancies between platforms are not always systematic, but can be random; however, the reasons for such random discrepancies are not yet completely understood. As the same assay, and the same HCMV test materials and cycling conditions were used in all of the laboratories, over-estimation and under-estimation of the DNA copy number concentrations and discrepancies between laboratories might be due to either the use of different master mixes, or to incorrectly assigned partition volumes [15, 23, 27]. For both dPCR platforms, various lot numbers of the particular master mixes were used in the different laboratories, which might partially contribute to these observed discrepancies between the platforms. With the QX100 system and the Biomark 12.765 array, >10% difference in partition volume was reported from an independent assessment carried out in several laboratories [14, 33–35]. Furthermore, with the Biomark 12.765 array, around a 7% difference in chamber volume was found between two arrays measured in the same laboratory. The discrepancy between the QX100 system and the Biomark 37K array observed in the present study might therefore arise from variable partition volumes of the different Biomark 37K array lots and/or differences between droplet volumes generated and analysed for the QX100 systems from different laboratories.
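The influence of an incorrectly assigned partition volume can be made concrete with a short sketch. In dPCR, the concentration is the Poisson-corrected number of copies per partition divided by the partition volume, so any relative error in the assigned volume carries over directly into the reported concentration. The partition counts and volumes below are hypothetical, and the function name is illustrative only.

```python
import math

def concentration_cp_per_ul(positive, total, partition_volume_ul):
    """Copies per microlitre of reaction from partition counts and volume."""
    lam = -math.log(1.0 - positive / total)  # mean copies per partition
    return lam / partition_volume_ul

# Hypothetical run: 5000 positive partitions out of 15000
positive, total = 5000, 15000
assigned_ul = 0.85e-3            # assumed volume used by the software (0.85 nL)
actual_ul = assigned_ul * 0.9    # assumed true volume, 10% smaller

reported = concentration_cp_per_ul(positive, total, assigned_ul)
actual = concentration_cp_per_ul(positive, total, actual_ul)
bias_percent = 100.0 * (reported / actual - 1.0)
print(f"reported {reported:.0f} cp/uL, actual {actual:.0f} cp/uL, "
      f"bias {bias_percent:.1f}%")
```

In this example, partitions that are 10% smaller than assigned lead to a 10% under-estimation; the direction of the bias reverses if the true volume is larger than the assigned one, which is compatible with opposite platform differences in different laboratories.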
Inter-laboratory agreement and mean DNA copy number concentrations
The mean DNA copy number concentrations for each HCMV test material were measured on each dPCR instrument (i.e. two QX100 systems, two Biomark systems, one QuantStudio 3D) from the four laboratories. With the gDNA test material, the differences between the laboratories did not exceed the differences within each laboratory, as the maximum difference in mean DNA copy number concentration between the dPCR instruments from two laboratories was <20% (Table 3). Furthermore, no statistically significant differences were observed between the majority of the instrument pairs (ESM, Fig. S5A). Between laboratories 1 and 2, there was only a minor difference in mean DNA copy number concentrations when the mean DNA copy number concentrations from the dPCR platforms within each laboratory were taken into account. Conversely, with the more complex DNA material of WVM, a 62% difference in the mean DNA copy number concentration was noted between the two Biomark instruments from different laboratories (Table 3), with statistically significant differences between most of these instrument pairs (ESM, Fig. S5B). Furthermore, in laboratory 1, approximately 40% higher mean DNA copy number concentrations were noted when compared to laboratory 2.
With the gDNA test material, the good agreement between the laboratories additionally demonstrated the high stability of the gDNA units distributed to the participating laboratories. The reasons for minor discrepancies between instruments and laboratories are still not well understood; however, they were probably primarily caused by each individual dPCR instrument, due to over-estimation or under-estimation of DNA copy number concentration, and are not directly influenced by factors related to the different laboratories. In contrast to the centrally prepared gDNA test material, the WVM required local DNA extraction before DNA quantification. DNA extraction has already been demonstrated to introduce additional variability (CV up to 50%) due to differences in DNA recoveries from extraction columns of the same manual extraction kit used by one operator within one laboratory [20, 30, 31]. To the best of our knowledge, no inter-laboratory assessment of the same DNA extraction method for quantification of viral DNA has been performed. However, it can be speculated that different operators from different laboratories would contribute to this variability. Another source of variability might be differences in composition between the suggested in-house prepared PBS (laboratory 1) and the purchased commercial PBS (laboratories 2–4), as it has been shown that the matrix can have an impact on the DNA recovery and the variability of DNA extractions. Therefore, it is reasonable to assume that in the present study, the local DNA extractions from WVM performed individually in each laboratory contributed to the higher discordance between the laboratories and instruments than that observed with the centrally prepared gDNA.
To additionally demonstrate the suitability of dPCR as a candidate reference measurement procedure of higher metrological order, its applicability for value assignments of different virus reference materials was determined. For each material, the Vangel-Rukhin estimator was selected based on several criteria published in the CCQM guidelines (ESM, Method S3, Tables S14, S15). With both materials, the data from all five of the dPCR instruments fell within narrow expanded measurement uncertainties (WVM, 15%; gDNA, 6%) of the mean DNA copy number concentrations (Fig. 3).
As the low measurement uncertainties observed in the present study are in agreement with two other inter-laboratory assessments that used bacteriophage DNA and different types of bacterial DNA [28, 29], we can conclude that dPCR offers good reproducibility for quantification of DNA of different complexities (e.g. plasmid DNA, gDNA, whole bacteria and viruses) and from different sources (e.g. viruses, bacteria, bacteriophages). To determine the suitability of dPCR as a reference measurement procedure and for characterisation of certified reference materials, the performance of dPCR should be compared to that of the qPCR method that is currently used for characterisation of virus reference materials. The use of qPCR in several inter-laboratory studies for the quantification of HCMV resulted in more than 100-fold differences between laboratories in terms of the DNA copy number concentration [2, 8]. The main reason for this variability is most probably the disparity in the quantification procedures between the participating laboratories, as different assays and DNA extraction methods, and variable calibration procedures, can impair the agreement between laboratories in terms of estimated DNA copy number concentration. In contrast to qPCR, several dPCR platforms have already been shown to be resilient to inhibitors and resistant to the influence of different PCR components, hence allowing for more accurate and robust quantification of DNA than is possible with qPCR [11, 23, 27]. The variability caused by the DNA extraction method in particular should receive special attention when dPCR is considered as a candidate for a reference measurement procedure and for characterisation of reference materials. The accuracy and robustness of dPCR-based DNA quantification can be further improved by preliminary selection of the DNA extraction method with the highest DNA recovery, while extraction replicates would probably reduce the influence of inter-column variability.
Moreover, DNA recovery of the selected extraction method should be assessed to allow the inclusion of extraction-related variability into the measurement uncertainty of quantification of the whole-virus reference material. Furthermore, where possible, dPCR-based direct quantification can bypass most of the mentioned problems, including inter-laboratory variability caused by the different operators of the extraction and clean-up procedures, as this does not require DNA extraction and has been demonstrated to provide accurate and repeatable quantification of DNA derived from different whole-virus reference materials.