Introduction

The GeneXpert assay (Cepheid Inc, Sunnyvale, CA, USA) can simultaneously detect Mycobacterium tuberculosis (MTB) and rifampicin (RIF) resistance, and it has been endorsed by the World Health Organization (WHO) since 2010 [1]. The GeneXpert MTB/RIF (Xpert) and its updated version, GeneXpert MTB/RIF Ultra (Ultra), have been shown to detect MTB in common extrapulmonary samples [2], such as urine, cerebrospinal fluid (CSF) [3], and stool [4]. The Shanghai Public Health Clinical Center, a designated hospital for infectious diseases, has performed the Xpert assay since April 2017. The Xpert assay is a rapid, automated, cartridge-based nucleic acid amplification test that requires only a few manual steps. Failure of Xpert tests occurs infrequently but has become a point of contention in our ongoing discussions, and studies of unsuccessful Xpert tests in routine practice are lacking. When a test fails, technicians usually have to decide between retesting the remaining mixture of sample and sample reagent (SR) and collecting a new sample for retesting, perhaps more than two hours later (the run time of the Xpert assay is approximately two hours). Limited evidence is available to guide this decision, and the causes of and potential solutions for unsuccessful Xpert tests in clinical settings remain unclear. To address this issue, we conducted a retrospective analysis of unsuccessful tests and an experiment to investigate how preprocessing clinical sputum for 15 min, 3 h, and 6 h affects Xpert results. The findings may help resolve samples that yield unsuccessful results on initial Xpert testing.

Methods

Analysis of the unsuccessful tests

The analysis of unsuccessful MTB detection tests with the Xpert and Ultra assays was a retrospective study. Test data were collected at the Shanghai Public Health Clinical Center from April 2017 to April 2021. Records of unsuccessful tests on common clinical samples, such as sputum, urine, CSF, and stool, were collected and analyzed. Unsuccessful Xpert tests were those reported by the instrument as “Error,” “Invalid,” or “No result.” The details and codes of the unsuccessful Xpert tests were recorded as part of the instrument description.

All processes and operations involving clinical samples were performed according to Cepheid’s instructions. Raw sputum was used without N-acetyl cysteine-NaOH decontamination or concentration. Urine and other body fluids were concentrated by centrifugation at 3000 × g for 15 min, leaving at least 1 mL for testing. Sample reagent (SR) buffer containing NaOH and isopropanol was added to the clinical samples at a ratio of at least 1:1. The ratio of SR to sample depended on the sample type: 1:1 for thin specimens, such as CSF and saliva sputum, and 2:1 for other sample types. To fully liquefy the sample, the sample and SR were mixed and incubated at room temperature for 15 min. After complete liquefaction, at least 2.0 mL of the mixture was transferred into each Xpert cartridge, which was then loaded onto the Xpert instrument. Stool samples from children (aged < 15 years) were processed as previously described [4].
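The SR ratios described above can be expressed as a small calculation. The sketch below is illustrative only: the function names, the set of thin specimen types, and the check against the 2.0 mL transfer volume are assumptions made for the example and are not taken from Cepheid’s instructions.

```python
# Specimen types treated as "thin" (1:1 SR-to-sample ratio) in the text.
# The exact set is an assumption for this example.
THIN_SPECIMENS = {"csf", "saliva sputum"}


def sr_volume_ml(sample_type: str, sample_volume_ml: float) -> float:
    """Return the SR volume (mL) to add: 1:1 for thin specimens, 2:1 otherwise."""
    ratio = 1.0 if sample_type.lower() in THIN_SPECIMENS else 2.0
    return ratio * sample_volume_ml


def enough_for_cartridge(sample_type: str, sample_volume_ml: float,
                         minimum_ml: float = 2.0) -> bool:
    """Check whether sample plus SR reaches the minimum volume per cartridge."""
    total = sample_volume_ml + sr_volume_ml(sample_type, sample_volume_ml)
    return total >= minimum_ml


print(sr_volume_ml("CSF", 1.0))        # SR volume for 1 mL of CSF
print(sr_volume_ml("sputum", 1.0))     # SR volume for 1 mL of sputum
print(enough_for_cartridge("CSF", 1.0))
```

Under these assumptions, 1 mL of CSF mixed 1:1 with SR gives exactly the 2.0 mL needed for one cartridge, which leaves nothing for a retest.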

Prolonged preprocessing test

To investigate the effect of prolonged preprocessing on clinical sputum, we further collected and tested clinical sputum samples from recruited participants. The sputum samples were preprocessed for different durations (15 min, 3 h, and 6 h) at room temperature before being tested with the Xpert assay. For the experimental group, we collected positive clinical sputum samples according to three criteria: (1) the original specimen was sufficient for three Xpert tests, (2) MTB was detected in the first Xpert test, and (3) at least 50% of the positive samples were reported as “Low” or “Very low” (VL), to maximize the chance of detecting a small drop in MTB detection during preprocessing. Samples in the negative control group met the following criteria: (1) the original specimen was sufficient for three Xpert tests, (2) the first Xpert test was negative for MTB, and (3) the sputum was collected from a patient with pulmonary tuberculosis or with a history of a positive Xpert test. All enrolled samples were tested using Xpert cartridges from lot number 1000155396.

Related literature screening

To gain a better understanding of the unsuccessful tests of Xpert, we conducted a systematic review of the literature available on PubMed (https://pubmed.ncbi.nlm.nih.gov/). We retrieved relevant literature published up to 1 May 2021 using the search terms “Xpert” and “tuberculosis” along with the keywords “unsuccessful” or “failure”. Studies that did not provide a comprehensive description of Xpert failure were excluded from the analysis.

Statistical analysis

The study used the Mann-Whitney U test to assess the significance of continuous variables and the Pearson chi-square test for categorical variables. Statistical significance was set at p < 0.05. Logistic regression analysis was employed to evaluate the risk factors associated with unsuccessful Xpert tests. All calculations were performed using SPSS version 19.0 software (IBM, Armonk, NY, USA). Please contact the authors for access to the study data.

Results

“Error” was the primary cause of unsuccessful Xpert tests

Over the four-year study period, a total of 11,314 clinical tests were performed, 10,912 with Xpert and 402 with Ultra. Of these, 268 (2.37%) were unsuccessful: 247 (2.26%) of the Xpert tests and 21 (5.22%) of the Ultra tests (Table 1). All Xpert cartridges used in our laboratory were version G4.

Table 1 Unsuccessful tests of Xpert and Ultra in our laboratory from Apr 2017 to Apr 2021

“Error” was the result most frequently reported by the instrument for unsuccessful tests, accounting for 221 (82.46%) of the 268 unsuccessful tests, followed by “Invalid” with 43 tests and “No result” with 4 tests (0.38% and 0.04% of all 11,314 tests, respectively). Notably, all four “No result” tests occurred in sputum samples. The failure rate of Ultra was significantly higher than that of Xpert (p < 0.001).
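The comparison of failure rates between Xpert and Ultra can be checked from the counts reported above. The following is a minimal sketch, assuming a Pearson chi-square test without continuity correction on the 2×2 table of unsuccessful and successful tests. The function name is illustrative, and the SPSS output used in the study may differ slightly.

```python
import math


def pearson_chi2_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the chi-square upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: Xpert, Ultra. Columns: unsuccessful, successful.
chi2, p_value = pearson_chi2_2x2(247, 10_912 - 247, 21, 402 - 21)
print(f"chi2 = {chi2:.2f}, p = {p_value:.5f}")
```

Under these assumptions the p-value falls below 0.001, in line with the reported result.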

The failure rate of Xpert varies depending on the samples being tested

We analyzed the performance of the Xpert tests by sample type, as shown in Table 2. Sputum was the most common clinical specimen in Xpert tests, accounting for 5,247 tests, of which 114 (2.17%) were unsuccessful. The other respiratory sample type was bronchoalveolar lavage fluid, which accounted for 935 tests, of which 19 (2.03%) were unsuccessful. The failure rates of the Xpert assay in extrapulmonary samples were as follows: 2.27% (22/968) for CSF, 2.74% (25/914) for serous membrane fluid, 1.94% (16/826) for puncture fluid, 2.08% (17/819) for secreta, 3.14% (12/382) for gastric juice, 0.27% (1/369) for urine and 4.69% (15/320) for stool. Stool had a higher failure rate in both the Xpert (4.69%) and Ultra (4.28%) tests than other samples such as sputum (2.17%) and gastric juice (3.14%). As we described previously, 81.34% (327/402) of the Ultra tests were used to detect MTB in stool samples from children. There was no significant difference (p = 0.803) in the failure rates between Xpert (4.69%) and Ultra (4.28%) for the clinical stool samples. With sputum as the reference category in the logistic regression model, the odds ratio for an unsuccessful Xpert result in extrapulmonary specimens was 0.12 (95% CI: 0.02–0.88, χ2 = 6.22, p = 0.021) for urine and 2.21 (95% CI: 1.28–3.84, χ2 = 8.43, p = 0.004) for stool. Additionally, the odds ratio was 1.93 (95% CI: 1.09–3.40, χ2 = 5.35, p = 0.014) for stool in Ultra. Thus, urine specimens had a lower risk of failure than sputum samples in Xpert, whereas stool samples had approximately twice the failure rate of sputum samples in both Xpert and Ultra. Overall, the failure rates did not differ significantly between respiratory and extrapulmonary samples in the Xpert assay.
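The Xpert odds ratios for urine and stool relative to sputum can be approximated directly from the counts above. The sketch below assumes that the logistic regression model contained sample type as its only predictor, in which case the odds ratio equals the cross-product ratio of the 2×2 table and the Wald 95% confidence interval uses the standard error of the log odds ratio. The function name is illustrative, and this is not the SPSS procedure used in the study.

```python
import math


def odds_ratio_ci(fail: int, total: int, ref_fail: int, ref_total: int,
                  z: float = 1.96) -> tuple[float, float, float]:
    """Odds ratio of failure versus a reference group, with a Wald confidence interval."""
    a, b = fail, total - fail              # group of interest: failures, successes
    c, d = ref_fail, ref_total - ref_fail  # reference group: failures, successes
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Reference group: sputum in Xpert (114 unsuccessful of 5,247 tests).
urine = odds_ratio_ci(1, 369, 114, 5_247)
stool = odds_ratio_ci(15, 320, 114, 5_247)
print("urine: OR = %.2f (95%% CI %.2f-%.2f)" % urine)
print("stool: OR = %.2f (95%% CI %.2f-%.2f)" % stool)
```

Under these assumptions, the estimates for urine and stool in Xpert agree with the reported odds ratios and intervals to two decimal places.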

Table 2 Composition of clinical samples tested with the Xpert assay in our laboratory from April 2017 to April 2021

The effect of prolonged preprocessing

We randomly collected 120 clinical sputum samples (100 positive and 20 negative) according to the criteria described above. Each sample was tested at 15 min, 3 h, and 6 h after the SR was mixed in. Of the 100 positive samples, 82 yielded the same Xpert result at all three time points. The remaining 18 samples gave results after prolonged treatment (3–6 h) that differed from those obtained at 15 min.

Seven samples showed a higher MTB load after prolonged treatment (3–6 h) than in the initial test after 15-minute liquefaction. Five samples showed a lower MTB load. Three samples were reported as VL for MTB with rifampicin resistance “Not detected” (ND) after 15-minute liquefaction but as ND for MTB after 3–6 h of preprocessing. Two samples were reported as VL for MTB in all three tests but differed in the rifampicin resistance result, which was either ND or “Indeterminate”. Lastly, one sample was reported as “Low” for MTB with rifampicin resistance ND after 15-minute liquefaction, as “Medium” for MTB with rifampicin resistance ND in the 3-hour retest, and again as “Low” for MTB with rifampicin resistance ND in the third test after 6-hour preprocessing. None of the negative samples in the control group tested positive for MTB. The 18 samples with inconsistent results after preprocessing for 15 min, 3 h, and 6 h are listed in Table 3.
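The breakdown of discordant samples can be tallied to confirm that the categories add up to the 18 samples reported. The following is a minimal sketch using only the counts given above; the category labels are paraphrases introduced for the example.

```python
# Discordant positive samples by pattern (counts from the text; labels paraphrased).
discordant = {
    "higher MTB load after 3-6 h": 7,
    "lower MTB load after 3-6 h": 5,
    "Very low at 15 min, MTB not detected after 3-6 h": 3,
    "Very low throughout, rifampicin result changed": 2,
    "Low at 15 min, Medium at 3 h, Low at 6 h": 1,
}

positives, negatives = 100, 20

n_discordant = sum(discordant.values())
concordant_positives = positives - n_discordant
# All negative controls remained negative, so they count as concordant.
overall_concordance = (concordant_positives + negatives) / (positives + negatives)

print(n_discordant, concordant_positives, f"{overall_concordance:.0%}")
```

The tally gives 18 discordant and 82 concordant positive samples, and 102 of 120 samples (85%) with unchanged results overall.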

Table 3 Tests with inconsistent Xpert results after preprocessing for 15 min, 3 h and 6 h

Review of related literature

Eight studies were ultimately included in the analysis (Table 4), accounting for a total of 221,526 tests, of which 9.01% (19,970/221,526) failed to report a valid result. Excluding the study by Basant Joshi in Nepal (1,740 failures in 23,057 tests), which did not list details of the failed tests [8], the majority of unsuccessful tests were categorized as “Error” (55.77%, 10,167/18,230), followed by “Invalid” (25.40%, 4,631/18,230) and “No result” (18.83%, 3,432/18,230).
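The pooled proportions above follow from simple arithmetic on the reported counts. The sketch below reproduces them; the variable names are illustrative, and labelling the third category “No result” is an assumption based on the three failure types reported by the instrument.

```python
total_tests = 221_526
total_failures = 19_970
failures_without_detail = 1_740  # the one study that did not list failure types

# Failures with a reported type; the third label is an assumption (see lead-in).
by_type = {"Error": 10_167, "Invalid": 4_631, "No result": 3_432}

classified = total_failures - failures_without_detail
# The typed failures should account for every classified failure.
assert sum(by_type.values()) == classified

overall_rate = round(100 * total_failures / total_tests, 2)
shares = {name: round(100 * count / classified, 2) for name, count in by_type.items()}

print(overall_rate)
print(shares)
```

The three typed counts sum to 18,230, which is why that figure, rather than 19,970, is the denominator for the category percentages.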

Table 4 Eight published studies in PubMed reporting unsuccessful tests of the Xpert assay

Discussion

In this study, we present an analysis of unsuccessful tests for MTB detection using the Xpert assay in a clinical laboratory in Shanghai. Our findings indicate that the Xpert assay had a failure rate of 2.37% (268/11,314), and that “Error” was the predominant result, accounting for 82.46% (221/268) of all unsuccessful tests. This is consistent with previous research that has reported “Error” as the main type of test failure, ranging from 62.03% in Swaziland [5] to 93.36% in Ethiopia [6], but differs from the findings of M. Gidado’s study in Nigeria [7]. Our failure rate was lower than those in the retrieved studies, which may be due to our better infrastructure, for example, a stable power supply [7] and temperature control in the laboratory [8].

Our data differ from those of other studies in one obvious respect: almost half of the Xpert tests in our laboratory were performed on extrapulmonary specimens. Due to the high incidence of tuberculosis in China, many patients with extrapulmonary tuberculosis (EPTB) need to be diagnosed. According to Jim E. Banta and Yu Pang, the prevalence of EPTB in China is higher than in other countries. In the United States, 24.5% of hospitalized tuberculosis patients had EPTB [9], whereas in Beijing, the proportion was 33.4% of tuberculosis patients [10]. Published studies support the use of the Xpert assay for extrapulmonary samples [2, 11]. Our study also found that the failure rate in extrapulmonary samples, except for stool and urine samples, was similar to that of sputum specimens, which supports the use of the Xpert assay in EPTB diagnosis. Furthermore, urine specimens had a lower failure rate, while stool samples had approximately twice the failure rate of sputum samples in Xpert. Several studies have reported similar results in stool. In a study from Uganda, the rate of invalid Xpert results in stool (2/71, 2.81%) was much higher than in sputum (2/350, 0.6%) [12]. Basti Andriyoko also found that the rate of unsuccessful Xpert tests was higher in stool (6/40, 15%) than in respiratory samples (1/30, 3%) [13]. Even though published studies emphasize the higher sensitivity of Ultra compared to Xpert [14, 15], we did not observe a significant difference in the failure rates of Ultra (4.28%) and Xpert (4.69%) tests in children’s stool samples (p = 0.803).

Analyzing unsuccessful tests in the Xpert assay is essential for quality and cost control in the clinical laboratory. Information from these tests, such as the cartridge version and the kit lot number, should be recorded. It has been reported that earlier versions of Xpert cartridges (G3) had varying error rates in laboratories [16], but Tefera Agizew reported that the error rates were similar [17]. Unsuccessful tests undoubtedly increase the cost and turnaround time of Xpert, and reports labelled “Test Failure” may be difficult for patients to accept. According to Cepheid’s advice, retesting the remaining sample and SR mixture or collecting a new sample for retesting can help resolve these unsuccessful tests, as most clinical laboratories already do [6, 8]. Neeraj Raizada reported that 86.85% of specimens obtained valid results after a single repeat test [8]. A study by Abebaw Kebede in Ethiopia reported that 84.48% of tests succeeded on retesting of the remaining treated samples or newly collected samples [6]. Considering the direct cost of these tests in Shanghai (40 USD per Xpert MTB/RIF cartridge), the failures in Xpert have significantly and directly increased the operating cost of our laboratory.
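The direct cartridge cost of the failed Xpert tests can be estimated from the figures above. This is a rough sketch under stated assumptions: each unsuccessful Xpert test is assumed to have consumed exactly one cartridge at 40 USD, Ultra tests are excluded because their unit price is not given, and the cost of the repeat test, labour, and delay is ignored. The function name is illustrative.

```python
def wasted_cartridge_cost(failures: int, unit_cost_usd: float) -> float:
    """Direct cost of cartridges consumed by unsuccessful tests."""
    return failures * unit_cost_usd


XPERT_FAILURES = 247          # unsuccessful Xpert tests, April 2017 to April 2021
COST_PER_CARTRIDGE_USD = 40   # Xpert MTB/RIF cartridge price in Shanghai

wasted_usd = wasted_cartridge_cost(XPERT_FAILURES, COST_PER_CARTRIDGE_USD)
print(wasted_usd)
```

Under these assumptions the failed Xpert cartridges alone represent 9,880 USD over the study period; the true cost is higher once repeat tests are included.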

Because the Xpert assay is highly automated, retesting the remaining sample and SR mixture means, in practice, that preprocessing has been prolonged. Few studies have confirmed the feasibility of this approach. In our experiment, 85% (102/120) of the samples yielded the same MTB and rifampicin resistance results at all three time points. Seven samples showed a higher MTB load after longer preprocessing, which is consistent with Danica Hleb’s study, in which the tuberculocidal effect of SR increased with the length of exposure [18]. In addition, 8 samples showed a decrease in MTB load after prolonged preprocessing, including 3 samples that were reported as VL for MTB after 15 min of processing but as ND after 3–6 h of exposure to SR. This observation is inconsistent with the findings of Padmapriya P. Banada and their team, who found that SR incubation could be prolonged up to 72 h without a further decrease in MTB detection by Xpert [19]. In their experiment, MTB-negative sputum samples were spiked with MTB strain H37Rv at a final concentration of 60 CFU/ml. These artificial sputum samples were then incubated with SR at a 2:1 ratio for 15 min, 5 h, 8 h, 24 h, 3 days and 7 days before being tested using Xpert. They did not observe a decrease in MTB detection until the SR had been mixed in for more than 72 h. Even with only 60 CFU/ml of MTB, which is less than half of the limit of detection of Xpert [18], they still detected MTB in 22–38% of the sputum samples. We infer that the difference may arise from the sputum samples tested: we used sputum from clinical patients, whereas they used five groups of artificial sputum. Another difference is that they prepared varying numbers of sputum samples each time, and their replicates tested at different durations were not the same samples, whereas the samples in our experiment remained the same throughout the tests. In addition, we found three cases that reported inconsistent rifampicin resistance results after prolonged preprocessing. This finding is consistent with the study by Padmapriya P. Banada [19]. The inconsistencies may be due to the MTB concentrations in the two samples being in the grey area of Xpert. Moreover, false rifampicin resistance determinations due to mutations chemically induced by NaOH have been reported [19, 20]. We also noticed that inconsistent results were not found in sputum with a high MTB load. A possible explanation is that the high background conceals the subtle changes that occur during prolonged preprocessing in our test. In conclusion, samples that have been preprocessed with SR for prolonged periods should not be retested, especially in paucibacillary cases.

Another factor that influences Xpert tests is the volume of SR added during preprocessing. In the Xpert assay procedure, the SR is used to liquefy the sample, reduce biohazard, and inactivate PCR inhibitors [21]. In practice, SR appears to be unnecessary in several studies in which Xpert tests were performed directly on CSF samples [22, 23]. Logically, more SR would reduce the bacterial load of MTB in the sample and could even alter the Xpert test results, especially in paucibacillary cases. Nila J. Dharan and her team showed that reducing the amount of SR added can improve the sensitivity of the Xpert test on sputum [24]. In clinical practice, we have observed that an equal volume of SR can satisfactorily liquefy CSF or saliva sputum after thorough mixing. However, the influence of the SR volume on the preprocessing of different clinical samples remains unclear, and further research is needed.

Conclusions

“Error” was the most frequently reported result for unsuccessful tests in the Xpert assay. Although failure rates vary by sample type, the Xpert assay can be applied to extrapulmonary samples. For paucibacillary specimens, retesting the remaining preprocessed mixture should be carefully considered when the Xpert test is unsuccessful.