SHG/TPEF-based image technology improves liver fibrosis assessment of minimally sized needle biopsies
Background and aims
Sampling size variability of liver biopsy remains a major limitation in the assessment of liver fibrosis. We aimed to evaluate the diagnostic value of a fully quantitative method (second harmonic generation/two-photon excitation fluorescence, SHG/TPEF based) in “short” liver biopsy samples.
“Virtual” biopsies of different lengths were constructed from liver biopsy samples of chronic hepatitis B (CHB) patients. Both the original and the “virtual” samples were measured by SHG/TPEF-based technology to obtain a qFibrosis score. ΔqFibrosis was defined as the difference in qFibrosis between the original biopsy and the “virtual” biopsy. An equivalence test was used to compare ΔqFibrosis with the clinically acceptable error (deviation of 0.50) in each group.
In real-world practice, the qFibrosis score increased significantly with fibrosis progression in ≥ 1.5-cm-, 1.0–1.5-cm-, and 0.5–1.0-cm-long specimens (p < 0.05), but not in ≤ 0.5-cm-long specimens (p > 0.05). In virtual biopsy samples of specified length, equivalence was confirmed in 0.5–1.0-cm- and 1.0–1.5-cm-long specimens (0.27 vs. 0.22, p < 0.001), but not in ≤ 0.5-cm-long specimens (0.53, p > 0.05). The number of cross-linked collagen fibers, the total and aggregated collagen proportionate area, and the number, length, width and perimeter of collagen strings showed excellent consistency with the original biopsy samples in 0.5–1.0-cm- and 1.0–1.5-cm-long specimens (ICC > 0.90).
SHG/TPEF-based image technology may provide useful information for evaluating CHB-related liver fibrosis in short samples (biopsy length > 0.5 cm).
Keywords: Liver fibrosis, Liver biopsy, Sampling error, Quantification
Liver biopsy has been widely accepted as the gold standard for evaluating liver fibrosis in patients with chronic liver disease. However, sampling variability remains one of its limitations in the assessment of liver fibrosis [1, 2, 3, 4].
Many studies have tried different methods to overcome this limitation. A biopsy specimen of sufficient size has been recommended to minimize sampling error and to improve the diagnostic accuracy of liver biopsies. However, the optimal length remains controversial [5, 6]. In an early study, a biopsy length of 1.5 cm was considered adequate. Thereafter, optimal biopsy sample lengths of 2.0 cm and 2.5 cm were recommended.
However, sufficient sample size cannot be guaranteed for each liver biopsy in clinical practice. Biopsy length data indicate that biopsies smaller than the current recommendations are obtained in over half of patients. In a landmark clinical study focused on fibrosis reversion in CHB patients who received entecavir therapy, only 60% of biopsies were longer than 1.0 cm. A systematic review reported a mean biopsy length of only 17.7 ± 5.8 mm. In addition, many studies using liver biopsy as the gold standard did not report biopsy length [11, 12].
An effective way to improve the diagnostic value of short liver biopsy samples is needed. In recent years, image morphometric analysis of liver biopsy samples has been applied to quantify the extent of liver fibrosis. qFibrosis (SHG/TPEF based), a structure-based quantitative assessment method, has recently been shown to perform better in diagnosing liver fibrosis than traditional collagen proportionate area (CPA) measurement. Because it comprehensively quantifies collagen structural features and collagen spatial distribution, qFibrosis was shown to be less sensitive than CPA to sampling size in animal models.
Our aims in this study were: (1) to evaluate the diagnostic value of qFibrosis (SHG/TPEF) measurements for “short” biopsy samples, and (2) to illustrate why this technique is less sensitive to sample size compared to routine CPA measurements.
Materials and methods
Clinical biopsy samples were retrospectively extracted from a prospective HBV-related fibrosis/cirrhosis cohort study. The study cohort has been described previously. In brief, the inclusion criteria were as follows: treatment-naive patients aged 18–65 years, positive for hepatitis B surface antigen (HBsAg) for more than 6 months, HBV DNA levels higher than 20,000 IU/mL (positive for HBeAg) or 2000 IU/mL (negative for HBeAg), and liver biopsy performed at baseline or at week 78 after treatment.
Percutaneous needle biopsies were obtained with real-time ultrasound guidance. Tissue samples were fixed in formalin, embedded in paraffin, and sectioned at 5 μm. All samples were stained with hematoxylin and eosin, Masson’s trichrome and reticulin for standard histological assessment. One unstained section from each biopsy was evaluated by SHG/TPEF-based imaging. Biopsy length and fragmentation were documented.
Two senior pathologists (TLW and HL), who were blinded to all data, independently evaluated all the liver biopsy samples using the METAVIR scoring system (F0, no fibrosis; F1, portal fibrosis without septa; F2, portal fibrosis with rare septa; F3, numerous septa without cirrhosis; F4, cirrhosis). Discordant cases were reviewed again to achieve consensus.
Biopsies were imaged by second harmonic generation/two-photon excitation fluorescence (SHG/TPEF) microscopy. A total of 101 collagen features were extracted from the SHG images and then normalized by the tissue area. Fifteen collagen architectural features, previously identified as meaningful, were quantified and combined into a single qFibrosis score, as described in our previous study.
Construction of virtual biopsy specimens
For randomly defined virtual biopsies, the first step was to define the starting point and the second step was to decide the specific length, from 0.1 to 1.4 cm; both were chosen with a random number generator. The virtual biopsies were then constructed at these lengths using an image processing tool with precise calibration. Meanwhile, the fragmented biopsies were constructed into virtual biopsies of different lengths from 0.2 to 1.5 cm. Finally, all samples were classified into four groups: ≤ 0.5-cm, 0.5–1.0-cm, 1.0–1.5-cm, and ≥ 1.5-cm specimens.
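The random construction described above can be illustrated with a minimal Python sketch. It is not the authors' image processing tool: the function names, the uniform distributions, the rule that a segment must fit inside the original biopsy, and the handling of group boundaries are all assumptions made for illustration.

```python
import random

def draw_virtual_biopsy(full_length_cm, rng, min_len=0.1, max_len=1.4):
    """Return (start_cm, length_cm) of a random segment of the original biopsy.

    Hypothetical helper: the start point is drawn first, then the length,
    mirroring the two steps in the text. Uniform sampling is an assumption.
    """
    if full_length_cm < min_len:
        raise ValueError("biopsy shorter than the minimum virtual length")
    # Step 1: random starting point, leaving room for the shortest segment.
    start = rng.uniform(0.0, full_length_cm - min_len)
    # Step 2: random length, capped so the segment stays inside the tissue.
    longest = min(max_len, full_length_cm - start)
    length = rng.uniform(min_len, longest)
    return start, length

def length_group(length_cm):
    """Assign one of the four length groups (boundary handling is assumed)."""
    if length_cm <= 0.5:
        return "<=0.5 cm"
    if length_cm <= 1.0:
        return "0.5-1.0 cm"
    if length_cm < 1.5:
        return "1.0-1.5 cm"
    return ">=1.5 cm"

rng = random.Random(2019)  # fixed seed so the draw is reproducible
start, length = draw_virtual_biopsy(2.0, rng)
print(f"start={start:.2f} cm, length={length:.2f} cm, group={length_group(length)}")
```

Seeding the generator makes each set of virtual biopsies reproducible, which helps when the same segments must be re-measured later.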
The qFibrosis score was determined for each virtual biopsy specimen. The qFibrosis score of the entire tissue section was set as the reference of each case. For each virtual biopsy, the deviation of the qFibrosis score (ΔqFibrosis) was defined as the absolute value of the difference between the score on a virtual biopsy and the score for the entire (reference) biopsy sample. A deviation within 0.50 was set as clinically acceptable error.
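As a small illustration of this definition, the following Python sketch computes ΔqFibrosis and checks it against the 0.50 threshold. The function names and the example scores are hypothetical, and treating a deviation of exactly 0.50 as acceptable is an assumption.

```python
ACCEPTABLE_ERROR = 0.50  # clinically acceptable deviation stated in the text

def delta_qfibrosis(virtual_score, reference_score):
    """Absolute deviation of a virtual biopsy from the entire (reference) biopsy."""
    return abs(virtual_score - reference_score)

def within_acceptable_error(virtual_score, reference_score, limit=ACCEPTABLE_ERROR):
    """True if the deviation does not exceed the limit (inclusive bound assumed)."""
    return delta_qfibrosis(virtual_score, reference_score) <= limit

# Hypothetical scores, for illustration only.
print(within_acceptable_error(2.1, 2.4))
print(within_acceptable_error(1.0, 1.8))
```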
Numerical variables were expressed as median with interquartile range, and categorical data as number with frequency. The intraclass correlation coefficient (ICC) was calculated to assess the degree of consistency. An equivalence test was used to compare ΔqFibrosis with the clinically acceptable error (deviation of 0.50) in each group; equivalence between the short biopsy samples and the ≥ 1.5-cm-long specimens was confirmed if ΔqFibrosis fell within 0.50 of the reference score. Continuous variables were compared using one-way ANOVA or the Kruskal–Wallis test. Correlations were evaluated by Spearman’s rank correlation. The equivalence test was performed with SAS 9.4; the other analyses were performed with SPSS 22.0. Two-sided p values < 0.05 were considered statistically significant.
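To make the consistency measure concrete, the sketch below computes one common form of the ICC in plain Python. The text does not state which ICC model was used in SPSS, so the choice of the two-way, single-measure, consistency form, often written ICC(3,1), is an assumption, and the paired values are invented for illustration.

```python
def icc_consistency(rows):
    """ICC(3,1): two-way mixed model, single measure, consistency.

    `rows` holds one list per biopsy, e.g. [original_value, virtual_value].
    The ICC form is an assumption; the paper does not specify the model.
    """
    n = len(rows)        # number of biopsies (subjects)
    k = len(rows[0])     # number of measurements per biopsy
    grand = sum(sum(r) for r in rows) / (n * k)
    row_means = [sum(r) / k for r in rows]
    col_means = [sum(r[j] for r in rows) / n for j in range(k)]
    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)

# Hypothetical feature values: (original biopsy, virtual biopsy).
pairs = [[10.2, 10.0], [14.8, 15.1], [21.5, 20.9], [30.1, 29.6]]
print(round(icc_consistency(pairs), 3))
```

Because the consistency form ignores a constant offset between the two measurements, a virtual biopsy that is uniformly shifted relative to the original still yields a high ICC; an absolute-agreement form would penalize that shift.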
A total of 535 biopsy samples with evaluable fibrosis stage were retrospectively extracted from the prospective cohort study. The prevalence of fibrosis stages was 35.1% for F1 (n = 188), 30.1% for F2 (n = 161), 19.1% for F3 (n = 102) and 15.7% for F4 (n = 84).
[Table: The distribution of biopsy length in biopsy specimens quantified by SHG/TPEF technology-based microscope and evaluated by METAVIR stage. Rows: proportion of METAVIR fibrosis stage, n (%). Columns by biopsy length group: ≤ 0.5 cm, 0.5–1.0 cm, 1.0–1.5 cm, ≥ 1.5 cm, Total; each n (%).]
Influence of biopsy length on the evaluation for fibrosis by qFibrosis for “short” samples in clinical practice: qFibrosis was good, especially in biopsy samples longer than 0.5 cm
The influence of biopsy length on qFibrosis score by comparing “virtual” biopsy samples with original biopsy
Then, to clarify how biopsy length affects the diagnostic value of the qFibrosis score in differentiating fibrosis stages, a total of 444 virtual biopsy samples were obtained from 161 qualified biopsies (≥ 1.5-cm-long specimens; four biopsies were excluded, two with a large portal tract and two in which pathological processing reduced the effective biopsy length in the SHG image to < 1.5 cm). The process of virtual liver biopsy construction is shown in Fig. 1, Part II.
The qFibrosis score was less sensitive to biopsy length than CPA. For “randomly defined” virtual biopsy specimens, the relative deviation of qFibrosis from the original sample decreased gradually from ≤ 0.5-cm-long specimens to 1.0–1.5-cm-long specimens and was smaller than that of CPA in each group (all p values < 0.001, Supplementary Fig. 1).
qFibrosis could distinguish the METAVIR stages well both in 0.5–1.0-cm and 1.0–1.5-cm specimens (AUC: 0.78–0.88), as shown by ROC curve analysis (Supplementary Fig. 2).
Therefore, qFibrosis is relatively accurate in the evaluation of liver fibrosis in short samples if the length is greater than 0.5 cm.
The reason that qFibrosis overcomes the drawbacks of small sample size: additional parameters are considered compared to routine collagen proportionate area calculation
Considering the spatial distribution of the respective collagen patterns, the portal area is the most susceptible to changes in biopsy length, followed by the fibrillar area and the septal area. The AUROCs of the consistent parameters in the septal and fibrillar areas are shown in Supplementary Table 2. Compared with the portal area, quantitative characteristics in the septal and fibrillar areas could be meaningful in diagnosing the severity of liver fibrosis. The detailed change of CPA, the representative feature, in each area is shown in Supplementary Fig. 4. The percentage change of CPA in the total, septal and fibrillar areas did not differ significantly between the 0.5–1.0-cm and 1.0–1.5-cm groups, whereas that in the portal area did.
Among the 21 most consistent parameters, 9 are used to establish the qFibrosis score, accounting for 60% of the 15 parameters of the qFibrosis score, perhaps illustrating why qFibrosis can minimize errors induced by small sample size.
In this study, we demonstrated that biopsy samples longer than 0.5 cm can deliver a reliable quantitative assessment of fibrosis using the fully quantitative qFibrosis method. The histological features in the septal and fibrillar areas were insensitive to sample size, which helps explain why qFibrosis has an advantage in addressing sampling error.
The impact of biopsy length on the quantitative fibrosis score (qFibrosis) in real-world practice in CHB patients was analyzed first. qFibrosis performed very well in reflecting the severity of fibrosis in “short” biopsy samples longer than 0.5 cm. It is not clear whether this good performance is caused by underestimation of the METAVIR fibrosis stage or by improved detection of fibrosis with the qFibrosis technology. To clarify this issue, we constructed “virtual biopsy samples” from actual biopsies ≥ 1.5 cm in length. These “virtual” biopsies simulated the so-called “short” biopsy specimens. The deviation and consistency of each biopsy were individually calculated by comparison with the biopsies longer than 1.5 cm. According to our analysis, the absolute value of ΔqFibrosis remained low in liver biopsies at least 0.5 cm in length. Similar to our results, another recent study has shown that qFibrosis was less sensitive to sample size than CPA.
The reason why this SHG/TPEF-based image analysis could reliably reflect the severity of liver fibrosis in small biopsies is the consistency of the established quantitative parameters between long and short biopsy samples. In our study, the stable features included the number of cross-linked collagen fibers, the total and aggregated collagen proportionate area, and the length, width, area and perimeter of collagen strings in total. qFibrosis, as an SHG/TPEF-based image technology, incorporates multiple spatial architectural collagen features, which may explain why it can minimize errors induced by small sample size.
Interestingly, when considering the spatial characteristics of collagen pattern, features in septal and fibrillar areas were less sensitive to a reduced biopsy length, compared to portal features. Quantitative characters in the septal and fibrillar areas are meaningful in diagnosing the severity of liver fibrosis compared with the features in portal area. This sensitivity in small biopsies may be explained by reduced portal tract number while histological features in septal and fibrillar areas remain relatively constant because of their distributed location.
These favorable results with qFibrosis are in contrast to those from studies with traditional methods demonstrating substantial sampling error [4, 6, 7, 17, 18]. For these methods, a relatively generous 2-cm-long sample is now recommended [8, 19], leading to increased peri-operative risk [10, 20]. Traditional methods are also affected by lack of expertise and poor interobserver agreement, while qFibrosis is totally independent of these effects.
Traditional staging methods have been compared to standard digital image analysis, focusing on the issue of sampling error. In a study of cirrhotic transplant tissue, Hall et al. concluded that, to achieve a 75% probability that the CPA of a virtual biopsy will be within 5% of the reference CPA, a length of 15–20 mm was required. This requirement does not show an advantage of CPA over traditional histological analysis with respect to sampling error. Our results suggest that the qFibrosis methodology is more resistant to sampling error than CPA measurements by standard digital image analysis methods, and that the benefits extend across the fibrosis range of F1–F4.
The clinical implications of these results might be profound, as shown in Supplementary Fig. 5. In the past, the guidelines have recommended liver biopsy length be at least 1.5 cm, to avoid sampling error, while our data with qFibrosis suggest that 0.5 cm may be sufficient for this purpose.
The strengths of our study are as follows: (1) The quantitative method we used is more sensitive and specific than traditional digital image analysis. It has been demonstrated to have better diagnostic value for discriminating adjacent stages, even in early fibrosis, compared with traditional digital image analysis. (2) The samples used in our study were liver biopsies, as opposed to samples obtained from liver transplantation with a limited range of stages. The simulation of “short” biopsy samples was therefore close to actual practice. (3) Our cohort, including 165 biopsies, was larger than those of other studies focused on the same issue.
Our study still has some limitations. First, liver biopsy samples of cirrhosis are more prone to fragmentation, which is a major obstacle in the diagnosis of cirrhosis. However, the proportion of cirrhotic specimens was relatively small in our study, so the diagnostic value of qFibrosis in short cirrhotic samples needs to be further verified in a larger number of samples. Second, all the liver biopsies in our study were obtained from CHB patients. The histological pattern of fibrosis from other etiologies may differ from that of CHB, so the application of qFibrosis to “short” biopsies in diseases other than CHB needs to be further explored.
Despite these limitations, our study demonstrated that SHG/TPEF-based technology, coupled with additional parameters discovered by machine learning (qFibrosis scores), showed good performance in the evaluation of CHB-related liver fibrosis in short samples down to a lower limit of 0.5 cm in length. Future studies seeking further improvements in the diagnostic value of qFibrosis using small “unqualified” liver biopsies in the assessment of liver fibrosis would be of great interest.
This study was funded by National Science and Technology Major Project (2018ZX10302-204), Key Project from Beijing Municipal Science and Technology Commission (D161100002716003), and National Natural Science Foundation of China (81670539).
Study design: HY, XJO and JDJ. Data collection: YMS, JLZ, YWS, XNW, BQW, SYC, XJO. Liver biopsy assessment: TLW, HL. Statistical analysis: BQW, SSW. Manuscript writing: BQW. Critical revision of the manuscript: HY.
Compliance with ethical standards
Conflict of interest
Bingqiong Wang, Yameng Sun, Jialing Zhou, Xiaoning Wu, Shuyan Chen, Yiwen Shi, Shanshan Wu, Hui Liu, Yayun Ren, Xiaojuan Ou, Jidong Jia, and Hong You declare that they have no conflict of interest.
15. Bedossa P, Poynard T. An algorithm for the grading of activity in chronic hepatitis C. The METAVIR Cooperative Study Group. Hepatology. 1996;24:289–293
19. Guido M, Rugge M. Liver biopsy sampling in chronic viral hepatitis. Semin Liver Dis. 2004;24:89–97
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.