The analysis of risk factors for diabetic nephropathy progression and the construction of a prognostic database for chronic kidney diseases
Diabetic nephropathy (DN) affects about 40% of diabetes mellitus (DM) patients and is the leading cause of chronic kidney disease (CKD) and end-stage renal disease (ESRD) worldwide, especially in high- and middle-income countries. Most DN has been present for years before it is diagnosed. Currently, treatment of DN mainly aims to prevent or delay disease progression. Although many important molecules have been discovered in hypothesis-driven research over the past two decades, advances in DN management and new drug development have been very limited. Moreover, current animal and cell models cannot replicate all the features of human DN, while developments in epigenetics further demonstrate the complexity of the mechanism of DN progression. To capture the key pathways and molecules that actually affect DN progression from the many published studies, we collected and analyzed human DN prognostic markers (independent risk factors for DN progression).
One hundred and fifty-one DN prognostic markers were collected manually by reading 2365 papers published between 01/01/2002 and 12/15/2018. One hundred and fifteen prognostic markers of four other common CKDs were also collected. GO and KEGG enrichment analyses were performed using g:Profiler, and a relationship network was built based on the KEGG database. Tissue origin distribution was derived mainly from The Human Protein Atlas (HPA), and a database of these prognostic markers was constructed using PHP Version 5.5.15 and HTML5.
Several pathways were significantly enriched, corresponding to different end point events. The TNF signaling pathway appears to play a role throughout DN progression, and the adipocytokine signaling pathway is uniquely enriched in ESRD. Molecules such as TNF, IL6, and SOD2 are very important for DN progression, and among them AGER appears to play a pivotal role in the mechanism. A database, dbPKD, was constructed containing all the collected prognostic markers.
This study developed a database of prognostic markers for five common CKDs, offers bioinformatics analyses of DN prognostic markers, and provides useful insights for understanding the fundamental mechanism of human DN progression and for identifying new therapeutic targets.
Keywords: Diabetic nephropathy, Progression, Risk factor, Prognostic marker, Bioinformatics analysis, Database
Abbreviations
CKD: chronic kidney disease
ESRD: end-stage renal disease
HPA: The Human Protein Atlas
DKD: diabetic kidney disease
T1DM: type 1 DM
T2DM: type 2 DM
GBM: glomerular basement membrane
GFR: glomerular filtration rate
IMN: idiopathic membranous nephropathy
pFSGS: primary focal segmental glomerulosclerosis
FDR: false discovery rate
nuclear factor of activated T cells 1
GWAS: genome-wide association study
DN, also known as diabetic kidney disease (DKD), is one of the most important diabetic microvascular complications, affecting 30–45% of patients with either type 1 DM (T1DM) or type 2 DM (T2DM), with a peak incidence at 10–20 years of DM duration [1, 2, 3]. Pathologically, DN is often characterized by glomerular basement membrane (GBM) thickening, glomerular mesangial matrix expansion, and, in its advanced stages, formation of glomerular nodular sclerosis; clinically, it is usually defined by the occurrence of proteinuria or a decline in renal function, e.g. reduced glomerular filtration rate (GFR) [1, 5]. DN patients exhibiting modest or no albuminuria may still progress to ESRD [6, 7]. DN is the leading cause of CKD and ESRD in high-income countries and likely worldwide [8, 9, 10, 11], and is also a strong single predictor of mortality in patients with DM. Even worse, the absolute number of DN patients continues to increase and the incidence of ESRD from DN keeps rising, consistent with the global DM pandemic [9, 14].
Currently, tight glucose control and strict blood pressure control (especially with medications that inhibit the renin-angiotensin system) remain the mainstay of DN management. Although some progress has been made in reducing diabetes-related mortality and delaying the development of kidney disease from DM, the percentage of DN patients who progress to ESRD has not substantially declined. Disappointingly, the development of new drugs for DN has reached an impasse, with no success in Phase 3 clinical trials. One reason is the lack of an accurate understanding of the pathophysiological mechanisms underlying human DN development and progression. Targeting single molecules and/or pathways identified as important in hypothesis-driven research has not yielded significant advances in DN treatment in the past two decades. On one hand, the mechanisms underlying DN development and progression are complicated, involving many interacting molecules and a number of crosstalk pathways. On the other hand, current animal and cell culture models mainly replicate the early stage of human DN and/or recapitulate only certain of its features, failing to reproduce the whole process of DN development and progression. In addition, patients who strictly comply with treatment recommendations can still develop overt DN, whereas patients with similar or poor compliance may not. Likewise, not all DM patients with microalbuminuria progress to macroalbuminuria or ESRD (in some patients the microalbuminuria even regresses and disappears). Therefore, more broad-based approaches, including systems biology and multiple omics, are now being applied to understand the pathological mechanisms of DN [17, 18, 19].
In view of this situation, we collected all DN prognostic markers (risk factors for DN progression) reported in both routine and high-throughput research on human samples over the past two decades and performed additional bioinformatics analyses. We hope to offer some insights into the mechanism of DN progression that might help DN research and the discovery of new therapeutic targets for DN.
We constructed a database, dbPKD, for prognostic markers of DN as well as other CKDs, including IgA nephropathy (IgAN), idiopathic membranous nephropathy (IMN), primary focal segmental glomerulosclerosis (pFSGS) and lupus nephritis (LN). To our knowledge, no database dedicated to risk factors of kidney diseases existed previously. dbPKD may provide a resource for searching reported prognostic factors for common CKDs.
All DN prognostic markers (risk factors for DN progression) were collected by screening the related literature. We searched the PubMed database using 32 keywords, e.g. “DN”, “DKD”, “diabetic kidney disease”, “diabetic nephropathy”, “ESRD”, “marker”, etc. (Additional file 1: Table S1). In total, 2365 papers published between 01/01/2002 and 12/15/2018 were collected, covering both routine and high-throughput research. Reviews and non-English literature were excluded first. Initial screening was based on title and abstract, after which 403 papers were retained for further filtering and their full contents were checked in detail. Filtering followed these rules: (1) the research subjects must be human, that is, samples used for the prognosis study must be derived from humans; (2) the disease studied must be DN or a synonym, such as DKD; (3) markers must be potentially prognostic, meaning that they should be potential risk or protective factors for GFR decline, doubling of serum creatinine, CKD progression, ESRD, or death closely related to kidney damage. Markers used to predict significant albuminuria/proteinuria progression in DN patients were also included; (4) markers were collected primarily when they had been verified as independent risk factors for DN progression in multivariate analysis, although several markers supported only by univariate analysis in current prognostic studies were also collected; (5) markers at multiple omic levels were collected, including genes (involving mRNA, SNP, CNV, etc.), proteins, microRNAs, and mixed clinical indicators (referring to all prognostic markers that are not genes, proteins, or microRNAs).
Besides DN prognostic markers, we also collected prognostic markers of four other CKDs (IgAN, IMN, pFSGS and LN). The collection guidelines were essentially the same as those for DN, with the following differences: (1) the keywords are shown in Additional file 1: Table S2; (2) papers published between 01/01/2002 and 01/01/2018 were filtered for IgAN, IMN, pFSGS and LN; (3) markers must be prognostic for GFR decline, doubling of serum creatinine, CKD progression, ESRD, or death closely related to kidney damage, but not necessarily for albuminuria/proteinuria progression. The workflow for data processing is shown in Additional file 1: Figure S1.
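The five filtering rules for DN markers can be read as a predicate over candidate records. The following is a minimal sketch of that reading, not the authors' procedure (the screening described here was manual); the `Candidate` record, its field names, and the vocabulary strings are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Vocabulary taken from the inclusion rules; the exact strings are illustrative.
DN_NAMES = {"diabetic nephropathy", "diabetic kidney disease", "dn", "dkd"}
ENDPOINTS = {
    "gfr decline",
    "doubling of serum creatinine",
    "ckd progression",
    "esrd",
    "kidney-related death",
    "albuminuria/proteinuria progression",
}
MARKER_TYPES = {"gene", "protein", "microrna", "mixed clinical indicator"}


@dataclass
class Candidate:
    """Hypothetical record for one marker reported in one paper."""
    human_samples: bool
    disease: str
    endpoint: str
    analysis: str      # "multivariate" or "univariate"
    marker_type: str


def screen(c: Candidate) -> Optional[str]:
    """Return an evidence label if the candidate passes rules 1-5, else None."""
    if not c.human_samples:                          # rule 1: human samples only
        return None
    if c.disease.lower() not in DN_NAMES:            # rule 2: DN or a synonym
        return None
    if c.endpoint.lower() not in ENDPOINTS:          # rule 3: prognostic end point
        return None
    if c.marker_type.lower() not in MARKER_TYPES:    # rule 5: supported marker level
        return None
    # Rule 4: multivariate evidence marks an independent risk factor;
    # univariate-only markers are kept but flagged.
    if c.analysis == "multivariate":
        return "independent"
    if c.analysis == "univariate":
        return "univariate-only"
    return None
```

Returning a label rather than a plain boolean keeps the distinction in rule 4 visible downstream, so markers with only univariate support can be separated from independent risk factors later.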
Functional enrichment analysis of DN prognostic molecules
We performed GO and KEGG enrichment analysis for DN prognostic molecules using a test based on the hypergeometric distribution, with a false discovery rate (FDR) < 0.05 considered significant. All of this work was done using the g:Profiler platform.
To analyze the connectivity and co-regulation among the DN prognostic molecules, we constructed a network of the main enriched pathways in DN progression based on KEGG using Edraw Version 22.214.171.124. We also manually constructed a signal-transduction diagram by extracting the regulatory relationships from the enriched signal transduction pathways, to illustrate more clearly the speculated role of prognostic molecules in DN progression.
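To make the statistical test concrete, the sketch below computes a one-sided hypergeometric enrichment p-value for a single term and applies a Benjamini-Hochberg adjustment across terms. It is an illustration only: g:Profiler performs these steps internally with its own options, the use of Benjamini-Hochberg as the FDR procedure is an assumption, and the function names and example counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(k: int, M: int, n: int, N: int) -> float:
    """P(X >= k) for X ~ Hypergeometric.

    M: genes in the background, n: background genes annotated to the term,
    N: genes in the query list, k: query genes annotated to the term.
    """
    upper = min(n, N)
    total = comb(M, N)
    tail = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return tail / total


def benjamini_hochberg(pvalues: list[float]) -> list[float]:
    """Return BH-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical counts for three terms: (k, M, n, N).
terms = [(5, 20000, 100, 80), (2, 20000, 300, 80), (1, 20000, 50, 80)]
raw_p = [hypergeom_enrichment_p(*t) for t in terms]
significant = [q < 0.05 for q in benjamini_hochberg(raw_p)]
```

The p-value is the upper tail of the distribution because enrichment asks whether the overlap is at least as large as the one observed.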
Tissue origin distribution
To establish the expression and location of prognostic molecules in normal kidney tissues, we searched for all prognostic genes and proteins in the HPA. First, we downloaded the mRNA and protein data for all genes in different human systems/tissues from the HPA, and then screened out the data related to kidney tissue (e.g. glomeruli, tubules, etc.). Finally, we obtained the expression levels and location data of prognostic genes and proteins in kidney tissues by molecule ID mapping.
To avoid duplication and to unify the naming of markers across different studies, genes were mapped to Entrez Gene IDs, and proteins were mapped to UniProt IDs. Mixed clinical indicators were given unified names where widely used names exist. Sample sources were categorized into renal tissue, urine and blood (including serum and plasma), and the prognostic effects were mainly divided into “better” and “worse”. All the collected data were incorporated into the database after collation and normalization, and each entry included five types of information: reference, research parameters, marker annotation, prognostic effect(s) and the supportive public data.
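The screening and ID-mapping steps can be sketched as a filter followed by a lookup. The example below is a minimal illustration under stated assumptions: the column names (`Gene name`, `Tissue`, `Cell type`, `Level`) and the sample rows are placeholders chosen for the sketch, not values taken from the HPA or from this study.

```python
def kidney_expression(rows, marker_symbols):
    """Collect kidney expression levels per cell type for the given markers.

    rows: iterable of dicts, one per (gene, tissue, cell type) record.
    marker_symbols: set of gene symbols to keep.
    """
    result = {}
    for row in rows:
        if row["Tissue"].lower() != "kidney":      # keep kidney records only
            continue
        symbol = row["Gene name"]
        if symbol in marker_symbols:               # ID mapping step
            result.setdefault(symbol, {})[row["Cell type"]] = row["Level"]
    return result


# Placeholder rows; gene names and levels are invented for illustration.
rows = [
    {"Gene name": "GENE_A", "Tissue": "kidney", "Cell type": "cells in glomeruli", "Level": "Low"},
    {"Gene name": "GENE_A", "Tissue": "kidney", "Cell type": "cells in tubules", "Level": "High"},
    {"Gene name": "GENE_A", "Tissue": "liver", "Cell type": "hepatocytes", "Level": "Medium"},
    {"Gene name": "GENE_B", "Tissue": "kidney", "Cell type": "cells in tubules", "Level": "Medium"},
]
by_marker = kidney_expression(rows, {"GENE_A"})
```

In practice the rows would come from a downloaded table read with `csv.DictReader`, and the lookup key would be a stable identifier rather than a symbol.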
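The normalization described above can be illustrated with a small sketch that collapses alias names onto one identifier and folds serum and plasma into blood. The field names, the `canonical_id` lookup, and the sample entries are hypothetical; a real pipeline would map to Entrez Gene or UniProt IDs rather than to the canonical symbol used here as a stand-in.

```python
from collections import defaultdict

# Sample-source categories from the text: serum and plasma count as blood.
SOURCE_CATEGORY = {
    "renal tissue": "renal tissue",
    "urine": "urine",
    "blood": "blood",
    "serum": "blood",
    "plasma": "blood",
}


def normalize(raw, canonical_id):
    """Return one entry with a unified marker ID and source category."""
    name = raw["marker"].strip().upper()
    return {
        "marker_id": canonical_id.get(name, name),
        "source": SOURCE_CATEGORY[raw["source"].strip().lower()],
        "effect": raw["effect"].strip().lower(),   # "better" or "worse"
        "reference": raw["reference"],
    }


def group_by_marker(raw_entries, canonical_id):
    """Group normalized entries so each marker appears once as a key."""
    grouped = defaultdict(list)
    for raw in raw_entries:
        entry = normalize(raw, canonical_id)
        grouped[entry["marker_id"]].append(entry)
    return dict(grouped)


# RAGE is a common alias of AGER; the references below are placeholders.
canonical_id = {"RAGE": "AGER", "AGER": "AGER"}
raw_entries = [
    {"marker": "RAGE", "source": "Serum", "effect": "Worse", "reference": "paper A"},
    {"marker": "AGER", "source": "plasma", "effect": "worse", "reference": "paper B"},
]
grouped = group_by_marker(raw_entries, canonical_id)
```

Grouping on the unified ID is what prevents the same molecule from being counted twice when two studies report it under different names.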
The web interface mainly provides four types of application service: Browse, Search, Analysis and Download.
Genes, proteins and microRNAs verified in T1DN and T2DN, respectively
AGER, ATP5MC3, BDKRB2, CASP3, CAT, CCR5, CNDP1, COX5A, CTGF, CYP11B2, ENPP1, FLT4, GPX1, HPSE, LIPC, NPHS1, NPPA, PARP1, SLC2A1, SOD1, SOD2, TGFBR2, TRPC6, UQCRC1, CDH13, CYBA
ADIPOQ, AKR1B1, APOE, CCL2, CETP, GSTT1, IL10, ITGA2, LTA, NOS3, PON1, PON2, PRKCB, SLC12A3, TKT, FN3K, EP300, HP
miR-126, miR-196a, miR-9
CRP, CTGF, MBL2, TNFRSF11B, UMOD
ADIPOQ, CST3, TNNT2, TNFRSF1A, FABP1, HBB
CLU, COL18A1, CP, FGF21, HP, ICAM1, IL6, TNFRSF1B, CD59, CFHR2, C4A, MCAM, LGALS3, AVP, NPPB, RBP4, SAA1, TNF, VCAM1, VWF, C8A, AOC3, FGF23, SERPINF1, VEGFA, ALB, CCL2
Molecules involved in DN progression and the functional analyses
Although many biological processes (BPs) are involved in DN progression, we focused only on the top 15 BPs significantly enriched for all the DN prognostic genes and proteins (Additional file 1: Figure S5). Notably, the risk molecules in the MacroAlb/PP group were mainly enriched in 5 of the top 15 BPs: response to stress, inflammatory response, response to oxygen-containing compound, response to lipid, and regulation of cell death. This might indicate that inflammation, oxidative stress, hemodynamic abnormality, and lipid metabolism disorder have been damaging kidney function since the very early stage of DN, when albuminuria first occurs.
Risk molecules for different DN stages based on different end point events
According to the three clusters of DN prognostic molecules, based on different end point events (Fig. 2b), we could observe different risk molecules for specific DN stages. There were very few overlapping risk molecules between the ESRD group and the overt DN group, which indicated that different key molecules might promote DN progression at different DN stages. For example, CTGF was verified as a risk gene for albuminuria progression and a risk protein for progression to ESRD. Studies using animal/cell models show that CTGF can be induced by high glucose through the mediation of TGF-β, and that its upregulation can promote mesangial matrix accumulation, progressive glomerulosclerosis and tubulointerstitial fibrosis [37, 38]. Its overexpression in podocytes can damage them and exacerbate proteinuria and mesangial expansion. Considering all the above observations, it is speculated that CTGF exerts a very weak effect, or none, on DN progression in the early albuminuria stage, although it was a risk gene for albuminuria progression, whereas in the middle and late DN stages CTGF acts as a key molecule promoting the development of ESRD and plays a very important role in DN progression.
Role of DN prognostic markers in the mechanism of DN progression
Protein expression and location of DN prognostic genes and proteins
dbPKD: database for prognostic markers of kidney diseases 
In total, 69 genes, 72 proteins, 4 microRNAs, and 92 mixed clinical indicators were extracted from 243 qualified papers, without distinguishing specimen sources. Of these, 46 genes, 42 proteins, 3 microRNAs, and 60 mixed clinical indicators were extracted from 115 qualified papers for DN progression. In addition, 30 genes, 43 proteins, 1 microRNA, and 41 mixed clinical indicators were extracted from 128 qualified papers for IgAN, IMN, pFSGS and LN.
Theoretically, proper genetic intervention in DM patients might prevent DN from developing. However, resolving the genetics of DN remains complex, and little progress has been made. In the past decades, only a few molecules were identified as DN genetic factors through genome-wide association studies (GWAS), such as ACE, AKR1B1, APOE, PPARG, etc. Some of them were also verified as DN prognostic genes, which could be called “high-risk genes for DN development and progression” (Additional file 1: Figure S8). At present, apart from strict management of diabetic patients, there seem to be no preventive measures against DN development. The main therapeutic strategy for DN patients is to inhibit or retard disease progression. The prognostic markers collected here were all verified as risk factors for DN progression in DN prognosis studies. They were all directly related to the end point events of DN patients, regardless of the complex interactions among molecules and epigenetics. Hence, they might reflect the genuine “key molecules” in DN progression and help in finding new therapeutic targets. Analyzing these prognostic markers might offer some insights into the mechanism of DN progression.
MicroRNAs are small non-coding RNA molecules that usually function in RNA silencing and post-transcriptional regulation by acting on their target mRNAs. Here we collected only three microRNAs that were verified as risk factors for DN progression. Interestingly, their target molecules included further DN prognostic genes and proteins (Additional file 1: Figure S9), suggesting that microRNAs play an important role in DN progression. In other related work, we confirmed the clinical application value of miR-196a for several types of kidney diseases [57, 58]. The details of regulation between microRNAs and their targets, as well as the possible associations among these three microRNAs, need further research, which might help to understand the mechanism of DN progression. In addition, some clinical indicators (including metabolites, biochemical indicators, pathological parameters, etc.) could also be used as DN prognostic markers. Serum creatinine has been widely reported and clinically used for decades as an important parameter for assessing and monitoring renal function in kidney diseases [59, 60]. Vitamin D has been discussed as a treatment option in DN for many years [61, 62]. Both examples suggest that DN prognostic markers have potentially important applications in the clinical diagnosis and treatment of DN.
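The DN and other-CKD counts do not sum to the overall totals, which is what one would expect if some markers were reported for both DN and another CKD. The short check below works through that arithmetic. It assumes that the overall totals count distinct markers, which the text does not state explicitly, so the derived overlaps are one reading of the figures rather than reported results.

```python
total = {"genes": 69, "proteins": 72, "microRNAs": 4, "mixed clinical indicators": 92}
dn = {"genes": 46, "proteins": 42, "microRNAs": 3, "mixed clinical indicators": 60}
other = {"genes": 30, "proteins": 43, "microRNAs": 1, "mixed clinical indicators": 41}

# Inclusion-exclusion: |DN ∩ other| = |DN| + |other| - |DN ∪ other|
shared = {k: dn[k] + other[k] - total[k] for k in total}

print(sum(dn.values()))      # 151, the DN marker count given in the abstract
print(sum(other.values()))   # 115, the count for the four other CKDs
print(shared)                # {'genes': 7, 'proteins': 13, 'microRNAs': 0, 'mixed clinical indicators': 9}
print(115 + 128)             # 243 qualified papers in total
```

Under that assumption, 29 markers would be shared between DN and at least one of the other four CKDs.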
Although we attempted to collect all the DN prognostic markers and analyze them as accurately as possible, there were still some limitations in our study. First, due to the limited prognosis studies, the number of DN prognostic molecules collected was small. Second, because of the fuzzy definitions of end point events, it was difficult to judge the accurate DN stages for which some prognostic markers were used. This also hindered subsequent further analysis. Lastly, specimen sources of risk factors for DN progression were variable, including urine, blood and kidney tissue, which posed difficulties for further mechanistic studies of DN progression.
The work on prognostic markers will be continued and the data is scheduled to be updated every 2 years. In the meantime, we will keep trying to improve the efficiency of data extraction by adopting some machine learning methods and endeavor to optimize the workflows. In addition, other types of related data, such as data from single cell sequencing studies, may also be collected in the subsequent work for further analysis. We hope that more prognostic markers of kidney diseases and valuable insights could be provided to clinicians and researchers.
In conclusion, we collected human DN prognostic markers that were verified over the past two decades as independent risk factors for DN progression, mostly through multivariate analysis, and constructed a database. To our knowledge, this is the first systematic summary of DN prognostic markers. These prognostic molecules bypass the complexity of epigenetics and avoid the shortcoming that animal/cell models cannot replicate all the features of human DN; they were directly related to human DN prognosis and may therefore represent genuine key molecules in human DN progression. We also demonstrated the connections and regulation among these molecules and highlighted some related GO terms and KEGG pathways through bioinformatics analysis. In-depth study of these molecules and related pathways will help to further understand the mechanism of human DN progression, discover new therapeutic targets and explore new DN drugs. In addition, some prognostic markers (mixed clinical indicators) might contribute to improving the management of DN patients. In the future, we will expand the data content and improve the functional modules of dbPKD, and strive to provide more valuable insights for the research and treatment of related kidney diseases by adopting more and better analytical methods.
We would like to acknowledge Dr. Michael Liebman for his critical reading and editing.
GW: study concept and design, acquisition of data, analysis and interpretation of data, design and construction of the database, drafting of the manuscript, critical revision and final approval of the manuscript. JO: design and construction of the database, drafting of the manuscript, critical revision, and final approval of the manuscript. SL: acquisition of data and final approval of the manuscript. HW: acquisition of data and final approval of the manuscript. BL: critical revision, study supervision, and final approval of the manuscript. ZL: study concept, critical revision, study supervision, and final approval of the manuscript. LX: study design, drafting of the manuscript, study supervision, critical revision and final approval of the manuscript. All authors read and approved the final manuscript.
This work was supported by the National Key Research and Development Program of China (2016YFC0904101, 2016YFC0904100), the National Natural Science Foundation of China (No. 31870829), and the Shanghai Municipal Health Commission, Collaborative Innovation Cluster Project (No. 2019CXJQ02).
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
- 6. Thomas MC, Macisaac RJ, Jerums G, Weekes A, Moran J, Shaw JE, Atkins RC. Nonalbuminuric renal impairment in type 2 diabetic patients and in the general population (national evaluation of the frequency of renal impairment cO-existing with NIDDM [NEFRON] 11). Diabetes Care. 2009;32:1497–502.
- 20. DataBase for Prognostic markers of Kidney Diseases (dbPKD). http://126.96.36.199/dn/index.php. Accessed 26 Mar 2019.
- 22. KEGG: Kyoto Encyclopedia of Genes and Genomes. https://www.genome.jp/kegg/. Accessed 20 Jan 2019.
- 23. Edraw ED. https://www.edrawsoft.cn/. Accessed 23 Jan 2019.
- 30. Panduru NM, Saraheimo M, Forsblom C, Thorn LM, Gordin D, Waden J, Tolonen N, Bierhaus A, Humpert PM, Groop PH. Urinary adiponectin is an independent predictor of progression to end-stage renal disease in patients with type 1 diabetes and diabetic nephropathy. Diabetes Care. 2015;38:883–90.
- 31. von Scholten BJ, Reinhard H, Hansen TW, Oellgaard J, Parving HH, Jacobsen PK, Rossing P. Urinary biomarkers are associated with incident cardiovascular disease, all-cause mortality and deterioration of kidney function in type 2 diabetic patients with microalbuminuria. Diabetologia. 2016;59:1549–57.
- 36. Nguyen TQ, Tarnow L, Jorsal A, Oliver N, Roestenberg P, Ito Y, Parving HH, Rossing P, van Nieuwenhoven FA, Goldschmeding R. Plasma connective tissue growth factor is an independent predictor of end-stage renal disease and mortality in type 1 diabetic nephropathy. Diabetes Care. 2008;31:1177–82.
- 45. Maeda S, Matsui T, Takeuchi M, Yoshida Y, Yamakawa R, Fukami K, Yamagishi S. Pigment epithelium-derived factor (PEDF) inhibits proximal tubular cell injury in early diabetic nephropathy by suppressing advanced glycation end products (AGEs)-receptor (RAGE) axis. Pharmacol Res. 2011;63:241–8.
- 56. miRBase: the microRNA database. http://www.mirbase.org/. Accessed 27 Jan 2019.
- 62. Kim MJ, Frankel AH, Donaldson M, Darch SJ, Pusey CD, Hill PD, Mayr M, Tam FW. Oral cholecalciferol decreases albuminuria and urinary TGF-beta1 in patients with type 2 diabetic nephropathy on established renin-angiotensin-aldosterone system inhibition. Kidney Int. 2011;80:851–60.
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.