Background

The kidney is one of the main excretory and homeostatic organs of the body. The basic structural and functional unit of the kidney is the nephron. The development of a nephron involves a series of reciprocal tissue inductions between the ureteric bud and the metanephric mesenchyme. Induced metanephric mesenchyme cells condense to form pretubular cell aggregates and go through a mesenchymal-to-epithelial transition and a series of morphological changes, including the formation of nephric vesicles, comma- and S-shaped bodies and, eventually, mature nephrons. A mature nephron is composed of the vascular loop of the glomerulus, Bowman's capsule, the proximal convoluted tubule, the loop of Henle and the distal convoluted tubule that connects to the drainage system. Genes expressed in developing and mature nephrons may be important for their development, structural integrity, and physiological function. In humans, mutations in such genes may cause kidney disease [1].

The mouse has been widely used as a model organism for biomedical research because it is anatomically and physiologically similar to the human. Recent progress in the human and mouse genome projects further indicates that the organization of these two mammalian genomes is highly conserved [2]. Over 95% of human genes have counterparts in the mouse genome [3, 4]. This high similarity between mouse and human underscores the use of the mouse as the model organism par excellence for studies of many aspects of human biology.

Although genes involved in kidney organogenesis or associated with kidney disease have been identified, there is still limited molecular genetic knowledge of kidney development and homeostasis. Recent progress in microarray technology provides a powerful tool to study the kidney [5–14]. Mice with mutations that alter specific aspects of kidney development and function provide unique tissue resources for microarray studies [15–18].

Lim1, also called Lhx1, is a LIM-class homeobox gene that is expressed in the ureteric bud and pretubular cell aggregate prior to epithelialization of the developing metanephric kidney [19, 20]. Most Lim1 null mutants die around E10.5, an embryonic stage prior to the development of the metanephros [21]. Rare Lim1-null mutant mice survive to birth but do not have kidneys, demonstrating an essential role for this gene in kidney organogenesis [21]. To bypass the early lethality that hinders the analysis of Lim1 function in kidney organogenesis, a Lim1 conditional null allele in mouse was generated [22]. An Rarb2-Cre transgene was generated and used for metanephric mesenchyme-specific ablation of Lim1 that resulted in newborn mice that had kidneys but no nephrons [20].

Nephrogenesis is a continuous process that begins with the induction of metanephric mesenchyme by the ureteric bud, around embryonic day 10.5 (E10.5), and persists for several weeks after birth in mice [1]. The first mature nephron is observed at E16.5 [23]. Histological analysis suggests that the development of Lim1 mutant nephrons stops at the nephric vesicle stage, which begins around E11.0. Loss of expression of Brn1, a nephric vesicle polarity marker, in E13.5 conditional mutant kidneys further indicated that Lim1 is required for correct patterning of the nephric vesicle [20]. The developing nephric vesicle represents an important developmental stage in which nephron polarity is established. Disruption of its patterning results in a failure to form nephron structures such as proximal tubules and glomerular epithelium [24].

In this study, we hypothesized that Lim1 mutant nephron-deficient kidneys could be used as a novel tissue resource for microarray experiments to identify genes expressed in the developing nephrons. Kidneys of two developmental stages were examined. Control and conditional mutant kidneys of E14.5 mouse embryos were used to identify genes involved in early nephron development including pattern formation. In contrast, E18.5 kidneys were used to isolate functional genes that are expressed in mature nephrons.

Methods

Generation of conditional mutant mice and genotyping

All procedures performed on animals were done in accordance with guidelines of the American Physiological Society and were approved by The University of Texas MD Anderson Cancer Center Institutional Animal Care and Use Committee (Richard R. Behringer, IACUC Protocol Number: 02-90-01735). Mice carrying a targeted Lim1 null allele (Lim1 lacZ, [25]), a Lim1 conditional null allele (Lim1 flox, [22]) and an Rarb2-Cre transgene in which Cre is expressed in the metanephric mesenchyme of the developing kidney [20] were used in this study. Lim1 lacZ/+ and Lim1 flox/flox mice were maintained on a C57BL/6J × 129/SvEv genetic background. Rarb2-Cre transgenic mice were initially generated on a C57BL/6J × SJL/J genetic background.

To obtain mouse embryos with metanephric mesenchyme-specific Lim1-deficient (Lim1 flox/lacZ; Rarb2-Cre tg/+) kidneys, timed matings between Lim1 +/lacZ; Rarb2-Cre tg/+ males and Lim1 flox/flox females were established. Kidney samples from two embryonic stages (E14.5 and E18.5) were isolated. Genotypes were assigned unambiguously using real-time PCR assays detecting the presence of the lacZ and Cre alleles; these assays were established in the M. D. Anderson Cancer Center DNA Analysis Core Facility. For the E14.5 time point, a total of 71 embryos were collected, of which 17 were genotyped as conditional mutants (Lim1 flox/lacZ; Rarb2-Cre tg/+). Kidneys from 23 Lim1 flox/+; Rarb2-Cre tg/+ embryos were used as "control" kidneys. For the E18.5 time point, a total of 39 embryos were harvested, of which 9 were genotyped as conditional mutants and 14 as controls.

Tissue collection and RNA preparation

Embryonic kidney tissue from each individual was placed in a separate tube with 0.5 ml TRIzol (Invitrogen, Carlsbad, CA) and stored at -75°C until the corresponding visceral tissue could be genotyped. After a genotype was unambiguously assigned to each individual, TRIzol preserved kidneys of the same genotype were pooled and total RNA was prepared as per the manufacturer's instructions. Total RNA was then processed using a QIAGEN RNeasy Midi Kit before in vitro transcription-labeling reaction per Affymetrix (Santa Clara, CA) recommendation. Once purified, RNA quality was determined by electrophoretic methods using an agarose gel or analysis using an Agilent Bioanalyzer 2100 (Palo Alto, CA) and by spectroscopy at 260 and 280 nm.

Microarray processing

Five to forty micrograms of total RNA from each pooled embryonic kidney sample was used to produce the cRNA target for the microarray. The target was created using a reverse transcription reaction to produce cDNA (SuperScript Choice System, Gibco), which was subsequently subjected to in vitro transcription with biotinylated cytidine-5'-triphosphate and uridine-5'-triphosphate using the ENZO BioArray High Yield RNA Transcript Labeling Kit to produce biotinylated cRNA. The target was then fragmented and hybridized to Mouse Genome 430 2.0 Affymetrix GeneChip Arrays (Affymetrix, Santa Clara, CA) in duplicate using an Affymetrix GeneChip Fluidics Station 400, according to the manufacturer's standard protocols. The arrays were stained with phycoerythrin-coupled avidin and scanned using a GeneArray Scanner 3000. The resultant output was analyzed using Affymetrix Microarray Suite software and examined for excessive background or evidence of RNA degradation. All microarray processing was performed in the Murine Microarray and Affymetrix Facility at the University of Texas M. D. Anderson Cancer Center.

After scanning, all probe sets were scaled to a signal intensity of 250 and relative levels of expression of each transcript (signal) were determined using Microarray Suite 5.0 software (Affymetrix). The images of all arrays were inspected for physical anomalies and for the presence of excessive background hybridization. Generally, all array results used in this study were of good quality, and no major manufacturer's defects or abnormalities were detected.

Microarray analysis

The microarray experiment for each time point and genotype was performed in technical duplicate (i.e., a single RNA preparation of pooled kidneys of one genotype was used for two separate target preparations). Data from a total of 8 independent arrays were used in this study: 2 arrays for RNA samples from E18.5 control kidneys, 2 for E18.5 Lim1 conditional mutant kidneys, 2 for E14.5 control kidneys, and 2 for E14.5 Lim1 conditional mutant kidneys. Data generated from all arrays that satisfied the preliminary analysis were exported and loaded into DNA-Chip Analyzer (dChip2004) [26, 27], where statistical and comparative analyses were performed to verify the data. The data were normalized using the default normalization method. Briefly, an iterative procedure was used to identify an invariant set of probes, which presumably consisted of non-differentially expressed genes. A piecewise-linear running median curve was then calculated and used as the normalization curve. After normalization, all arrays had similar brightness, with median intensities between 155 and 158. Percent gene present (P call%) values between 55.7% and 63.6% were observed using default detection p-value cut-offs (a1 = 0.04 and a2 = 0.06). Array outlier (%) and single outlier (%) values ranged from 0.016% to 0.080% and from 0.009% to 0.045%, respectively. Expression data obtained from all arrays used are provided in Additional file 1.

Normalized data were exported in a tab-delimited text format. Fold changes of each transcript between samples were calculated and sorted using Microsoft Excel 5.0 software. Signals obtained from control kidney samples were designated as the experiment and compared with signals obtained from Lim1 conditional mutant kidneys, which were designated as the baseline. A 2-fold change between the mean signal of the experimental duplicates and the mean signal of the baseline duplicates was used as the criterion to identify differentially expressed transcripts. To ensure the quality of the data, probe sets that showed a greater fold change between duplicates than between the experimental mean and the baseline mean were removed. To study only genes that showed consistent expression on the experimental chips, probe sets that did not show consistent present calls in the experimental duplicates were removed. To focus on genes with a significant fold change between the experiment and the baseline, only probe sets for which the product of the experimental mean and the fold change was more than 100 were retained. To produce a compact list of differentially expressed genes, the probe set list was sorted within Microsoft Excel by LocusLink number and redundant entries were removed. Our experimental design description and the data format provided in the Additional files fulfill the MIAME (minimum information about a microarray experiment) standards [28].
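The filtering steps described above can be sketched in a few lines of Python. This is only an illustrative reconstruction, not the spreadsheet procedure actually used: the record layout and the names `filter_probe_sets`, `experiment`, `baseline`, `present` and `locus` are hypothetical, the choice to keep the highest fold change among redundant entries for one locus is an assumption (the text does not say which entry was kept), and probe sets with non-positive signals are skipped simply to avoid division by zero.

```python
from statistics import mean


def filter_probe_sets(probe_sets, min_fold=2.0, min_product=100.0):
    """Filter duplicate-array probe-set records; return kept records sorted by fold change.

    Each record is a dict with hypothetical keys:
      "id", "locus", "experiment" (two signals), "baseline" (two signals),
      "present" (two booleans: present calls on the experimental arrays).
    """
    best_per_locus = {}
    for ps in probe_sets:
        experiment = list(ps["experiment"])
        baseline = list(ps["baseline"])
        if min(experiment + baseline) <= 0:
            continue  # assumption: skip non-positive signals to avoid dividing by zero

        exp_mean = mean(experiment)
        base_mean = mean(baseline)
        fold = exp_mean / base_mean

        # Criterion 1: more than a 2-fold change between the means.
        if fold <= min_fold:
            continue
        # Criterion 2: duplicates must not differ more than experiment vs. baseline.
        dup_fold = max(max(pair) / min(pair) for pair in (experiment, baseline))
        if dup_fold > fold:
            continue
        # Criterion 3: present calls in both experimental duplicates.
        if not all(ps["present"]):
            continue
        # Criterion 4: experimental mean x fold change must exceed 100.
        if exp_mean * fold <= min_product:
            continue

        # Collapse redundant probe sets per locus (assumption: keep the highest fold).
        current = best_per_locus.get(ps["locus"])
        if current is None or fold > current["fold"]:
            best_per_locus[ps["locus"]] = dict(ps, fold=fold)

    return sorted(best_per_locus.values(), key=lambda r: r["fold"], reverse=True)
```

Calling `filter_probe_sets(records)` on a list of such records returns one entry per locus, ordered from the largest to the smallest fold change.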

Expression specificity and ontological analysis

To evaluate the kidney expression specificity of the identified genes, gene symbols and locus numbers were used to retrieve relevant expression information from the Genomics Institute of the Novartis Foundation (GNF) Gene Expression Atlas 2 and UniGene databases. The GNF Atlas 2.0 contains two replicates each of 61 mouse tissues run over Affymetrix probe arrays. It was accessed using the Gene Sorter server provided by the University of California at Santa Cruz [29]. Gene Sorter provides a score, between -4 and 4, to describe the relative expression level of a gene in the different tissues presented in the GNF Atlas 2 [30]. In contrast, UniGene is a system that automatically partitions GenBank sequences, including expressed sequence tags (ESTs), into a non-redundant set of gene-oriented clusters [31]. The UniGene data were obtained from SOURCE [32], which provides a normalized expression level, based on the number of ESTs within the cluster found in cDNA libraries of different sources and expressed as a percentage, to represent the relative abundance of a transcript in different tissues or organs [33]. To understand the composition of the genes identified in our study, ontological analysis was performed using DAVID (Database for Annotation, Visualization and Integrated Discovery) and EASE (Expression Analysis Systematic Explorer) from the National Institute of Allergy and Infectious Diseases (NIAID) [34–37]. The data obtained were processed and charts were drawn using Microsoft Excel 5.0 software.

Results

Identification of stage-specific kidney genes

The first protocol compared gene expression levels in kidney samples of the two developmental stages so that stage-specific kidney genes could be identified. To enrich for E18.5 kidney-specific genes, signals obtained from E18.5 control kidney samples (E18.5C) were compared with E14.5 control kidney data (E14.5C). Sorting genes according to their expression signal fold changes (E18.5C/E14.5C) generated a list enriched for E18.5 mouse kidney genes. A list of 1,006 genes that showed a more than 2-fold change was identified (Table 1 and Additional file 2). As shown in Table 2, the enrichment for kidney specificity was dramatic. The average relative kidney expression level reported by Gene Sorter among the top 50 genes on the list is 3.6, whereas that of the gene list sorted by raw signal (E18.5C) is only -0.1. Similarly, the average normalized expression level in the kidney, reported by SOURCE [33], of the top 50 genes showed a nearly 15-fold increase (30.69% compared with 2.07%). Although a PubMed search did not find a reported nephron-specific expression pattern for any of the top 50 genes in the list sorted by raw signal (E18.5C; 0%), 22 of the top 50 genes (44%) sorted by fold change (E18.5C/E14.5C) were described to have a nephron-specific expression pattern.
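The comparison between a list ranked by raw signal and a list ranked by fold change can be illustrated with a short Python sketch. The function name `mean_top_score`, the dictionary keys and the toy values are all hypothetical and are not data from this study; the sketch only shows how the mean kidney-specificity score of the top-ranked genes depends on the ranking key.

```python
def mean_top_score(genes, rank_key, n=50):
    """Rank genes by rank_key (descending) and average the kidney score of the top n."""
    if not genes:
        raise ValueError("gene list is empty")
    top = sorted(genes, key=rank_key, reverse=True)[:n]
    return sum(g["kidney_score"] for g in top) / len(top)


# Toy records (illustrative values only): signals at two stages plus a
# database-derived kidney specificity score.
toy_genes = [
    {"symbol": "g1", "e18_5": 1000.0, "e14_5": 1000.0, "kidney_score": -0.5},
    {"symbol": "g2", "e18_5": 900.0, "e14_5": 800.0, "kidney_score": 0.0},
    {"symbol": "g3", "e18_5": 400.0, "e14_5": 50.0, "kidney_score": 3.5},
    {"symbol": "g4", "e18_5": 300.0, "e14_5": 60.0, "kidney_score": 3.0},
]

# Ranking by raw signal favors highly expressed, non-specific genes.
by_raw = mean_top_score(toy_genes, lambda g: g["e18_5"], n=2)
# Ranking by fold change favors genes that increase between stages.
by_fold = mean_top_score(toy_genes, lambda g: g["e18_5"] / g["e14_5"], n=2)
```

In this toy example the fold-change ranking selects the two genes with high kidney scores, whereas the raw-signal ranking selects the two abundant but non-specific genes.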

Table 1 Numbers of genes that showed more than a 2-fold increase in different expression level comparisons.
Table 2 Expression profile comparison between E18.5 control and Lim1 conditional mutant kidney generated a gene list enriched for nephron-specificity.

The same approach was applied to the E14.5 experiment and identified 796 genes that showed a more than 2-fold change (Table 1 and Additional file 3); however, the enrichment for kidney- or nephron-specificity was not significant. As shown in Table 2, the average relative kidney expression levels of the top 50 genes were not much different between the lists sorted by raw signal and by fold change (-0.1 and 0.0). A slight increase in the average normalized expression in the kidney was observed (1.48% and 2.51%). Only 2 genes (4%) among the top 50 in the list sorted by developmental stage-specific fold change (E14.5C/E18.5C) were reported to have a nephron-specific expression pattern. A closer look at the gene list revealed that it is enriched for genes that are generally expressed in undifferentiated, embryonic tissues but do not necessarily show kidney-specificity (data not shown).

Identification of nephron-specific genes of different developmental stages

A second protocol took advantage of the nephron-deficient Lim1 conditional mutant kidneys to identify nephron-specific genes of different developmental stages [20]. To generate an E18.5 nephron-specific gene list, we compared gene expression data from the E18.5 control kidney (E18.5C) and the Lim1 conditional mutant kidney (E18.5M). Fold changes were calculated and used to sort genes. A total of 465 genes showed a more than 2-fold increase in expression in the control kidney compared with the conditional mutant kidney (Table 1 and Additional file 4). The top 50 genes on the list were further evaluated computationally for their kidney specificity. The results indicate that the gene list generated by this protocol is highly enriched for nephron-specific genes. The average relative kidney expression level, based on GNF 2.0, reached 3.5, and the normalized kidney expression level, according to SOURCE, is 32.27%. There is also a slight increase in the proportion of genes with published nephron-specific expression patterns (56%) compared with the gene list generated using the previous protocol (44%). The details of the gene list are described in Table 3.

Table 3 Top 50 genes upregulated in the E18.5 control kidney when compared to the Lim1 conditional mutant kidney.

A gene list enriched for E14.5 nephron-specific genes was generated using the same protocol. Comparison of the E14.5 control kidney and Lim1 conditional mutant kidney gene expression profiles picked up only 41 genes that showed a more than 2-fold change (Table 1). Unlike the gene list sorted by the comparison between developmental stages (E14.5C/E18.5C), which is not significantly enriched for E14.5 kidney genes, the comparison between control and Lim1 conditional mutant kidneys helped to identify kidney-specific genes, especially those expressed in the nephrons. As shown in Table 2, the average relative kidney expression level is as high as 2.9, and there is also a nearly 15-fold increase in the average normalized kidney expression level (21.51% compared with 1.48%). Thirteen (26%) genes on the top 50 list were also previously described to have a nephron-specific expression pattern. Notably, 3 genes related to the Notch signaling pathway, Msi2h, Hes5, and Jag1, were found in the list. The details of this gene list are summarized in Table 4.

Table 4 Top 41 genes upregulated in the E14.5 control kidney when compared to the Lim1 conditional mutant kidney.

Ontological analysis on nephron-specific genes of different developmental stages

To gain insight into the functional aspects of the microarray data, we used the web-based annotation tool DAVID to help identify functional themes that differed between the control kidney and the Lim1 conditional mutant kidney [36]. The top 1,000 genes on each of the lists that displayed upregulation in the control kidney compared with the Lim1 conditional mutant kidney, at either E18.5 or E14.5, were used for this analysis. Ontological analyses were performed at Molecular Function Level 1, Biological Process Level 2, and Cell Component Level 4 [35]. The results for the major functional categories are shown in Figure 1. The numbers of genes that fell in the major categories were normalized by the number of genes annotated in each list and were expressed as percentages. Generally speaking, nephron-specific genes identified at E18.5 were better studied. Close to 60% of these genes were annotated in the Molecular Function and Biological Process ontologies and 32% in Cell Component at the levels at which our analyses were performed. In contrast, only about 30% and 11% of the genes on the E14.5 nephron-specific gene list were annotated at the same levels.
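The normalization of category counts by the number of annotated genes can be written out as a small Python sketch. The function names and the counts below are hypothetical and purely illustrative; they are not taken from the gene lists in this study.

```python
def annotation_rate(n_annotated, n_total):
    """Percentage of genes in a list that carry an annotation at the chosen level."""
    return round(100.0 * n_annotated / n_total, 2)


def category_percentages(category_counts, n_annotated):
    """Express each category count as a percentage of the annotated genes in the list."""
    if n_annotated <= 0:
        raise ValueError("n_annotated must be positive")
    return {
        category: round(100.0 * count / n_annotated, 2)
        for category, count in category_counts.items()
    }


# Illustrative counts only: 200 of 1,000 listed genes are annotated,
# and a gene may fall into more than one category.
toy_counts = {"catalytic activity": 100, "binding": 80, "transporter activity": 50}
toy_rate = annotation_rate(200, 1000)
toy_percentages = category_percentages(toy_counts, 200)
```

Because a gene can belong to several categories, the percentages within one list need not sum to 100%.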

Figure 1

Ontological analyses of the top 1,000 genes upregulated in E18.5 and E14.5 control kidneys. A. Molecular Function Level 1 analysis (annotation rates: E18.5 – 63.6%, E14.5 – 33.5%). B. Biological Process Level 2 analysis (annotation rates: E18.5 – 54.0%, E14.5 – 27.9%). C. Cell Component Level 4 analysis (annotation rates: E18.5 – 32.2%, E14.5 – 10.8%). Only categories containing at least 3 hits in either of the gene lists are shown. The numbers of genes that fell in the major functional categories were normalized by the number of genes annotated in each list and were expressed as percentages.

Comparisons made at Molecular Function Level 1 (Figure 1A) revealed that the majority of the E18.5 nephron genes identified in our screen were described to possess catalytic activity (58.49%) whereas molecules found in the E14.5 nephron gene list were better studied for their physical interactions with other molecules (e.g. binding, 64.48%). The E18.5 gene list is also characterized by a relatively high proportion of transporter proteins (25.79%) whereas the E14.5 gene list contains a higher ratio of genes with signal transducer (13.73%), transcription regulator (8.06%), enzyme regulator (4.48%), motor (1.19%) and translation regulator (0.90%) functions.

Ontological analysis at Biological Process Level 2 (Figure 1B) indicated that the E14.5 nephron-specific gene list favors molecules involved in cell communication (20.79%) and morphogenesis (11.11%). Metabolism is detected as a very significant functional theme in the E18.5 gene list with an EASE score of 7.72 E-5 (data not shown) [37].

The Cell Component Level 4 ontological analysis (Figure 1C) of E18.5 nephron-specific genes was characterized by a high proportion of genes identified at subcellular sites related to energy metabolism, including mitochondria (46.27%), mitochondrial inner membrane (13.04%), electron transfer flavoprotein complex (0.93%), hydrogen-translocating V-type ATPase complex (1.24%), proton-transporting ATP synthase complex (2.80%), proton-transporting two-sector ATPase complex (3.11%), respiratory chain complex III (0.93%), respiratory chain complex IV (1.86%), and ubiquinol-cytochrome-c reductase complex (0.93%). There were also more proteins found at the vacuole (5.59%), microbody (4.04%) and vesicular fraction (3.11%). In contrast, a higher proportion of proteins in the cytoskeleton (18.52%), Golgi apparatus (12.96%), nucleoplasm (12.96%), cytosol (11.11%), chromatin (3.70%) and basal lamina (2.78%) was found in the E14.5 gene list.

Discussion

The mouse is one of the most widely used animal models to study human biology, especially after the development of embryonic stem (ES) cells and the assembly and annotation of the mouse genome sequence [2–4, 38]. ES cells and gene targeting technology allow the construction of transgenic mice with defined genetic modifications. The availability of whole genome sequences forms the basis for the development of high-throughput technologies, such as microarrays, to conduct research at a genomic level. Since it is relatively difficult to collect significant numbers of genetically well-defined human samples, it is important to perform research on an evolutionarily close species prior to human applications. In this study, we took advantage of the nephron-deficient kidneys from metanephric mesenchyme-specific Lim1 conditional mutant mice to perform a genome-wide screen for developing nephron genes. Whereas similar studies have been performed on kidneys from meprin β, vitamin D receptor, aquaporin-1, or metallothionein knockout mice, our study is the first to use a tissue-specific approach in the kidney [15–18].

Computational analysis and a PubMed search suggested that the expression profile comparison between control and Lim1 conditional mutant kidneys generated gene lists enriched for nephron-specific genes. In global gene expression studies, ribosomal genes and other housekeeping genes that are highly expressed but do not show any tissue- or developmental stage-specificity are always identified. In this study, we used two different protocols to enrich for developmental stage-specific kidney genes and nephron-specific genes of different developmental stages. Two independent online gene expression databases, namely GNF Expression Atlas 2.0 and UniGene, were used to evaluate the tissue-specific enrichment computationally. Our results indicated dramatic enrichment for kidney-specific genes by both protocols in the E18.5 experiments. However, the comparison made between E14.5 and E18.5 control kidneys did not generate a kidney-specific gene list. In our opinion, there are at least two possible reasons. First, data from the two online databases we used were based on experiments performed using either adult or neonatal kidney tissues. Neither is likely to reflect gene expression profiles during early nephron development. Second, since many important molecular pathways and fundamental developmental processes are repeatedly observed in different organ systems, genes predominantly expressed during early nephron development are also likely to be found in other undifferentiated tissues. A closer look at the gene list supports this latter assumption: many of the genes identified by this comparison are commonly found in undifferentiated tissues (data not shown). In contrast, the comparison between E14.5 control and Lim1 conditional mutant kidneys identified molecules that are also found in mature nephrons, which predominantly contributed to the kidney-specificity in our evaluation, and molecules involved in early nephron development. For example, podocin (Nphs2) is only expressed in terminally differentiated podocytes [39]. The elevated podocin level observed in the E14.5 control kidney suggests that the first podocytes are present before or around E14.5. Notably, regulatory genes important in early nephron development, such as Msi2h, Hes5, and Jag1, which are involved in Notch signaling [40–45], do not show kidney-specificity based on GNF Atlas 2.0 and UniGene data. However, our screening protocol placed them at the top of our list. Therefore, the use of Lim1 conditional mutant tissue as an RNA source in our microarray experiment helped to identify nephron developmental genes.

The results of our ontological comparison between the E18.5 and E14.5 nephron-specific gene lists are consistent with current concepts of kidney organogenesis and a previous study [46, 47]. Gene ontology (GO) annotations provide a structured, precisely defined, common, controlled vocabulary for describing the roles of genes and their products in any organism. They represent current biological knowledge and serve as a guide for organizing new data [35]. However, one should keep in mind that the gene ontology is a dynamic, web-based resource; the annotations are not complete and their accuracy is limited by current knowledge of the molecules. Although kidney organogenesis is a continuous process, the first mature nephron is not seen until around E16.5 [1, 23]. Nephron formation in an E14.5 kidney mainly involves reciprocal tissue induction, stem cell growth and differentiation, cell polarization, mesenchymal-to-epithelial transformation, branching morphogenesis, angiogenesis, apoptosis, proximal-distal segmentation and the differentiation of several interesting cell types. In our ontological analysis, the E14.5 nephron-specific genes were mainly those better studied for their protein-protein interactions (binding) and those possessing signal transducer, transcription regulator and enzyme regulator activities. A higher proportion of them are involved in cell communication and morphogenesis. Interestingly, they are associated with the cytoskeleton and nuclear compartments (nucleoplasm and chromatin). In contrast, there are many mature nephrons present in an E18.5 kidney. Therefore, we expected to observe genes related to kidney function, e.g. those involved in solute transport and energy metabolism. Our results indicate that a relatively high number of E18.5 nephron-specific genes possess catalytic activity (presumably related to energy metabolism and the extensive extracellular matrix changes in late kidney development) or function as transporters. Consistent with this, a very high proportion of them encode proteins located in the mitochondria.

Ontological analysis and a detailed examination of the top 1,000 E14.5 nephron-specific gene list suggest that genes with modest upregulation in the control kidney (fold changes of less than 2 in our experiment) are also of interest. For example, Brn1 (1.34) and EphA4 (1.67) were previously shown to be downstream of Lim1 [20, 25]. Fzd4 (1.71) has been considered a candidate receptor to transduce Wnt4 signals during kidney organogenesis [1, 48]. Irx2 (1.52) and Irx3 (1.28) are homeobox genes previously reported to be expressed in the developing nephrons [49]. Crb3 (1.40) is known to localize to kidney epithelia and is essential for ciliogenesis [50, 51]. The top 1,000 genes upregulated in the E14.5 control kidney are supplied in Additional file 5.

Tissue heterogeneity always complicates the interpretation of microarray data, although analyses of different organs or even of organisms at different developmental stages have been reported [30, 52]. In our experiment, we used the whole kidney as a tissue source for RNA preparation. The complexity of kidney structure and development limits the interpretation of our results. Improved tissue collection methods, such as laser capture microdissection and fluorescence-activated cell sorting (FACS) of genetically marked or fluorescent protein-labelled transgenic tissue, have been developed [14, 53] and could provide more specificity. Nevertheless, our study demonstrates that genetically engineered mouse organs can be used to identify tissue-specific and developmentally regulated genes during mammalian organogenesis.

Conclusion

Our experimental results indicate that the expression profile comparisons between the control and the Lim1 conditional mutant kidneys generated nephron-specific gene lists. Our results demonstrate the feasibility of exploiting genetically engineered kidneys to identify developing nephron-specific genes.