Functional annotation of the cattle genome through systematic discovery and characterization of chromatin states and butyrate-induced variations
The functional annotation of genomes, including chromatin accessibility and modifications, is important for understanding and effectively utilizing the increased amount of genome sequences reported. However, while such annotation has been well explored in a diverse set of tissues and cell types in human and model organisms, relatively little data are available for livestock genomes, hindering our understanding of complex trait variation, domestication, and adaptive evolution. Here, we present the first complete global landscape of regulatory elements in cattle and explore the dynamics of chromatin states in rumen epithelial cells induced by the rumen developmental regulator—butyrate.
We established the first global map of regulatory elements (15 chromatin states) and defined their coordinated activities in cattle, through genome-wide profiling for six histone modifications, RNA polymerase II, CTCF-binding sites, DNA accessibility, DNA methylation, and transcriptome in rumen epithelial primary cells (REPC), rumen tissues, and Madin-Darby bovine kidney epithelial cells (MDBK). We demonstrated that each chromatin state exhibited specific enrichment for sequence ontology, transcription, methylation, trait-associated variants, gene expression-associated variants, selection signatures, and evolutionarily conserved elements, implying distinct biological functions. After butyrate treatments, we observed that the weak enhancers and flanking active transcriptional start sites (TSS) were the most dynamic chromatin states; these changes occurred concomitantly with significant alterations in gene expression and DNA methylation, which were significantly associated with heifer conception rate and stature economic traits.
Our results demonstrate the crucial role of functional genome annotation for understanding genome regulation, complex trait variation, and adaptive evolution in livestock. Using butyrate to induce the dynamics of the epigenomic landscape, we were able to establish the correlation among nutritional elements, chromatin states, gene activities, and phenotypic outcomes.
Keywords: Cattle genome, Functional annotation, Chromatin states, Butyrate, Rumen development
BivFlnk: Flanking bivalent TSS/enhancers
DEGs: Differentially expressed genes
DMRs: Differentially methylated regions
EnhAATAC: Active enhancer with ATAC
EnhWk: Weak active enhancer
eQTLs: Expression quantitative trait loci
MDBK: Madin-Darby bovine kidney epithelial cells
REPC: Rumen epithelial primary cells
TES: Transcriptional end sites
TSS: Transcriptional start sites
TssAFlnk: Flanking active TSS
TxFlnk: Transcribed at gene 5′ and 3′
VFAs: Volatile fatty acids
WGBS: Whole-genome bisulfite sequencing
Ruminants evolved from simple-stomached animals by transforming into foregut microbial fermenters that could digest grasses and complex carbohydrates. In ruminants, the rumen is central to feed efficiency, methane emission, and productive performance. Rumen microbes digest simple and complex carbohydrates (fiber) and convert them into volatile fatty acids (VFAs; mainly acetic, propionic, and butyric acids), and in fact, VFAs can provide 50 to 70% of a cow’s energy requirements. Interestingly, VFAs not only are nutrients critical to the energy metabolism of the ruminant, but also appear to be responsible for differentiation during post-natal rumen development. Butyrate has been established as the most potent among VFAs in the induction of changes in cellular functions. Roles for butyrate have been established in cell differentiation, proliferation, and motility, as well as in the induction of cell cycle arrest and apoptosis. Our previous research showed that butyrate can regulate histone modification and gene networks, controlling cellular pathways including cell signaling, proliferation, and apoptosis. In addition, butyrate is a histone deacetylase (HDAC) inhibitor that alters histone acetylation and methylation and, therefore, also functions as an epigenomic regulator. Thus, butyrate-induced biological effects in bovine cells may serve as a paradigm of epigenetic regulation and as a model for understanding the full range of butyrate’s potential biological roles and molecular mechanisms in cell growth, proliferation, and energy metabolism.
Researchers have discovered a plethora of regulatory elements for controlling genome activities (e.g., gene expression) in human and model organisms, which play central roles in normal development and diseases, hence dramatically improving our biological interpretation of the primary DNA sequence [11, 12, 13, 14, 15]. The Roadmap Epigenomics Consortium (2015) defined 15 chromatin states (e.g., promoter/transcript-associated and large-scale repressive states) in humans by combining five histone marks and demonstrated that those states have specific enrichments for DNA methylation and accessibility, as well as for non-exonic evolutionary conserved elements, indicating their distinct biological roles. Kazakevych et al. reported that chromatin states were dramatically changed during the specialization and differentiation of intestinal stem cells in adult humans, suggesting their important roles in normal organ development. In addition to the basic research of genomic biology, having a complete functional annotation of genomes will contribute to understanding the genomic underpinning of complex traits and diseases, thus benefiting precision medicine in humans. For instance, through partitioning heritability of complex traits by different functional annotations, Finucane et al. revealed that the heritability of immunological diseases was highly enriched in FANTOM5 enhancers. Speed and Balding increased the genomic prediction accuracy for complex traits and diseases in both humans and the mouse by differentially weighting genomic variants according to their functional annotations.
Although functional annotation of genomes has been well explored in a diverse set of tissues and cell types in human and model organisms, livestock genomes lack such functional annotation. Investigating the global regulatory elements of genomes in livestock not only informs us about their basic biology, but also enhances the execution of genomic improvement programs [19, 20]. As shown in previous studies, even with limited functional annotations, investigators could improve QTL detection and genomic prediction for complex traits of economic importance in dairy cattle, particularly in multi-breed scenarios [21, 22, 23, 24, 25]. To produce comprehensive maps detailing the functional elements in the genomes of domesticated animal species, a coordinated international effort, the Functional Annotation of Animal Genomes (FAANG) project, was launched in 2015.
General characteristics of epigenomic, DNA methylation, and transcriptomic data sets
Among the four experiments, we generated a total of 38 genome-wide epigenomic data sets at a high resolution, including six different histone marks (H3K4me3, H3K4me1, H3K27ac, H3K27me3, H3K9ac, and H3K9me3), RNA pol II, ATAC, and CTCF, producing a total of 1,545,698,388 clean paired-end reads with an average unique mapping rate of 73.20%. Additionally, we profiled six RNA-seq data sets and six whole-genome bisulfite sequencing (WGBS) data sets from REPC to explore changes in gene expression and DNA methylation before and after (24 h) butyrate treatment, producing a total of 83,648,115 (average unique mapping rate of 86.9%) and 362,173,297 (31.9%) clean paired-end reads, respectively. Summary statistics for all 50 newly generated data sets are given in Additional file 2: Table S1.
For all 38 epigenomic data sets, as shown in Additional file 1: Figure S1a, we obtained a total of 1,624,657 peaks with an average of 42,754 per data set (ranging from 738 for RNA pol II in the rumen tissue before weaning to 187,475 for H3K27ac in MDBK following butyrate treatment). In general, we obtained more peaks from the two cell lines (i.e., REPC and MDBK) than from actual rumen tissues, possibly reflecting a sensitivity issue for measuring epigenomic marks in the actual tissues. The corresponding genome coverage for peaks in each sample had an average of 1.31% (ranging from 0.01% for RNA pol II in rumen tissue to 11.87% for H3K27me3 in REPC following butyrate treatment) (Additional file 1: Figure S1b). At 24 h post butyrate treatment in REPC, we observed that CTCF, H3K27me3, and H3K4me3 generally increased their genome coverage percentage, whereas H3K27ac, H3K4me1, and ATAC lost genome coverage (Additional file 1: Figure S1b). We observed that the repressive histone mark, H3K27me3, exhibited a greater peak length than the other epigenomic marks (Additional file 1: Figure S2). These epigenomic marks exhibited a bimodal distribution relative to their nearest genes, with one peak overlapping the corresponding gene body and the other ~ 100 kb away from the gene body (Additional file 1: Figure S3). The first peak agrees with the enrichment of transcriptional start sites (TSS) with epigenomic marks, indicating the existence of cis-regulatory mechanisms underlying gene expression. The second peak might imply the existence of long-range regulatory elements (e.g., enhancers and insulators); however, further research is required for a better understanding of its functional impacts on gene activities. Both of the repressive histone marks, H3K27me3 and H3K9me3, exhibited a higher peak at ~ 100 kb away from the gene body compared to the other epigenomic marks (Additional file 1: Figure S3).
In addition, we found that correlations of peak-length vs. exon-length were higher than those of peak-length vs. gene-length and peak-length vs. chromosome-length (Additional file 1: Figure S4–S6), indicating the epigenomic peaks were more likely to be associated with exons as compared to genes and chromosomes. This might support the view that epigenomic marks play important roles in transcriptional regulation [11, 15]. We also observed that CTCF and ATAC from the REPC sets were associated with many active histone modifications (e.g., H3K4me1, H3K4me3, RNA pol II, H3K9ac, and H3K27ac) in both REPC and rumen tissues (Additional file 1: Figure S7a), demonstrating that epigenomic modification shared certain similarities between the primary cells and rumen tissues. We found that gene expression correlations of samples within groups (three biological replicates) were very high (r > 0.99), with a clear separation between samples from control and butyrate treatment (Additional file 1: Figure S7b). However, DNA methylation correlations among the six samples did not show a clear group-based pattern (Additional file 1: Figure S7c), consistent with the concept that DNA methylation is a relatively long-term regulator of gene expression compared to other epigenomic modifications. This suggests that DNA methylation may not regulate transcriptional changes over a short term, such as the 24 h after butyrate treatment tested here.
Systematic definition and characterization of 15 chromatin states in cattle
We detected genes (n = 1230) with specifically high expression in REPC by comparing gene expression of REPC to that of 77 other somatic tissues and cell types from cattle, while excluding similar tissues in the gastrointestinal tract (Additional file 1: Figure S8). We found REPC-specific genes were significantly engaged in oxidation-reduction and metabolic processes (Additional file 1: Figure S8) and more likely to be enriched for active enhancers (chromatin states 4–6: active enhancer, EnhA; active enhancer with ATAC, EnhAATAC; and weak active enhancer, EnhWk) as compared to the other chromatin states (Fig. 2e), indicating the tissue specificity of many enhancers for ensuring tissue-specific gene expression . The neighboring regions of both TSS and TES of REPC-specific genes were enriched for the active promoter/transcript-associated states (chromatin states 1–3) (Fig. 2g, h). We observed that ATAC peaks (chromatin state 10) were highly enriched for CpG islands and satellite DNA, suggesting that chromatin structure of CpG islands and satellite DNA create an accessible environment for RNA polymerase II and other transcriptional components to bind . Of note was the flanking bivalent TSS/enhancers (chromatin state 12, BivFlnk, covering 0.56% of the entire genome), which was not only enriched near TSS of expressed genes but was also enriched near TSS of repressed genes. BivFlnk also had a low level of DNA methylation and had high enrichment for CpG islands, promoter regions, and transcription factors, similar to active promoter/transcript-associated states (Fig. 2d–f). We observed that repressive Polycomb (chromatin state 13, ReprPC, covering 3.58% of the entire genome) exhibited higher enrichment near repressed genes than expressed genes and had a high level of DNA methylation (Fig. 2e, f), indicative of their critical roles in gene repression. 
The transition parameters among chromatin states learned from ChromHMM suggested that the weak/poised enhancer-associated states and ATAC state were more likely to transition to the quiescent state than any other states (Fig. 2i).
Our large-scale GWAS signal enrichment analysis revealed that active promoters and transcripts (i.e., TssA, TssAFlnk, and TxFlnk) were the top enriched chromatin states across 45 complex traits of economic importance in the US Holstein population (Fig. 3g), in line with the findings in cattle QTLdb (Fig. 3d). Interestingly, enhancer-associated regions (e.g., EnhA, EnhWk, EnhAATAC, and EnhPoisATAC), which were likely to be tissue specific, were specifically enriched for body type traits (particularly for stature) and somatic cell score (an indicator of mastitis resistance), suggesting the potential roles of rumen epithelial cells in growth and innate immune responses (Fig. 3g). The motif enrichment analysis revealed that 136 out of 922 tested motifs were significantly (adjusted P < 0.01) enriched in TssA, mainly including motif families of zinc finger (n = 21), AP2EREBP (n = 40), and C2C2dof (n = 20) (Additional file 3: Table S2). This observation demonstrates that TssA is a hotspot for transcription regulatory factors, and implies that highly expressed genes also require a complex regulatory mechanism to ensure their proper function. We found that BivFlnk enriched for similar motifs as TssA, whereas ReprPC and EnhWk enriched for distinct motifs, such as Atoh1 and Tcf12, which belong to the bHLH family (Fig. 3h).
Butyrate-induced changes in chromatin states, gene expression, and DNA methylation
Among the first three active chromatin states, we observed that TssA was more stable during butyrate treatment, as 76.03% retained, while only 59.94% and 43.19% of TssAFlnk and TxFlnk were retained, respectively. Of note was TssAFlnk, which transitioned 11.31% to the quiescent state, whereas only 0.07% and 0.54% transitioned for TssA and TxFlnk, respectively (Additional file 1: Figure S17a). Within the 332 downregulated DEGs (± 20Kb), we found the top five most dynamic chromatin states induced by butyrate treatment were transitions from TssAFlnk and TxFlnk to the weak enhancer, quiescent, active enhancer, and poised enhancer (Additional file 1: Figure S17a). We found that 289, 179, and 302 out of the 332 downregulated DEGs (± 20Kb) also exhibited a loss of at least one of the three active epigenomic marks (i.e., H3K9ac, H3K27ac, and RNA pol II) in the rumen tissues after weaning, in the rumen tissues with butyrate treatment, and in MDBK with butyrate treatment, respectively (Fig. 6d). By examining the transcriptome from MDBK cell responses before and after butyrate treatment, we verified that expression of 302 out of 332 genes was significantly downregulated at 24 h with butyrate treatment (Fig. 6e). We showed changes of individual epigenomic marks of MAD2L1 gene (fold change = − 27.54) before and after butyrate treatment in Fig. 6f, as an example of the downregulated DEGs. MAD2L1 is a key component of the mitotic spindle assembly checkpoint and associates with multiple tumor processes [44, 45].
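The retention and transition percentages reported above can be illustrated with a minimal sketch. It assumes the genome has been cut into equal-sized bins and that each bin carries one state label before and one after treatment; the function name and the toy labels are hypothetical and are not taken from the study's pipeline.

```python
from collections import Counter, defaultdict


def transition_percentages(before, after):
    """Return {source_state: {target_state: percent of source bins}}.

    `before` and `after` are equal-length lists of per-bin state labels
    (hypothetical input format; one label per genomic bin).
    """
    if len(before) != len(after):
        raise ValueError("both segmentations must cover the same bins")
    pair_counts = Counter(zip(before, after))   # bins per (source, target)
    source_totals = Counter(before)             # bins per source state
    table = defaultdict(dict)
    for (src, dst), n in pair_counts.items():
        table[src][dst] = 100.0 * n / source_totals[src]
    return dict(table)


# Toy example with made-up labels for eight bins.
control = ["TssA", "TssA", "TssA", "TssA",
           "TssAFlnk", "TssAFlnk", "TxFlnk", "Quies"]
butyrate = ["TssA", "TssA", "TssA", "TssAFlnk",
            "TssAFlnk", "Quies", "EnhWk", "Quies"]

table = transition_percentages(control, butyrate)
for src, targets in table.items():
    print(src, targets)
```

In this toy input, three of the four TssA bins keep their label, so the retained fraction for TssA is 75%; each row of the table sums to 100%.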
In summary, we established the first global map of regulatory elements (15 unique chromatin states) and defined their coordinated activities in cattle, through genome-wide profiling for six specific histone modifications, RNA polymerase II, CTCF-binding sites, DNA accessibility, DNA methylation, and transcriptomes in rumen epithelial primary cells (REPC), rumen tissues, and Madin-Darby bovine kidney epithelial cells (MDBK). Functional annotations of genome in the REPC capture a remarkable diversity of genomic functions encoded by distinct chromatin states and show that a majority of them are consistent across tissues and cell types. We identified significant associations of chromatin states with gene expression and DNA methylation, as well as demonstrated the importance of comprehensive functional annotation to facilitating the improved understanding of the genetic basis underpinning complex trait variation, eQTLs, positive selection, and adaptive evolution in cattle. Our findings directly support the concept that proximal regulatory elements contribute to positive selection and adaptive evolution of modern sheep breeds, while a previous study reported a similar idea through cross-species mapping of human functional annotation data on to the sheep genome . Additionally, we observed that a large proportion (~ 70%) of the cattle genome of rumen REPC exists in a quiescent state, similar to findings from human tissues where approximately two thirds of the reference epigenome in each tissue and cell type are quiescent [15, 52].
Ruminant species utilize VFAs as their major nutrient energy resources. Most of the VFAs are taken up and utilized in the rumen epithelium and other gastrointestinal organs. The intrinsic necessities of VFAs add a level of increased sensitivity to ruminant cells. The full range of the biological roles and the molecular mechanisms that butyrate may play in bovine genomic activities has been intensively studied in vitro and in vivo. At 5-mM concentration, butyrate induces specific changes of gene expression and epigenomic landscapes in MDBK cells [5, 6, 7, 10, 41]. Compared to the MDBK cell line, REPC provides a better in vitro model and mimics the rumen epithelium much more closely. To validate the data from the in vitro experiment with REPC, in vivo experiments with the rumen tissues before and after weaning and rumen tissues before and after butyrate treatment by direct infusion were also performed with ChIP sequencing. Our data suggested that the majority of defined chromatin states in REPC were generally consistent across tissues and cell types. Certainly, future studies with additional epigenomic marks and tissues/cell types are required for a more comprehensive functional annotation of the cattle genome and validation of the essential roles that butyrate plays in rumen development and genetic activities.
Furthermore, our data provided strong verification that butyrate can change the epigenomic landscapes and chromatin states in both rumen tissues and cell lines, resulting in specific changes in gene expression and influencing rumen differentiation/development. We illustrated that the up- and downregulated genes induced by butyrate treatment exhibited distinctive variations in chromatin states and altered biological functions. It has been generally accepted that histone modifications play a crucial role in controlling gene expression. Butyrate, as a native HDAC inhibitor, re-induces histone post-translational modifications and, thus, regulates cell growth, apoptosis, and cell differentiation in many types of cancer . Many previously published reports were dedicated to the biological effects of butyrate on cancer cells. As a result, there is a wealth of knowledge on butyrate as an HDAC inhibitor, the role of aberrant histone acetylation in tumorigenesis, and the potential for cancer chemoprevention and therapy [46, 47, 48, 49]. There is little, if any, information about the biological impacts of butyrate in “normal” cells. And there is even less literature available addressing the fundamental mechanism of epigenomic regulatory activities of butyrate in rumen development and function. The HDAC inhibition activity of butyrate makes it a uniquely suited inducer for specific changes in the epigenomic landscape in the foregut of ruminants. Delineating the extent to which the epigenomic landscape and chromatin states are modified by butyrate-induced histone post-translational modification is a critical step in the path to understanding how this nutrient is perturbing specific transcriptomes at the mechanistic level. 
By surveying butyrate-induced dynamic variation of chromatin states concomitantly with changes in transcription activities observed in REPC, for the first time, we were able to establish strong correlations between nutritional elements, histone modifications, chromatin states, genomic activities, and cellular functions in cattle. Our findings also shed light on the putative use of HDAC functionality in chemoprevention therapies for malignant and non-malignant, hyperproliferative, and inflammatory disorders in humans.
We established the first global map of regulatory elements (15 chromatin states) and defined their coordinated activities in cattle. By integrating a range of genome-wide data sets, such as multiple-tissues/species gene expression, DNA methylation, trait-associated variants, selection signatures, and evolutionary conservation elements, we demonstrate the crucial role of functional genome annotation for understanding genome regulation, complex trait variation, and adaptive evolution in livestock. Using butyrate to induce the dynamics of the epigenomic landscape, we observed the correlation among nutritional elements, chromatin states, gene activities, and phenotypic outcomes.
Sample collections and next-generation sequencing
In the current study, all animal procedures were conducted under the approval of the Beltsville Agricultural Research Center (BARC) Institutional Animal Care Protocol Number 15-008. Animal experimental procedures (butyrate infusion and rumen biopsies), RNA extraction, and sequencing were detailed in our previous report. Rumen primary epithelial cells were isolated from a 2-week-old Holstein bull calf fed with milk replacer only. The methods for rumen epithelial cell isolation and culture were reported previously. The MDBK cell line was purchased from ATCC (ATCC CCL-22; Manassas, VA, USA) and grown in Eagle’s essential medium with 5% fetal bovine serum.
Butyrate treatment of cell culture
Ruminant species have evolved to metabolize the short-chain fatty acids to fulfill up to 70% of their nutrient energy requirements [2, 55]. The concentration of short-chain fatty acids in ruminant species is much higher than that in humans and other animals. Based on our previous experiment with MDBK cells, treatment with 5 mM butyrate in vitro can induce significant changes in histone acetylation level and transcription activities without inducing significant apoptosis. Thus, 5 mM butyrate was added to the culture medium for 24 h for butyrate treatment of cells.
ATAC-seq, CTCF-seq, and ChIP-seq of H3K27ac, H3K27me3, H3K4me1, and H3K4me3 in rumen epithelial primary cells (REPC) were performed by using NextSeq 500 (Illumina, Inc., San Diego, CA, USA) at Active Motif, Inc. (Carlsbad, CA, USA). ChIP-seq of rumen epithelial tissues and MDBK cells was performed as reported in our earlier publication. In short, DNA recovered from a conventional ChIP procedure was quantified using the QuantiFluor fluorometer (Promega, Madison, WI, USA). The DNA integrity was verified using the Agilent Bioanalyzer 2100 (Agilent, Palo Alto, CA, USA). The DNA was then processed, including end repair, adaptor ligation, and size selection, using an Illumina sample prep kit following the manufacturer’s instructions (Illumina, Inc., San Diego, CA, USA). Final DNA libraries were validated and sequenced at 75 nt per sequence read, using an Illumina HiSeq 2500 platform.
RNA extraction and RNA sequencing
RNA extraction followed the procedure reported previously. Total RNA from six rumen epithelial cell samples was extracted using Trizol (Invitrogen, Carlsbad, CA, USA) followed by DNase digestion and Qiagen RNeasy column purification (Qiagen, Valencia, CA, USA). The RNA integrity was verified using an Agilent Bioanalyzer 2100 (Agilent, Palo Alto, CA, USA). High-quality RNA (RNA integrity number [RIN]: 9.0) was processed using an Illumina TruSeq RNA sample prep kit following the manufacturer’s instructions (Illumina, Inc., San Diego, CA, USA). After quality control (QC) procedures, individual RNA-seq libraries were pooled based on their respective sample-specific 6-bp (base pairs) adaptors and paired-end sequenced at 150 bp per sequence read (PE150) using an Illumina HiSeq 2500 sequencer.
Whole-genome bisulfite sequencing (WGBS)
All experiments were carried out following published procedures [56, 57, 58]. Briefly, DNA from REPC culture was isolated by phenol/chloroform extraction. DNA (100 ng) was bisulfite-converted and subjected to library preparation using the Pico Methyl-Seq™ Library Prep Kit (Zymo) following the instructions of the supplier. Library quality was assessed with high-sensitivity DNA chips on the Agilent Bioanalyzer, and libraries were quantified with a Qubit fluorometer. Libraries were sequenced on an Illumina HiSeq 2500 (150-bp paired-end sequencing).
Bioinformatics analyses for all epigenomic marks, RNA-seq, and DNA methylation
We removed raw reads that failed Illumina’s quality filter. In the REPC study, we generated a total of 385,544,396 and 428,908,598 clean paired-end reads for four ATAC-seq data sets and ten ChIP-seq data sets, respectively, using Illumina NextSeq 500. We also generated a total of 39,941,058 paired-end clean reads as the random background input. For the remaining three studies, we generated a total of 731,245,394 paired-end clean reads, and 3,247,857 and 5,709,815 paired-end clean reads as the random background input for the rumen tissue and MDBK studies, respectively. We then mapped clean reads to the cattle reference genome (UMD3.1.1) using the BWA algorithm with default settings. We only kept reads uniquely aligned with fewer than two mismatches for the subsequent analysis. We employed MACS2.1.1 for peak calling with default parameter settings by looking for significant enrichment in the studied samples when compared to the input data file (i.e., random background). We calculated peak correlations among all 38 epigenomic samples using the following strategy. Briefly, we computed the correlation of sample A with sample B as the number of peaks in A that overlapped with B, divided by the total number of peaks in A, and the correlation of B with A as the number of peaks in B that overlapped with A, divided by the total number of peaks in B. This measure is therefore not symmetric.
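The overlap-based peak correlation described above can be sketched as follows. This is an illustration under stated assumptions rather than the code used in the study: peaks are assumed to be half-open `(chrom, start, end)` tuples, any overlap of at least one base counts, and the function names are hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict


def _merge_by_chrom(peaks):
    """Merge overlapping peaks per chromosome; return sorted starts and ends."""
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        by_chrom[chrom].append((start, end))
    merged = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        out = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= out[-1][1]:          # overlaps or touches previous
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[chrom] = ([s for s, _ in out], [e for _, e in out])
    return merged


def overlap_fraction(peaks_a, peaks_b):
    """Fraction of peaks in A that overlap at least one peak in B."""
    if not peaks_a:
        return 0.0
    merged_b = _merge_by_chrom(peaks_b)
    hits = 0
    for chrom, start, end in peaks_a:
        if chrom not in merged_b:
            continue
        starts, ends = merged_b[chrom]
        # Last merged B interval that begins before this A peak ends.
        i = bisect_left(starts, end) - 1
        if i >= 0 and ends[i] > start:
            hits += 1
    return hits / len(peaks_a)


# Toy peaks (made-up coordinates).
sample_a = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
sample_b = [("chr1", 150, 250), ("chr1", 180, 300),
            ("chr1", 900, 950), ("chr2", 10, 20)]

print(overlap_fraction(sample_a, sample_b))  # A with B: 1 of 3 peaks
print(overlap_fraction(sample_b, sample_a))  # B with A: 2 of 4 peaks
```

Because the denominator is the peak count of the first sample, the two directions generally differ, as the toy example shows.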
We employed a multivariate Hidden Markov Model (HMM), implemented in ChromHMM version 1.18, to define 15 chromatin states using 200-bp sliding windows through combining all six epigenomic marks and one input random background in REPC. This method could provide an unbiased and systematic chromatin state discovery along the whole genome [13, 61]. We computed the enrichment fold of each state for each external annotation (e.g., CpG islands) as (C/A)/(B/D), where A is the number of bases in the chromatin state, B is the number of bases in the external annotation, C is the number of bases overlapped between state and the external annotation, and D is the number of bases in the genome. We calculated the significance of enrichment using Fisher’s exact test.
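The fold-enrichment formula above, (C/A)/(B/D), and the 2 × 2 table on which a Fisher's exact test could be run can be written out as a small sketch. The function names and the toy base counts are hypothetical; the table layout (state vs. not state, annotation vs. not annotation) is an assumption about how the test was set up.

```python
def fold_enrichment(state_bp, annot_bp, overlap_bp, genome_bp):
    """(C/A)/(B/D): observed share of the state inside the annotation,
    divided by the share expected if the annotation were spread uniformly."""
    return (overlap_bp / state_bp) / (annot_bp / genome_bp)


def contingency_table(state_bp, annot_bp, overlap_bp, genome_bp):
    """2 x 2 base counts: rows = in state / not in state,
    columns = in annotation / not in annotation."""
    return [
        [overlap_bp, state_bp - overlap_bp],
        [annot_bp - overlap_bp,
         genome_bp - state_bp - annot_bp + overlap_bp],
    ]


# Toy numbers (made up): A = 1,000 bp in the state, B = 2,000 bp annotated,
# C = 400 bp shared, D = 100,000 bp genome.
A, B, C, D = 1_000, 2_000, 400, 100_000
print(fold_enrichment(A, B, C, D))    # about 20-fold
print(contingency_table(A, B, C, D))  # [[400, 600], [1600, 97400]]
```

A table of this form can be passed to a standard implementation such as `scipy.stats.fisher_exact` to obtain the P value.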
For all 12 RNA-seq and WGBS data sets in the REPC study (three biological replicates in each condition), we did quality control and trimming by employing FastQC (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and Trim_Galore (version 0.4.1) (https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/), respectively. Generally, we removed adapters and reads with low quality (Q < 20) or shorter than 20 bp. For RNA-seq, we used the STAR aligner and Cufflinks software tools to quantify gene expression and conduct differential expression analysis, where only the uniquely mapped reads were used. We used the FPKM value of each gene as its normalized expression level. We defined DEGs as genes with a Bonferroni-corrected P value less than 0.05 and an absolute log2(fold change) greater than 2. For WGBS, all clean data were mapped to the cattle reference genome (UMD 3.1.1) using bowtie2. We then applied Bismark software with default settings to extract methylcytosine information. We kept loci covered by at least 10 clean reads for further analyses. We determined DMRs using methylKit with a 500-bp window size and a 500-bp step size. Briefly, we used a logistic regression model, implemented in the calculateDiffMeth function, to detect DMRs. We computed P values by comparing the fit of the alternative model (with treatment effects) to that of the null model (without treatment effects) and corrected them to q values for multiple testing using the SLIM method. We considered regions with a q value less than 0.05 and an absolute difference in methylation greater than 10% as DMRs.
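The DEG thresholds above can be illustrated with a short sketch. It assumes that per-gene log2 fold changes and raw P values are already available (for example, from the differential expression step) and that the fold-change cutoff applies to the absolute value, since both up- and downregulated DEGs are reported. The function name, input layout, and gene names are hypothetical.

```python
def call_degs(results, alpha=0.05, min_abs_log2fc=2.0):
    """Label genes passing a Bonferroni-corrected P < alpha and
    |log2 fold change| > min_abs_log2fc as 'up' or 'down'.

    `results` maps gene -> (log2_fold_change, raw_p_value).
    """
    n_tests = len(results)
    degs = {}
    for gene, (log2fc, p_raw) in results.items():
        p_bonf = min(1.0, p_raw * n_tests)  # Bonferroni correction
        if p_bonf < alpha and abs(log2fc) > min_abs_log2fc:
            degs[gene] = "up" if log2fc > 0 else "down"
    return degs


# Toy input with made-up values for four genes.
toy = {
    "geneA": (3.1, 1e-6),    # passes both thresholds, upregulated
    "geneB": (-4.8, 2e-5),   # passes both thresholds, downregulated
    "geneC": (2.5, 0.02),    # fails after correction (0.02 * 4 = 0.08)
    "geneD": (0.4, 1e-8),    # significant but small fold change
}
print(call_degs(toy))
```

The same pattern, a corrected significance cutoff combined with an effect-size cutoff, applies to the DMR definition (q < 0.05 and an absolute methylation difference above 10%).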
GWAS signal enrichment analysis
Tissue enrichment analysis for DEGs and other down-stream bioinformatics analysis
To detect tissue/cell types that may be associated with DEGs induced by butyrate treatment, we conducted enrichment analyses for these DEGs using tissue/cell type-specific genes. We previously uniformly analyzed a total of 732 RNA-seq data sets, covering 91 different tissue/cell types in cattle, to detect tissue/cell type-specific genes while accounting for known covariates (e.g., sex and age). The details of the tissue/cell type-specific genes were summarized by Fang et al. (2019; submitted; https://github.com/LingzhaoFang1/Cattle-GeneAtlas). For each tissue/cell type, we chose the top 5% of genes that were specifically highly expressed in that tissue/cell type as the corresponding tissue/cell type-specific genes. We then tested the enrichment of DEGs among these genes with a hypergeometric test, similar to the GO enrichment analysis implemented in clusterProfiler. To explore the biological function of a list of genes, we conducted gene functional enrichment analysis using the R package clusterProfiler, where a hypergeometric test, based on the current GO and KEGG databases, was employed. We used HOMER (http://homer.ucsd.edu/homer/motif/) to conduct the motif enrichment analysis for chromatin states considering the whole genome as background. We adjusted P values for multiple testing using the FDR method.
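The hypergeometric enrichment test and FDR adjustment described above can be sketched in plain Python. This is a minimal illustration, assuming the Benjamini-Hochberg procedure is the FDR method intended and using made-up gene counts; the function names are hypothetical and the study itself relied on clusterProfiler.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws).

    population: all genes tested; successes: tissue-specific genes;
    draws: DEGs; k: DEGs that are also tissue-specific.
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(max(k, 0), upper + 1)
    )
    return tail / total


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):            # walk from largest to smallest P
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy example (made-up counts): 20,000 genes, 1,000 tissue-specific
# (top 5%), 300 DEGs, 40 of which are tissue-specific.
p = hypergeom_sf(40, 20_000, 1_000, 300)
print(p)
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Under the null hypothesis, about 15 of the 300 DEGs would be expected among the tissue-specific genes in this toy setting, so an overlap of 40 yields a very small P value.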
We thank Reuben Anderson, Mary Bowman, Donald Carbaugh, Christina Clover, Cecelia Niland, and Sara McQueeney for the technical assistance and sample collection. We thank the Council on Dairy Cattle Breeding for the genotype, phenotype, and pedigree data; Interbull for global trait evaluations; and the anonymous reviewers for many helpful comments. Mention of trade names or commercial products in this article is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the US Department of Agriculture. The USDA is an equal opportunity provider and employer.
LF, GEL, and CJL conceived and designed the experiments. EEC, RLB, GEL, and CJL collected samples and/or generated data. LF, SLL, ML, XK, SDL, BL, AT, and LM performed computational and statistical analyses. LF, EEC, RLB, AT, LM, GEL, and CJL wrote the paper. All authors read and approved the final manuscript.
This work was supported in part by AFRI grant numbers 2013-67015-20951, 2016-67015-24886, and 2019-67015-29321 from the USDA National Institute of Food and Agriculture (NIFA) Animal Genome and Reproduction Programs and BARD grant number US-4997-17 from the US-Israel Binational Agricultural Research and Development (BARD) Fund. B. Li was supported in part by an appointment to the Agriculture Research Service (ARS) Research Participation Program, administered by the Oak Ridge Institute for Science and Education through an interagency agreement between the U.S. Department of Energy and ARS. G.E. Liu was supported by appropriated project 8042-31000-001-00-D, “Enhancing Genetic Merit of Ruminants Through Improved Genome Assembly, Annotation, and Selection” of the Agricultural Research Service of the United States Department of Agriculture. E.E. Connor, R. L Baldwin, and C-J Li were supported by appropriated project 8042-31310-078-00-D, “Improving Feed Efficiency and Environmental Sustainability of Dairy Cattle through Genomics and Novel Technologies”. A. Tenesa was funded by the Roslin Institute Strategic Programme Grants BBS/E/D/10002070 and BBS/E/D/30002275.
Ethics approval and consent to participate
All animal procedures were conducted under the approval of the Beltsville Agricultural Research Center (BARC) Institutional Animal Care Protocol Number 15-008.
Consent for publication
Competing interests
The authors declare that they have no competing interests.
- 23. Fang L, Sahana G, Ma P, Su G, Yu Y, Zhang S, et al. Exploring the genetic architecture and improving genomic prediction accuracy for mastitis and milk production traits in dairy cattle by mapping variants to hepatic transcriptomic regions responsive to intra-mammary infection. Genet Sel Evol. 2017;49:44.
- 56. Fang L, Jiang J, Li B, Zhou Y, Freebern E, Vanraden PM, et al. Genetic and epigenetic architecture of paternal origin contribute to gestation length in cattle. Commun Biol. 2019;2:100.
- 73. Fang L, Liu S, Liu M, Kang X, Lin S, Li B, et al. Functional annotation of the cattle genome through systematic discovery and characterization of chromatin states and butyrate-induced variations. Gene Expression Omnibus. 2019; https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE129423.
- 74. Fang L, Liu S, Liu M, Kang X, Lin S, Li B, et al. Functional annotation of the cattle genome through systematic discovery and characterization of chromatin states and butyrate-induced variations. Github Repository. 2019; Available from: https://github.com/LingzhaoFang1/Cattle-Genome-Functional-Annotation.
- 75.Hunt S, McLaren W, Gil L, Thormann A, Schuilenburg H, Sheppard D, et al. Ensembl variation resources. Database. 2018;1. https://doi.org/10.1093/database/bay119.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.