Introduction

Quantitative traits are those under polygenic control and often exhibit continuous phenotypic variation within or among populations. These traits are controlled by many genes, together with environmental factors, and each underlying gene contributes a small proportion of the genetic variation (Falconer and Mackay 1996). Most of the traits considered in animal and plant genetic improvement programs are quantitative traits. Quantitative trait locus (QTL) mapping has long been recognized as a central challenge for researchers working on a wide variety of questions. The first QTL mapping study using whole-genome molecular markers was reported in 1988 (Paterson et al. 1988); after that, interval mapping based on DNA markers, applied to genetically localize QTL in natural and experimental populations, started to attract wide attention from research scientists such as Lander and Botstein (1989). Although QTL mapping has identified hundreds of chromosomal regions containing genes affecting different traits in farm animals, the number of QTL genes that have actually been identified is still very limited (shown in Table 1). For a long time a gap has existed between complex traits and their molecular basis; in other words, the molecular basis of quantitative genetic variation is not clear. In recent years, the development of several novel technologies for genomic analysis has made QTL detection more feasible and the fine mapping and cloning of QTL more tractable. These new approaches will facilitate the analysis of multi-factorial and complex traits. The aim of this paper is to introduce the developing field of farm animal genomics, to describe integrated strategies and technologies to localize and characterize QTL and to explore marker-assisted selection and its use in farm animal breeding.

Table 1 A list of QTL whose causal mutations have been identified in farm animals

Genomics development in farm animals

Molecular markers and linkage map

Genetic-linkage maps illustrate the order of genes on a chromosome and the relative distances between those genes. Owing to advances in the area of molecular genetics and automatic techniques, high density molecular genetic maps are now available for many farm animals. The availability of a linkage map can provide important insights into genome organization and chromosomal localization of cloned genes as well as the framework for the identification and localization of major QTL (Crittenden et al. 1993). Because assigning a locus to the genetic linkage maps requires segregating polymorphic genetic markers, the genetic linkage maps contain a preponderance of highly polymorphic anonymous markers, primarily microsatellite markers, and relatively few expressed genes, which have very limited genetic variability. The first reported map in livestock was for the chicken (Gallus gallus) (Bumstead and Palyga 1992; Groenen et al. 1998, 2000), and subsequently several genetic maps for agriculturally important animals have been reported, such as for the cattle (Bos taurus) (Bishop et al. 1994; Kappes et al. 1997), pig (Sus scrofa) (Rohrer et al. 1994; Rohrer et al. 1996), sheep (Ovis aries) (Crawford et al. 1995; Maurico et al. 1998; Maddox et al. 2001), goat (Capra hircus) (Vaiman et al. 1996), rabbit (Chantry-Darmon et al. 2006), and duck (Anas platyrhynchos) (Huang et al. 2006). Cattle and swine are the most frequently used livestock species for linkage maps and their linkage maps are the most highly developed, whereas genome mapping in chickens has become a challenge due to a large number of micro-chromosomes that exist in the chicken genome. The chicken linkage map has about 2,110 assignments. The current status is summarized in Table 2. Those genome-wide maps will be useful for establishing marker–trait associations, not only through linkage analysis but also through association mapping.
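The "relative distances" on a linkage map are usually derived from observed recombination fractions through a map function. The sketch below shows the two most commonly used functions, Haldane (no interference) and Kosambi (partial interference). The function names are illustrative and not taken from any mapping package, and which map function a given published map used is not stated here.

```python
import math


def haldane_cm(r):
    """Map distance in cM from recombination fraction r (Haldane, no interference)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r):
    """Map distance in cM from recombination fraction r (Kosambi, partial interference)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


if __name__ == "__main__":
    for r in (0.01, 0.10, 0.30):
        print(r, round(haldane_cm(r), 2), round(kosambi_cm(r), 2))
```

For small recombination fractions both functions give distances close to 100 x r cM; they diverge as r approaches 0.5.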

Table 2 The status of farm animal genomic research (before 2007)

Large-insert library and BAC-based physical map

A large-insert DNA library is an excellent resource for marker development and helps to increase the resolution of the chromosome regions containing QTL of interest. The bacterial artificial chromosome (BAC) is the most versatile and commonly used large-fragment cloning system. BAC libraries have been produced for most livestock species as a prerequisite for the generation of high density marker maps and positional cloning, including cow (Cai et al. 1995), sheep (Vaiman et al. 1999), chicken (Zimmer and Gibbins 1997; Crooijmans et al. 2000; Lee et al. 2003; Liu et al. 2003), pig (Anderson et al. 2000; Suzuki et al. 2000; Fahrenkrug et al. 2001; Jeon et al. 2003; Liu et al. 2006), and rabbit (Rogel-Gaillard et al. 2001). The construction of large-insert libraries allows a more targeted approach to physical and comparative mapping.

Physical maps, in contrast to genetic linkage maps, give the physical distance, in DNA base pairs, from one landmark to another. These maps are constructed by direct assignment of a landmark to an intact chromosome or chromosomal fragment. The assigned landmark can be a gene or an anonymous marker. Since a genetic variant within the locus is not necessary, physical maps usually contain a relatively large number of expressed genes. Localizing the same loci on both the genetic linkage map and the physical map within a species allows the two kinds of maps to be integrated. With the development of BAC library construction in the past few years, BAC-based whole genome physical mapping has become an important part of farm animal genomics research. Research integrating BAC end sequencing, fingerprinted contigs (FPC), and genome sequencing has been undertaken in many farm animals including cattle (Schibler et al. 2004), chicken (Lee et al. 2003; Wallis et al. 2004), sheep (Cockett et al. 2001), and pig (Andersson 2001). The BAC-based physical maps enable the identification of individual clones aligned to positions in their sequenced genomes so as to provide essential platforms for large-scale genome sequencing, effective positional cloning, high-throughput expressed sequence tag (EST) physical mapping, and targeted DNA marker development. The NCBI provides comprehensive linkage map and physical map information for many farm animal species. Mapping information for 15 vertebrate species including cattle, sheep, pig and chicken is available at http://www.ncbi.nlm.nih.gov/mapview.

RH map and comparative map

Whole-genome radiation hybrid (RH) mapping is also an important method for generating high resolution maps. A hybrid cell line is fragmented by X-rays and chromosome fragments are then fused to a recipient rodent cell (Walter et al. 1994). The RH panel reveals the location of markers relative to one another and enables construction of maps of chromosomes. The distance between loci in the RH map is proportional to physical distance. Collaborative research by scientists all over the world on RH mapping in livestock species has delivered high resolution, high-density RH maps in cattle (Itoh et al. 2005), pig (Yerle et al. 1998; Hawken et al. 1999), chicken (Morisson et al. 2002), and sheep (Laurent et al. 2007; Tetens et al. 2007).
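To illustrate how an RH panel "reveals the location of markers relative to one another", the sketch below estimates the breakage frequency between two markers from their retention patterns and converts it to a distance in centiRays. It assumes the simple two-point model in which both markers share one retention probability P and fragments are retained independently, so that the expected fraction of discordant hybrids is 2θP(1 − P). The function names and the toy panel are illustrative only; real RH software uses maximum-likelihood multipoint methods.

```python
import math


def breakage_frequency(marker_a, marker_b):
    """Moment estimate of the breakage frequency (theta) between two markers.

    marker_a, marker_b: lists of 0/1 retention scores, one entry per hybrid line.
    Assumes a common retention probability for both markers (simplification).
    """
    n = len(marker_a)
    if n == 0 or n != len(marker_b):
        raise ValueError("markers must be scored on the same non-empty panel")
    discordant = sum(1 for a, b in zip(marker_a, marker_b) if a != b) / n
    retention = (sum(marker_a) + sum(marker_b)) / (2 * n)
    denom = 2.0 * retention * (1.0 - retention)
    if denom == 0.0:
        raise ValueError("markers are uninformative (always or never retained)")
    return min(discordant / denom, 1.0)


def centirays(theta):
    """RH map distance in centiRays for a breakage frequency theta in [0, 1)."""
    if not 0.0 <= theta < 1.0:
        raise ValueError("theta must be in [0, 1); theta = 1 means unlinked")
    return -100.0 * math.log(1.0 - theta)


if __name__ == "__main__":
    a = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]  # hypothetical retention pattern
    b = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
    theta = breakage_frequency(a, b)
    print(theta, centirays(theta))
```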

The RH map can be used for constructing integrated linkage maps and physical maps within a species. Furthermore, by mapping functional genes common across species, an RH panel can also serve as the link for comparative mapping. Comparative genomics can serve as a powerful tool to predict gene function in farm animals based on a shared evolutionary history with human and other model organisms. The availability of RH and BAC based high resolution physical maps provides a link between the ‘information-poor’ maps of farm animals and the ‘information-rich’ genomes of human, mouse and other model organisms (Burt 2002). Early attempts to construct comparative maps of livestock with human as a model species were based mainly on somatic cell genetics and in situ hybridization (Winter et al. 2002). The use of RH panels and sequence data from EST projects has accelerated the development of high-resolution comparative maps with humans and mouse. Comparative mapping makes it possible to predict the location of previously identified human genes or QTL for the farm animal QTL studies. The candidate gene information becomes available for traits mapped by linkage analysis once comparative and linkage maps are combined. The available information open to the public related to livestock RH mapping and comparative mapping is shown in Table 2.

EST sequencing

ESTs are short DNA sequences (usually 200–500 nucleotides long) that are generated by sequencing either one or both ends of an expressed gene (http://www.ncbi.nlm.nih.gov/About/primer/est.html). They provide a feasible means for understanding the genome, or at least the transcriptome, of a given tissue of a species. Large-scale EST-sequencing projects were undertaken in several important farm animal species before genome sequencing projects began, and a large number of ESTs have become available for them.

EST resources have also been widely used in applied studies, by exploiting them in the development of molecular markers and in functional genomics studies. For example, ESTs have been used extensively for the development of EST–SSR and SNP markers, which are not only used in trait mapping and MAS but also provide information about genome evolution. Similarly, cDNA clone and EST data have been used to develop microarrays. Many of these resources are available through the NCBI and ARK-genomics (The Centre for Functional Genomics in Farm Animals of UK). In Table 2 we summarize the current status of EST projects for the major farm animal species, the largest being that of the chicken with almost 350,000 ESTs. Many web-based databases and tools have been set up to meet the EST annotation and data management requirements of multiple high-throughput EST sequencing projects for cattle (Kumar et al. 2004), sheep (Caprera et al. 2007), chicken (Abdrakhmanov et al. 2000; Chen et al. 2005; Carre et al. 2006; Park et al. 2006) and pig (Kumar et al. 2004; Chen et al. 2005; Uenishi et al. 2007). They have become an important resource for livestock genetic research.

Genome sequence and SNPs maps

With the human genome project inaugurated at the beginning of the last decade, the complete DNA sequences of humans and several experimental animals have been decoded. Now sequencing has turned to farm animals, and BAC-based whole genome sequencing and shotgun sequencing have been undertaken for a number of genomes. The chicken genome sequence was completed in 2004 (Wallis et al. 2004) and the bovine in 2006 (Gao et al. 2007). The sequencing of the whole pig genome is still in progress. These resources for genetics research can now be more fully exploited through the availability of genome information and tools (sequences, SNPs, microarrays) that are equivalent to those available to human and mouse geneticists.

The chicken

As a food animal the chicken comprises 41% of the meat produced and most of the eggs in the world. Beyond that, the chicken also serves as a model organism for the study of disease and biology (Dequeant and Pourquie 2005). In the year 2004, the chicken became the first farm animal to have its genome sequenced (Wallis et al. 2004). Led by the Genome Sequencing Center at Washington University School of Medicine, St. Louis, a group of scientists including individuals from the US, UK, Europe and China were involved in the sequence and data analysis. The improved 6.6X draft chicken genome assembly (http://www.ncbi.nlm.nih.gov/mapview/map_search.cgi?taxid=9031) was submitted to NCBI in 2006. In parallel with the above project Beijing Genomics Institute led an international team of scientists from China, USA, UK, Sweden, Netherlands and Germany to map extensive DNA sequence variation throughout the chicken genome by sampling DNA from domestic breeds (a male broiler (Cornish) from Roslin Institute, a female layer (White Leghorn) from the Swedish University of Agricultural Sciences, and a female Silkie from the China Agricultural University in Beijing). Using the Red Jungle Fowl genome sequence as a reference, 2.8 million non-redundant DNA sequence variants were identified (Wong et al. 2004). The small haplotype blocks detected in this study underscore the need for a larger number of SNPs to identify such identical-by-descent segments. Although these small blocks may require greater marker density and more recombinants to identify the causative haplotype, once the haplotype is found, successful resolution of the actual QTL alleles can be foreseen. The completion of the chicken genome sequence and SNP map has provided a framework for investigating polymorphisms of informative quantitative traits (Fadiel et al. 2005).

The bovine

The first draft of the bovine genome sequence was deposited into a free public database a few months later than the chicken genome sequence in October 2004. The second version of the bovine genome, Btau_2.0, which was a 6.2X whole genome shotgun assembly, was then released in NCBI. In August 2006, the third version, Btau_3.1, which was a 7.15X mixed assembly that combines whole genome shotgun sequence with BAC sequence, was released (http://www.ncbi.nlm.nih.gov/genome/guide/cow/) (Gao et al. 2007).
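Coverage figures such as 6.2X and 7.15X denote the average number of times each base is sequenced. As a rough guide to what such numbers mean, the sketch below computes coverage from read count, read length and genome size, and the expected fraction of the genome covered at least once under the idealized Lander-Waterman assumption of uniformly random reads. The function names are illustrative, and real assemblies deviate from this idealization because of repeats and cloning bias.

```python
import math


def coverage(n_reads, read_length, genome_size):
    """Average sequencing depth: total sequenced bases divided by genome size."""
    return n_reads * read_length / genome_size


def expected_fraction_covered(depth):
    """Expected fraction of bases covered at least once for random reads."""
    return 1.0 - math.exp(-depth)


if __name__ == "__main__":
    for depth in (0.5, 1.0, 6.2, 7.15):
        print(depth, round(expected_fraction_covered(depth), 4))
```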

The pig

The whole-genome sequencing for pig began in 2005. The “Sino-Danish Pig Genome Project” has published a pig genome sequence with <1X coverage (Wernersson et al. 2005). In the near future, the sequencing of the porcine genome will identify the gene markers for specific traits of body growth, fat deposition, reproduction and so on, assisting breeders in selecting pig stocks.

The genome sequence assemblies of farm animal species are now accessible through public domain databases, and further sequencing projects are in rapid progress. In addition, large collections of expressed sequences will aid in constructing annotated transcript maps for many economically important species. Thus, the breeding of farm animals is entering the post-genome era.

Proteomics

Notwithstanding the increase in the number of genome sequencing projects, it is clear that the genome sequence of an organism, although it represents a wealth of information, is a starting point rather than an end point in understanding biological function. Proteomics provides a snapshot of the proteins produced by a species using the technologies of large-scale protein separation and identification.

The aim of proteomics is to identify and localize proteins within a given organelle, cell, or organism, and unravel protein pathways in vivo (Michaud and Snyder 2002; Patterson and Aebersold 2003). There are two major areas: expression proteomics which is aimed at measuring the protein levels and functional proteomics which is aimed at characterizing protein activities and signaling pathways (Shevchenko et al. 1997; Pandey et al. 2000). Understanding protein functions depends on the identification of the interaction between the protein and its partners. The protein complex shows the biological function of the unknown protein and its partners (Gavin et al. 2002).

Two-dimensional gel electrophoresis is often used to fractionate the numerous proteins from a cell or tissue. The proteins of interest are then identified from the gels by mass spectrometry. Three strategies used in functional proteomic studies have been developed to identify the interacting proteins in stable complexes in vivo: ‘fishing for partners’ strategies, immunoprecipitation strategies and the TAP tag system (Hubner et al. 2005). Proteomics can also be used to identify post-translational modifications, alternative splicing and protein–protein interactions. High-throughput-interaction analysis has resulted. Recently Hamelin et al. (2006) used two-dimensional gel electrophoresis to investigate the effects of a QTL for muscle hypertrophy on sarcoplasmic protein expression in ovine muscles. In the Belgian Texel breed, the QTL for muscle hypertrophy is localized in the myostatin-encoding gene. Their results indicate that transferrin and alpha-1-antitrypsin may interact to reinforce myogenic proliferative signaling.

Strategies for QTL mapping

The status of QTL mapping and genome research in cattle (Rocha et al. 2002; Khatkar et al. 2004), pig (Rothschild 2003, 2004; Buske et al. 2006; Chen et al. 2007; Otto et al. 2007), and chicken (Abasht et al. 2006; de Koning et al. 2007) has been reviewed. Andersson and Georges (2004) provide a standard for measuring the development of livestock genomics in its role of helping to define the human genome through the genetic analysis of complex traits. The main goal of genome research in farm animals is to map and characterize genes that are causative for QTL. There are two main strategies for QTL mapping: association tests using candidate genes and genome scans based on linkage mapping in a cross population (Andersson 2001).

Candidate gene approach

The candidate gene approach studies the relationship between the trait of interest and known genes that may be involved in the physiological pathways underlying the trait (Andersson 2001). The implementation of a candidate gene approach consists of the following steps: (1) constructing or collecting a resource population, (2) recording phenotypes for target traits, (3) choosing candidate gene(s), (4) genotyping animals in the target population, and (5) analyzing the phenotypic and genotypic data (Kadarmideen et al. 2006). Although great progress has been achieved by using the candidate gene approach (MacLennan et al. 1990; Taylor et al. 1998; Liu and Lamont 2003; Wang et al. 2005; Guyonnet-Duperat et al. 2006), its limitations are obvious too. This approach is powerful only when the candidate gene is a true causative gene. Candidate gene tests must be interpreted with caution because spurious results can occur due to linkage disequilibrium with linked or non-linked causative genes or because the significance thresholds have not been adjusted properly when testing multiple candidate genes (Andersson 2001). The approach also requires prior knowledge of the physiology of the target trait, and this knowledge is not always available. Sometimes there are many candidate genes for the trait and evaluating all of them is very time consuming. Furthermore, some genes that are not part of the known physiological pathways may contribute to the trait under investigation. Before genome sequences were available or could be applied, the selection of candidate genes was based upon cross-species gene comparisons, and analyzing and comparing genes from different species was laborious. The genome sequence, and especially the SNP map, should solve many issues in candidate gene selection.
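Step (5) above and the warning about unadjusted significance thresholds can be sketched in a few lines. The example below is a minimal illustration, not the analysis used in any of the cited studies: it regresses phenotype on the number of copies of one allele (coded 0, 1, 2) and shows a Bonferroni adjustment for testing several candidate genes. The function names, the additive coding and the toy data are assumptions made for illustration; real analyses typically also fit fixed effects such as sex, batch and family.

```python
def allele_substitution_effect(genotypes, phenotypes):
    """Least-squares slope of phenotype on allele count, and the fraction
    of phenotypic variance explained (R squared)."""
    n = len(genotypes)
    if n < 3 or n != len(phenotypes):
        raise ValueError("need matched genotype and phenotype records")
    g_mean = sum(genotypes) / n
    y_mean = sum(phenotypes) / n
    sxx = sum((g - g_mean) ** 2 for g in genotypes)
    syy = sum((y - y_mean) ** 2 for y in phenotypes)
    sxy = sum((g - g_mean) * (y - y_mean) for g, y in zip(genotypes, phenotypes))
    if sxx == 0.0:
        raise ValueError("marker is monomorphic in this sample")
    slope = sxy / sxx
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0.0 else 0.0
    return slope, r_squared


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


if __name__ == "__main__":
    geno = [0, 0, 1, 1, 2, 2]                    # hypothetical allele counts
    pheno = [10.0, 10.0, 12.0, 12.0, 14.0, 14.0]  # hypothetical trait values
    print(allele_substitution_effect(geno, pheno))
    print(bonferroni_threshold(0.05, 10))
```

The Bonferroni correction is conservative when candidate genes are correlated; it is shown here only because it is the simplest adjustment.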

Genome scan approach

The genome scan approach studies the relationship between a trait and markers selected across the genome to identify chromosomal locations associated with the trait (Andersson 2001), and can find the map location of a trait locus with a major effect. It is carried out by the following steps: (1) designing and constructing a resource population, (2) recording phenotypes of the resource population, (3) choosing genetic markers, (4) genotyping the population for the selected markers, (5) constructing linkage maps, and (6) statistical analysis of the phenotypic and genotypic data derived from the resource population (Kadarmideen et al. 2006). A resource population is a population generated for particular research purposes with phenotypic information and sufficient DNA supply for genotyping, for example an intercross between two divergent breeds of farm animal or a population containing particularly interesting phenotypic data. Because an intercross between two divergent populations of farm animals is a more powerful design for QTL mapping, it is used in almost all resource populations, such as the wild boar and Large White pigs (Knott et al. 1998; Jeon et al. 1999; Nii et al. 2005, 2006), Asian and European breeds of pig (Jungerius et al. 2004; Stratil et al. 2006), Bos taurus (Angus) and Bos indicus (Brahman) cattle (Jeon et al. 2003), the broiler and the silkie (Deng et al. 2001; Gao et al. 2005), and the jungle fowl and the layer (Wright et al. 1992).

Using multiple linear regression, efficient and robust methods to map QTL in simple and complex pedigrees were developed (Haley and Knott 1992; Haley et al. 1994; Visscher et al. 1996; Seaton et al. 2002). User-friendly web-based interfaces were then developed, and these web-based mapping tools simplified the practical application of QTL interval mapping, so that even individuals with little knowledge of statistical mapping were able to carry out the analysis. The web-based interface also enables more efficient international cooperation.
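The idea behind regression interval mapping can be sketched for the simplest case. The example below is a simplified backcross version written for illustration, not the implementation of the cited methods or web tools: at each putative position between two flanking markers it computes the probability that the QTL is heterozygous given the marker genotypes (assuming no interference, Haldane's map function), regresses phenotype on that probability, and keeps the position with the smallest residual sum of squares. All names and the toy records are hypothetical.

```python
import math


def haldane_r(cm):
    """Recombination fraction for a map distance in cM (Haldane)."""
    return 0.5 * (1.0 - math.exp(-2.0 * cm / 100.0))


def qtl_prob(m1, m2, r1, r2):
    """P(QTL heterozygous | flanking marker genotypes) in a backcross.

    m1, m2 are coded 1 (heterozygous) or 0 (homozygous); r1 and r2 are the
    recombination fractions between the QTL and the left and right marker.
    """
    r = r1 + r2 - 2.0 * r1 * r2  # marker-to-marker recombination fraction
    if m1 == 1 and m2 == 1:
        return (1.0 - r1) * (1.0 - r2) / (1.0 - r)
    if m1 == 1 and m2 == 0:
        return (1.0 - r1) * r2 / r
    if m1 == 0 and m2 == 1:
        return r1 * (1.0 - r2) / r
    return r1 * r2 / (1.0 - r)


def residual_ss(x, y):
    """Residual sum of squares from simple linear regression of y on x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((a - x_mean) ** 2 for a in x)
    syy = sum((b - y_mean) ** 2 for b in y)
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    if sxx == 0.0:
        return syy
    return syy - sxy * sxy / sxx


def scan_interval(records, interval_cm, step_cm=1.0):
    """records: list of (m1, m2, phenotype). Returns (best_position_cm, rss)."""
    if interval_cm <= 0.0 or step_cm <= 0.0:
        raise ValueError("interval and step must be positive")
    phenotypes = [rec[2] for rec in records]
    best = None
    for i in range(int(interval_cm / step_cm) + 1):
        pos = i * step_cm
        r1 = haldane_r(pos)
        r2 = haldane_r(interval_cm - pos)
        probs = [qtl_prob(m1, m2, r1, r2) for m1, m2, _ in records]
        rss = residual_ss(probs, phenotypes)
        if best is None or rss < best[1]:
            best = (pos, rss)
    return best


if __name__ == "__main__":
    # Hypothetical data in which the QTL sits exactly at the left marker.
    data = [(1, 1, 12.0), (1, 1, 12.0), (1, 0, 12.0),
            (0, 1, 10.0), (0, 0, 10.0), (0, 0, 10.0)]
    print(scan_interval(data, interval_cm=20.0))
```

In practice the residual sum of squares at each position is converted to an F ratio or likelihood ratio and compared with a genome-wide threshold, which this sketch omits.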

Through a genome scan, many QTL can be located in farm animals, and it can provide a useful bridge to link genome information with phenotype. The Animal Quantitative Trait Loci (QTL) database (AnimalQTLdb), which contains all publicly available QTL data on farm animal species from the past decade (Hu et al. 2007), comprises QTL location (chromosome, location, and location span), flanking markers, peak markers, test statistics, QTL effects and traits. It is easy to locate QTL and make comparisons among the species within this database. As of May 2007, there were 1,675 QTL from 110 publications representing 281 different traits for pig, 846 QTL from 55 publications representing 91 different traits for cattle, and 657 QTL from 45 publications representing 112 different traits for chicken (http://www.animalgenome.org/QTLdb/). Although a genome scan can identify QTL for the target trait with full genome coverage, it is still difficult to detect loci that have smaller effects and hence rarely reach the more stringent significance thresholds.

Epistatic analysis

The traditional models used to detect QTL usually account for additive and dominance effects but do not account for epistatic or interaction effects. Theoretically, by taking epistasis into account, it should be possible to identify more QTL. The interaction between different genes has been found to play an important role in Arabidopsis (Kroymann and Mitchell-Olds 2005) and yeast (Segre et al. 2005), but the same analysis in farm animals has been limited. However, Carlborg et al. (2003, 2004) used an epistatic model to reanalyze growth traits in a chicken population that had initially been analyzed with traditional models. Additional QTL could be detected with the epistatic model; the individual QTL explained the phenotypic variance better when the interaction effects were fitted; and the epistatic interactions played an important role in the chicken’s early growth rate. Epistatic QTL mapping could help to detect the underlying quantitative trait genes and elucidate the genetic mechanism of quantitative traits.
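What an epistatic model adds to an additive one can be illustrated with the simplest case of two loci, each with two genotype classes (as in a backcross). If the loci act additively, the effect of locus A is the same in both classes of locus B; the interaction contrast below measures the departure from that. This is a minimal sketch with hypothetical names and data, not the model fitted by Carlborg et al.

```python
def two_locus_means(records):
    """Mean phenotype for each two-locus genotype class.

    records: iterable of (locus_a, locus_b, phenotype), genotypes coded 0 or 1.
    """
    sums = {}
    for a, b, y in records:
        total, count = sums.get((a, b), (0.0, 0))
        sums[(a, b)] = (total + y, count + 1)
    return {cell: total / count for cell, (total, count) in sums.items()}


def epistatic_contrast(means):
    """Effect of locus B when A = 1 minus its effect when A = 0.

    Zero under a purely additive model; non-zero indicates interaction.
    """
    for cell in ((0, 0), (0, 1), (1, 0), (1, 1)):
        if cell not in means:
            raise ValueError("genotype class %s is missing" % (cell,))
    return (means[(1, 1)] - means[(1, 0)]) - (means[(0, 1)] - means[(0, 0)])


if __name__ == "__main__":
    additive = [(0, 0, 10.0), (0, 1, 12.0), (1, 0, 13.0), (1, 1, 15.0)]
    epistatic = [(0, 0, 10.0), (0, 1, 12.0), (1, 0, 13.0), (1, 1, 18.0)]
    print(epistatic_contrast(two_locus_means(additive)))
    print(epistatic_contrast(two_locus_means(epistatic)))
```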

Fine mapping and positional candidate cloning

As mentioned above, QTL mapping by genome analysis generally begins with the collection of genotype and phenotype data from a segregating population, followed by statistical analysis to find markers with allelic variation related to the target trait. Most reported QTL in animals have large confidence intervals that may harbor hundreds of genes, which is hardly sufficient to pinpoint the target genes. The goal of fine mapping is to map a QTL to a narrow chromosome region so as to physically identify and positionally clone the causative gene. It is a long way from the initial QTL detection to the identification of genes affecting the phenotypic variation, and it is much more difficult to deal with the genes underlying polygenic traits than monogenic traits. Larger pedigrees and more sophisticated statistical analysis are needed for polygenic traits. There are several options for fine-mapping analysis, including high density linkage analysis, linkage disequilibrium mapping, selective backcrossing, producing advanced intercross lines and identical-by-descent mapping. Some of them can be combined to dissect the functional gene and the causative variation with the help of bioinformatics and statistics.

The detection of selective sweeps is also used to identify chromosomal regions affecting the phenotype that have been under strong selection by human intervention (Andersson and Georges 2004). A selective sweep results in the elimination of surrounding variation in regions linked to a recently fixed beneficial mutation. The identification of the causative mutation for the insulin-like growth factor 2 (IGF2) that causes a QTL in pigs is an excellent example of an application using these approaches (Van Laere et al. 2003). More recently, Sutter et al. (2007) employed a selective sweep as the main approach to show that the single IGF1 allele is a major determinant of small size in dogs.
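Because a sweep removes variation around the selected site, one simple way to look for it is to scan for windows of unusually low diversity. The sketch below computes expected heterozygosity, 2p(1 − p), at each biallelic marker and returns the window with the lowest mean. It is a toy illustration with hypothetical names and allele frequencies; it is not the procedure used in the cited studies, and low diversity alone can also result from demography or low mutation rate.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity at a biallelic locus with allele frequency p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must be in [0, 1]")
    return 2.0 * p * (1.0 - p)


def lowest_diversity_window(allele_freqs, window):
    """Return (start_index, mean_heterozygosity) of the least diverse window.

    allele_freqs: allele frequencies of consecutive markers along a chromosome.
    """
    if window < 1 or window > len(allele_freqs):
        raise ValueError("window must be between 1 and the number of markers")
    het = [expected_heterozygosity(p) for p in allele_freqs]
    best_start, best_mean = 0, None
    for start in range(len(het) - window + 1):
        mean = sum(het[start:start + window]) / window
        if best_mean is None or mean < best_mean:
            best_start, best_mean = start, mean
    return best_start, best_mean


if __name__ == "__main__":
    freqs = [0.5, 0.4, 0.5, 0.99, 1.0, 0.98, 0.5, 0.6]  # hypothetical values
    print(lowest_diversity_window(freqs, window=3))
```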

In positional candidate cloning, information on QTL map location from the genome scan strategy is combined with information on gene function to identify candidate genes. In farm animals such identification often relies heavily on the exploitation of comparative data. The candidate gene approach will become even more powerful with the completion of the genome sequence and the availability of informative databases on gene function and gene expression patterns. The identification of the RN gene was the classical positional cloning study in farm animals. The RN gene was mapped to SSC 15q2.4–2.5 between the flanking markers SW2053 and SW936 (Milan et al. 1998, 2000). To clone this gene, linkage mapping, linkage disequilibrium mapping, radiation hybrid mapping, BAC contig construction, BAC sequencing and bioinformatics analysis were used, eventually leading to the identification of the causative missense mutation in the PRKAG3 gene. In 2000 the RN phenotype was found to be caused by an R225Q mutation in the PRKAG3 gene, which encodes a muscle-specific isoform of the regulatory subunit of adenosine monophosphate activated protein kinase (AMPK) (Milan et al. 2000). The distinct phenotype of the RN mutation indicates that PRKAG3 plays a key role in the regulation of energy metabolism in skeletal muscle. Other mutations in the PRKAG3 gene were found to be associated with meat quality in the pork loin (Lindahl et al. 2004a, b). Recently a comparative proteome study of the RN gene effect showed that several enzymes of the glycogen storage pathways were differentially expressed, and the integrated data from this proteome study indicate that regulation of glucose transport was severely affected in mutant animals (Hedegaard et al. 2004). Further studies of these mutations are of great interest in explaining molecular mechanisms that influence drip loss in porcine meat (Otto et al. 2007).

The RN phenotype is associated with elevated glycogen content in the sarcoplasmic as well as in the lysosomal compartment of glycolytic muscle cells. The RN gene has no effect on early post mortem pH values, but results in a lower pH-24 h value, which again is associated with a higher reflectance (lighter meat) and inferior WHC (water-holding capacity) (Le Roy et al. 1990). In other words, the positive effect of the RN gene should not be neglected. The RN allele is another example of a mutation that has probably increased in frequency because of selection for meat content in pigs. It occurs at a high frequency only in the Hampshire breed and increases glycogen content in muscle by ~70% (Milan et al. 2000).

Genome wide association analysis

Genome-wide association analysis (GWA) is an approach that makes use of markers across the whole genome to find genetic variation associated with a particular phenotype. It assumes that the causative mutation for the phenotype is in linkage disequilibrium (LD) with flanking polymorphic markers in existing diverse populations even after many generations of recombination. Because such LD is weak and extends over only short chromosome segments, very dense markers are required to capture all of these segments; fortunately, the many available SNPs make GWA possible. Genome-wide association studies are particularly useful for finding genetic variants that contribute to complex traits. Although the technologies to perform genome-wide association analysis have been available only for a short period, several studies using this approach have already been published (Duerr et al. 2006; Saxena et al. 2007; Scott et al. 2007).
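The linkage disequilibrium on which GWA relies is usually quantified with the statistics D, D' and r². As a sketch, the function below computes them for two biallelic loci from the four haplotype counts. The function name and the counts are hypothetical, and the example assumes phased haplotypes are available, which in practice often have to be inferred from genotypes.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D_prime, r_squared) from counts of the four haplotypes."""
    n = n_AB + n_Ab + n_aB + n_ab
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n
    p_B = (n_AB + n_aB) / n
    if p_A in (0.0, 1.0) or p_B in (0.0, 1.0):
        raise ValueError("both loci must be polymorphic")
    d = p_AB - p_A * p_B
    # Largest magnitude D can take given the allele frequencies.
    if d >= 0.0:
        d_max = min(p_A * (1.0 - p_B), (1.0 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1.0 - p_A) * (1.0 - p_B))
    d_prime = abs(d) / d_max
    r_squared = d * d / (p_A * (1.0 - p_A) * p_B * (1.0 - p_B))
    return d, d_prime, r_squared


if __name__ == "__main__":
    print(ld_stats(50, 0, 0, 50))    # complete LD
    print(ld_stats(25, 25, 25, 25))  # linkage equilibrium
```

A marker can only tag a causative mutation well when r² between them is high, which is why weak LD translates into a need for dense SNP panels.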

Genome-wide association (GWA) analysis has been used in many individual studies, and it could also open new frontiers in our understanding and treatment of complex traits (Hirschhorn and Daly 2005), because the many generations of recombination in existing populations enable fine mapping of QTL using SNPs as very dense markers. Andersson and his colleagues used the GWA strategy to identify functional genes in dogs, which is the first report using this method in the animal genetics field (Karlsson et al. 2007). It is impressive that sampling only a small number (20) of dog breeds led to such strong results. GWA is being widely used to identify genes controlling drug resistance in humans and model animals. The current results indicate the great usefulness of the GWA strategy.

Validation of QTN through RNAi

For QTL mapping, it is essential to verify the effect of candidate gene function and to work out the detailed molecular basis of quantitative traits. Gene function can now be addressed at every level, from the engineering of DNA alterations in the germ-line using homologous recombination, to inhibiting gene expression using RNAi (RNA interference) technologies, to inhibiting protein function using chemical genetic approaches (Schuelke et al. 2004). RNAi is an evolutionarily highly conserved process of post-transcriptional gene silencing (PTGS) by which double stranded RNA (dsRNA), when introduced into a cell, causes sequence-specific degradation of homologous mRNA sequences. It was first discovered in 1998 by Fire and Mello (Fire et al. 1998) in the nematode worm Caenorhabditis elegans and later found in a wide variety of organisms, including mammals.

Since the demonstration that synthetic siRNAs could be used in mammalian cells for gene silencing (Hutvagner et al. 2001; Lagos-Quintana et al. 2001; Tuschl and Borkhardt 2002), siRNA-induced RNAi has become a key strategy for defining and analysis of gene function in many research programs. The rapid adoption of RNAi technology resulted primarily from the ease of use of siRNAs and the great need for a method to reduce the expression of individual genes in mammalian cells in order to establish a link between gene identity and gene function. Since siRNA-mediated RNAi results in knockdown of gene expression, the observed phenotype is dependent upon how much remaining gene expression is required to give measurable function in the assay used.

Predictions about mammalian gene function are sometimes based on methods for identifying differentially expressed genes or on the known function of homologous genes in model organisms like Drosophila, C. elegans, and S. cerevisiae. RNA interference technologies allow scientists to knock down expression of a targeted gene and observe the effects of partial to full loss of function (Butler et al. 2005). With the RNAi method gene function studies can be executed not only by gain-of-function but also by loss-of-function approaches and therefore it represents a new tool for studying functional genomics.

eQTL mapping

The availability of complete, or almost complete, genome sequences for a wide variety of organisms, combined with the emerging technologies for genome scale assays of transcriptional and translational activity and sequence variation, has revolutionized biological research in the disciplines ranging from phylogenetics to biomedicine.

With the development of microarray techniques it has become possible to perform genome-wide expression profiling. Jansen and Nap (2001) proposed a merger of genomics and genetics into “genetical genomics”, which utilizes both the genetic variation and the transcriptional variation of the individuals in a segregating population. The term expression quantitative trait loci (eQTL) is used to describe loci whose variation affects transcript abundance. Thereafter, some quantitative geneticists focused on the heritability of transcription and the detection of eQTL (Gibson and Weir 2005).

The basic experimental design of an eQTL study is identical to that of classical QTL mapping. eQTL have been identified in yeast (Brem et al. 2002), mice (Schadt et al. 2003; Morley et al. 2004), rats (Hubner et al. 2005), and humans (Schadt et al. 2003; Morley et al. 2004). Major-effect eQTL are prevalent in these studies, and overall the eQTL could account for 25–50% of the transcriptional variation. Although a major-effect eQTL explains an important part of the transcriptional variation, the remainder must be accounted for by loci that go undetected because of current limitations in experimental design or an inadequate amount of data (Gibson and Weir 2005). Brem and Kruglyak (2005) found that the variance in transcription in yeast was better explained by polygenic models than by monogenic models. Having been applied in model species and humans, genetical genomics has great potential in farm animals, in which many QTL have already been mapped. Combining expression studies with fine mapping has been proposed to facilitate the implementation of genetical genomics in farm animals (de Koning et al. 2007).
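Because the design mirrors classical QTL mapping, the proportion of transcriptional variation attributed to a locus can be illustrated with a single-marker regression of transcript abundance on allele dosage. The following is a minimal sketch, not the method of any study cited above: it assumes genotypes coded as 0, 1 or 2 copies of one allele, and the marker names and expression values are hypothetical toy data.

```python
from statistics import mean


def variance_explained(genotypes, expression):
    """R^2 of a simple linear regression of expression on allele dosage."""
    if len(genotypes) != len(expression) or len(genotypes) < 3:
        raise ValueError("need two equal-length samples of at least 3 individuals")
    g_bar, e_bar = mean(genotypes), mean(expression)
    sxx = sum((g - g_bar) ** 2 for g in genotypes)
    syy = sum((e - e_bar) ** 2 for e in expression)
    sxy = sum((g - g_bar) * (e - e_bar) for g, e in zip(genotypes, expression))
    if sxx == 0 or syy == 0:
        # Monomorphic marker or constant transcript: nothing to explain.
        return 0.0
    return (sxy ** 2) / (sxx * syy)


def scan_markers(marker_genotypes, expression):
    """Return (marker name, R^2) for the marker explaining the most variance."""
    scores = {
        name: variance_explained(dosages, expression)
        for name, dosages in marker_genotypes.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical toy data: two markers typed in six individuals (dosage 0/1/2)
# and the abundance of one transcript in the same individuals.
markers = {
    "m1": [0, 0, 1, 1, 2, 2],
    "m2": [0, 1, 2, 0, 1, 2],
}
transcript = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]

best_marker, r2 = scan_markers(markers, transcript)
print(best_marker, round(r2, 3))
```

In a real scan the same calculation would be repeated for every transcript and every marker, with interval mapping and correction for multiple testing, all of which this sketch omits.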

Li et al. (2006) studied heritable differences in the environment-induced plastic response of gene expression (plasticity quantitative trait loci, pQTL) at temperatures of 16 and 24°C in Caenorhabditis elegans. The results indicated that heritable differences in plastic responses of gene expression are largely regulated in trans, although several studies have also suggested that cis-acting eQTL have larger effects on transcription. All these studies demonstrate the potential of genetical genomics to determine the molecular basis of phenotypic variation.
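The cis/trans distinction is usually made operational by comparing the map position of the eQTL with that of the gene whose transcript it affects. The sketch below illustrates one such rule. The 1 Mb window is an assumption chosen for illustration (the threshold varies between studies), and the `Locus` type and the example coordinates are hypothetical.

```python
from typing import NamedTuple


class Locus(NamedTuple):
    chromosome: str
    position_bp: int


def classify_eqtl(marker, gene, cis_window_bp=1_000_000):
    """Label an eQTL 'cis' if it maps close to the gene it affects, else 'trans'.

    cis_window_bp is an illustrative assumption, not a standard value.
    """
    same_chromosome = marker.chromosome == gene.chromosome
    if same_chromosome and abs(marker.position_bp - gene.position_bp) <= cis_window_bp:
        return "cis"
    return "trans"


# Hypothetical coordinates for one gene and three eQTL markers.
gene = Locus("2", 5_000_000)
print(classify_eqtl(Locus("2", 5_400_000), gene))   # nearby on the same chromosome
print(classify_eqtl(Locus("2", 60_000_000), gene))  # same chromosome, far away
print(classify_eqtl(Locus("7", 5_000_000), gene))   # different chromosome
```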

Epigenomics and QTL mapping

The classical view is that most traits are encoded and transmitted by the DNA (or gene) sequence, but more and more scientists now accept that a separate code outside the DNA sequence, known as the epigenetic code, also has dramatic effects on the phenotype of organisms (Bjornsson et al. 2004). As an “unseen genome” beyond DNA, epigenetic information plays an essential role in regulating the function of the eukaryotic genome without changing the DNA sequence (Gibbs 2003; Jaenisch and Bird 2003; Bjornsson et al. 2004; Egger et al. 2004). Unlike traditional genetic information, which is written in the sequence of the DNA molecule, epigenetic information is encoded by several types of covalent chemical modification of chromatin (DNA and histones) and is meiotically and mitotically heritable during cell division (Egger et al. 2004). Currently, epigenetic modifications fall mainly into three classes: DNA methylation, histone modifications (methylation, acetylation, phosphorylation, ubiquitinylation and ADP-ribosylation) and chromatin remodeling. In contrast to typical DNA sequence variation, epigenetic variation is a more dynamic and sometimes reversible process that is more difficult to analyze, and its impact on gene expression and regulation has been assessed only during the past two decades (Hsieh 2000; Gibbs 2003; Waterland and Jirtle 2004). Accumulating evidence now indicates that epigenetic mechanisms are involved in diverse biological processes, including maintenance of genome stability, genomic imprinting, X chromosome inactivation, transcriptional regulation and carcinogenesis. Epigenetic mechanisms are becoming an exciting field and the focus of extensive interest in more and more research areas (Gibbs 2003; Jaenisch and Bird 2003; Egger et al. 2004; Murrell et al. 2005).

Now that the epigenetic code is recognized as contributing considerably to variation in phenotypic traits in farm animals, research on particular traits is complicated by the interaction of multiple genetic and epigenetic factors. Four types of genetic variation, epigenetic variation and their interaction, each with the potential to influence a given trait, are well documented in humans and other model organisms (Bjornsson et al. 2004): (1) DNA sequence variation (insertions, deletions, conversions and so on) that contributes to phenotypic traits; (2) epigenetic variation directly associated with certain traits, such as the alterations in methylation, imprinting and chromatin that appear to be associated with cancers; (3) DNA sequence variation that alters the epigenotype, such as mutations in DNA methyltransferase genes and in CG dinucleotides; (4) epigenetic variation that modulates or modifies the penetrance of sequence variation. In the fourth case, expression of the variants is epigenetically controlled, so the effect of the DNA sequence variation depends strongly on the related epigenotype. An extreme example is imprinted genes: the effect of a loss-of-function mutation will not be observed if it occurs in the allele that is imprinted (epigenetically silenced). Of the four types of variation underlying traits, DNA sequence variation is the most thoroughly investigated in farm animal populations, while the analysis of epigenetic factors contributing to the performance of specific traits is still in its infancy.
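The imprinting example in the fourth case can be made concrete by counting how many functional alleles are actually expressed at a locus. This is a deliberately simplified sketch: it assumes complete silencing of the imprinted allele and an all-or-nothing loss-of-function mutation, and the function name and arguments are hypothetical.

```python
def expressed_functional_alleles(maternal_functional, paternal_functional,
                                 silenced_parent=None):
    """Count functional alleles that are expressed at one locus.

    maternal_functional / paternal_functional: False for a loss-of-function allele.
    silenced_parent: "maternal" or "paternal" for an imprinted locus,
    None for a locus that is expressed from both alleles.
    """
    if silenced_parent not in (None, "maternal", "paternal"):
        raise ValueError("silenced_parent must be None, 'maternal' or 'paternal'")
    count = 0
    if maternal_functional and silenced_parent != "maternal":
        count += 1
    if paternal_functional and silenced_parent != "paternal":
        count += 1
    return count


# Locus whose maternal allele is imprinted (silenced): only the paternal copy counts.
wild_type = expressed_functional_alleles(True, True, silenced_parent="maternal")
mutant_on_silenced = expressed_functional_alleles(False, True, silenced_parent="maternal")
mutant_on_active = expressed_functional_alleles(True, False, silenced_parent="maternal")
print(wild_type, mutant_on_silenced, mutant_on_active)
```

Under these assumptions a mutation on the silenced allele leaves the expressed dosage identical to wild type, whereas the same mutation on the active allele removes all expression, which is why the parent of origin has to be tracked when mapping such loci.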

Several advances have been made in recent years in exploring the epigenetic basis of phenotypic traits in farm animals. An excellent example of an epigenetically regulated trait is the callipyge (beautiful buttocks) phenotype in sheep (Freking et al. 2002; Van Laere et al. 2003): the unique mode of inheritance of callipyge is referred to as polar overdominance and is caused by a complex interaction between a cis-acting mutation in a putative long-range control element (LRCE) and gene imprinting. The effect of the mutation (on the expression levels of genes in the DLK1-GTL2 imprinted domain) depends on the epigenetic status (imprinting) of the related genes. In the callipyge phenotype, the epigenetic modification is crucial for the penetrance of the sequence variation, and to a certain extent the phenotype will not be present without epigenetic mechanisms. Another representative example is the IGF2 gene: the imprinted IGF2 locus was shown to have a major effect on skeletal and cardiac muscle mass and on fat deposition in pigs (Jeon et al. 1999; Nezer et al. 1999; Van Laere et al. 2003). Subsequently, a genome-wide scan for body composition in pigs also revealed an important role for imprinting: of the five identified QTL affecting body composition, four were imprinted (de Koning et al. 2000). In addition, the coat-color phenotype in mice is an example of epigenetic variation (DNA methylation) that leads directly to a certain phenotype (Whitelaw and Martin 2001). Based on these observations, the importance of epigenetic factors in regulating QTL appears to have long been underestimated. To gain broader insight into the molecular basis of phenotypic diversity in domestic animals, an integrated epigenetic and genetic approach is necessary.

Epigenetic variation, which exists outside the DNA sequence, can help to explain the quantitative nature of multigenic phenotypic traits. In humans, an international Human Epigenome Project (HEP) has been proposed. The DNA methylation profile of chromosomes 6, 20 and 22 has been released recently (Rakyan et al. 2004; Murrell et al. 2005; Eckhardt et al. 2006), which further paves the way towards the systematic analysis of the epigenetic code that underlies phenotypic traits in farm animals.

miRNA and QTL mapping

MicroRNAs (miRNAs) are a class of endogenous ~22-nucleotide non-coding RNA molecules involved in the degradation and translational regulation of specific messenger RNA targets (Lagos-Quintana et al. 2001; Lau et al. 2001; Lee and Ambros 2001; Slof-Op ‘t Landt et al. 2005). miRNAs are transcribed as mono- or polycistronic, long primary precursor transcripts (pri-miRNAs) that are processed into ~70-nt hairpin pre-miRNAs by the RNase Drosha. Subsequently, the pre-miRNAs are exported to the cytoplasm by Exportin-5 (Liu and Lamont 2003; Bohnsack et al. 2004), where they are cleaved by another RNase, known as Dicer, into ~22-nt duplexes (Bernstein et al. 2001; Hutvagner et al. 2001; Ketting et al. 2001; Knight and Bass 2001). As part of a functional complex, the mature miRNA then binds to the 3′-UTR of the target mRNA to block protein translation or to degrade the mRNA (Sun et al. 1998; Khvorova et al. 2003). The first miRNAs discovered were lin-4 and let-7, identified by forward genetics, which control the timing of larval development in Caenorhabditis elegans (Lee et al. 1993; Reinhart et al. 2000).

Identification of causal genetic variants underlying complex traits is a major goal in genetic research. Meatiness in sheep and cattle is a good example of an economically important trait for which the location of quantitative trait loci (QTL) has been investigated. miRNA is a micromanager of gene expression that has a potentially widespread influence on metazoan gene control (Slof-Op ‘t Landt et al. 2005). Georges and colleagues have now determined the molecular mechanism of muscularity in Texel sheep (Clop et al. 2006). Their results showed that the impressive muscling was the result of a point mutation that created an illegitimate miRNA target site in the 3′-UTR of myostatin (MSTN), resulting in down-regulation of MSTN expression.

The search for the cause of the large muscles of Texel sheep began with a whole-genome scan in a Romanov × Texel F2 population. A quantitative trait locus with a major effect on muscle mass was mapped to chromosome 2 and subsequently fine-mapped to a region that includes the myostatin (GDF8) gene. Because GDF8 loss-of-function mutations lead to a “double-muscle” phenotype in mice, cattle, and humans (Lee and McPherron 1999; Tobin and Celeste 2005), the gene seemed a perfect candidate to pursue. Although no difference was found at the DNA level and RNA level, a mutation (G–A) in the 3′-UTR turned out to create an illegitimate miRNA target site for miR-1 and miR-206, which are abundantly expressed in mammalian muscle. The miRNA-mediated translational inhibition was tested directly through the interaction between mutant GDF8 transcripts and miR-1 and miR-206 (Clop et al. 2006). The results indicate that the G–A substitution, which creates the miR-1 and miR-206 target site, is responsible for the increased muscle mass.
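The way a single substitution can create a miRNA target site can be sketched as a search for a seed match in the 3′-UTR. The example below assumes the common simplification that a site is a perfect match to the reverse complement of miRNA nucleotides 2 to 8 (the seed). The miRNA and UTR sequences are invented toy sequences, not those of miR-1, miR-206 or GDF8, and real target prediction also weighs site context and conservation.

```python
# RNA complement table used to derive the mRNA site that pairs with a miRNA seed.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_target_site(mirna):
    """mRNA sequence (5'->3') complementary to miRNA positions 2-8."""
    seed = mirna[1:8]  # positions 2..8 in 1-based numbering
    return seed.translate(COMPLEMENT)[::-1]


def find_seed_sites(utr, mirna):
    """0-based start positions of perfect seed matches in a 3'-UTR (RNA alphabet)."""
    site = seed_target_site(mirna)
    return [
        start
        for start in range(len(utr) - len(site) + 1)
        if utr[start:start + len(site)] == site
    ]


# Hypothetical toy sequences; the two UTR fragments differ by one G-to-A change.
toy_mirna = "UACGGAUCUAAGCUUAGCAUGC"
wild_utr = "CUUAGGUCCGUAACU"
mutant_utr = "CUUAGAUCCGUAACU"

print(seed_target_site(toy_mirna))
print(find_seed_sites(wild_utr, toy_mirna))
print(find_seed_sites(mutant_utr, toy_mirna))
```

In this toy case the wild-type fragment has no seed match and the mutant fragment gains one, which mirrors the logic, though not the actual sequences, of the Texel mutation described above.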

This work shows that SNPs in miRNA target sites may be one type of sequence variant that underlies QTL effects, and indicates that polymorphic miRNA-mediated regulation of gene expression could be universal in mammals. In addition, the identification of a novel class of regulatory mutations underlying QTL is an important contribution to understanding the heritability of complex traits, and it gives strong evidence for possible epistatic interactions between polymorphisms in miRNA genes and their associated targets. Thus, SNPs in miRNA target sites may lead to heritable variation in gene expression. For example, 5000 SNPs were identified by comparing human and chimpanzee sequences, of which about one-tenth had a high probability of affecting gene regulation, and hence phenotypic variation, through their interaction with miRNA (Clop et al. 2006) (data at: http://www.patrocles.org/).

PolymiRTS is another database in which polymorphisms in microRNA target sites (PolymiRTSs) are linked with complex traits (Bao et al. 2007). The database collects naturally occurring DNA variants in putative microRNA target sites. It is a resource for studying PolymiRTSs and their implications for phenotypic variation: it integrates sequence polymorphism, phenotype and expression microarray data, and characterizes PolymiRTSs as potential candidates responsible for QTL effects.

Perspective

Multiple approaches should now be employed to elucidate the genetic mechanisms of QTL in farm animals, because the most interesting traits may be regulated at many different levels of the biological system (Fig. 1). One should adopt an integrated, systematic approach to assess, refine, and extend the understanding of the molecular pathways underlying the form and development of a complex trait. As a result, the concept of integration is gaining importance. An integrated, systematic view of biology will help to give a deeper understanding of complex traits such as growth, reproduction and disease resistance.

Fig. 1 Genomics-based approaches for enhancing the dissection of the genetic base of phenotype from an integrative systematic view

Advances in molecular genetics research have led to the identification of genes, or of markers associated with genes, that affect important traits. Functional genomics approaches are helping to reveal the molecular basis of QTL and will give further insight into the biological components involved. Some candidate genes associated with economic traits in farm animals, such as the HAL, MC4R, RN, PRKAG3 and CAST genes, are now used in marker-assisted selection (MAS) systems. As a supplementary tool for evaluating populations or animals for breeding, genomic technologies have brought more opportunities to enhance genetic improvement programs in farm animals through marker-assisted selection, and to better predict the phenotype that a particular genotype will produce, which is a primary goal of genomics-based breeding.