The Contribution of Low-Frequency and Rare Coding Variation to Susceptibility to Type 2 Diabetes

  • Jason Flannick
Open Access
Genetics (AP Morris, Section Editor)
Part of the following topical collections:
  1. Topical Collection on Genetics

Abstract

Purpose of Review

Soon after the first genome-wide association study (GWAS) for type 2 diabetes (T2D) was published, it was hypothesized that rare and low-frequency variants might explain a substantial proportion of disease risk. Rare coding variants in particular were emphasized given their large expected role in disease. This review summarizes the extent to which recent T2D genetic studies provide evidence for or against this hypothesis.

Recent Findings

Following a comprehensive study of T2D genetic architecture using three sequencing and genotyping technologies, four even larger studies have provided a yet higher resolution view of the role of rare and low-frequency coding variation in T2D susceptibility.

Summary

Empirical evidence strongly suggests that common regulatory variants are the dominant contributor to T2D heritability. However, rare coding variants may nonetheless be pervasive across T2D-relevant genes. A strategy using common variants to map disease genes, and rare coding variants to link molecular gene perturbations to cellular and phenotypic effects, may be an effective means to investigate T2D pathogenesis and potential new therapies.

Keywords

Rare variants; Coding variants; Exome sequencing; GWAS; RVAS; Genetic architecture

Introduction

Genetic studies of complex diseases are largely motivated by two goals: to understand the heritable risk factors for disease in the population, and to identify biological processes relevant to disease pathogenesis [1]. The first goal seeks to quantify the contribution of different classes of genetic variation to disease heritability [2]. The second seeks to identify genetic “experiments of nature” that link genes or pathways to disease risk and potentially suggest new therapeutic strategies [3].

Coding variants have long been an emphasis in genetic studies for type 2 diabetes (T2D) and other complex diseases. Because they constitute the bulk of known genetic risk factors for Mendelian diseases, they have been hypothesized to contribute disproportionately to complex disease heritability [4, 5, 6]. Because their effects are usually easier to interpret than those of noncoding variants, they can lead to clear hypotheses about a disease-relevant gene and its directional relationship with disease risk (i.e., whether loss of function predisposes to or protects from disease) [5, 7]. The demonstration in 2005 that loss of function mutations in PCSK9 lower low-density lipoprotein levels [8] and protect from coronary artery disease [9], and the successful cholesterol-lowering PCSK9 inhibitors consequently developed [10], have served as longstanding exemplars for many complex diseases.

When the first genome-wide association studies (GWAS) for T2D were published in 2007, some observers were therefore surprised that (a) most associations mapped outside of protein-coding regions of the genome [11] and (b) the identified associations explained only a relatively small portion of disease risk [2]. Early GWAS thus produced the first robust associations for T2D—a clear success [1, 12]—but in few cases provided clear insight into T2D’s genetic basis or its molecular and cellular mechanisms [5, 7, 13]. However, because GWAS directly or indirectly analyze only a limited set of common (minor allele frequency [MAF] > 5%) variants in the genome, their associations are not expected to explain all (or even most of) disease heritability, and might in fact tag disease-causal variants some distance away [2, 5].

This review will discuss how these early GWAS findings inspired a decade of studies to understand the role of low-frequency (MAF < 5%) and rare (MAF < 0.5%) coding variation in T2D susceptibility. In the past few years, a clear picture has begun to emerge as to how these variants contribute to T2D heritability and might be used to better understand T2D biology.

Hypotheses, Conceptual Frameworks, and Experimental Approaches

Following early GWAS findings, three hypotheses (or models) emerged about the contribution of low-frequency and rare variants to the “genetic architecture” of complex diseases. First, rare variants were hypothesized to have significantly larger effects on disease risk than do common variants [5, 7, 14, 15]. Purifying natural selection might prevent strong-effect variants from becoming common in a population [5, 16, 17], which could explain the empirically modest effects (odds ratio [OR] < 1.1) on disease risk of most common variants [1, 2]. Strong-effect, low-frequency variants could be more clinically or therapeutically actionable than modest-effect common variants [3, 18].

Second, rare variants were hypothesized to explain a significant amount of disease heritability [2, 7, 13]. There are many more rare variants than common variants within the population [19, 20], and GWAS by design do not interrogate them. If rare or low-frequency variants have significantly larger effects on average than do common variants, then they in aggregate could explain much of the heritability not captured by GWAS.

Third, rare variants were hypothesized to cause some, and perhaps a significant portion of, common variant GWAS associations. By chance, it is possible that one or more disease-causal rare variants may segregate non-randomly with a common variant, creating a “synthetic association” detected by a common variant GWAS [5, 21]. If synthetic associations are commonplace, they could impact the design of “fine mapping” studies—efforts to localize a GWAS “index variant” association to a causal variant(s)—because index variants may lie significantly further from causal rare variants than they are expected to lie from causal common variants [21].

Testing these three hypotheses for T2D and other complex diseases required advances in rare variant ascertainment, genotyping, and association analysis. Foremost, rare variants can only be comprehensively ascertained through sequencing, and their large-scale study therefore required technology to advance from traditional Sanger sequencing of individual genes to cost-effective high-throughput next-generation sequencing [22]. By 2010, next-generation sequencing technologies were inexpensive enough to apply to thousands of samples at select regions of the genome, beginning with several genes [23, 24] and soon expanding to the entire exome [25, 26]. Because cost considerations limited the total size of regions sequenced, most early studies focused on protein-coding regions of the genome.

As whole-exome sequence (and later whole-genome sequence) data progressively accrued, opportunities also emerged to genotype a subset of coding variants detected by sequencing in much larger sample sizes. By 2012, enough European ancestry exomes had been sequenced to enable design of an inexpensive SNP microarray (the Illumina Exome Array) capturing roughly 70–80% of MAF > 0.5% coding variants in Europeans, at a cost one order of magnitude less than sequencing [27]. By 2016, enough European ancestry genomes had been sequenced to enable a reference panel (compiled by the Haplotype Reference Consortium [HRC]) for high-quality imputation of MAF > 0.1% variants in European ancestry samples (provided the samples had been previously genotyped by a genome-wide SNP microarray) [28]. Exome array analysis and HRC-based imputation complement exome sequencing by trading off full variant ascertainment for increased study sample size (and therefore association power).

Exome array or HRC-based imputation studies predominantly employ traditional GWAS analyses, which test variants individually for disease association (“single-variant analysis”). By contrast, analyses of rare variants from exome sequencing studies require different approaches [25]. In the early 2010s, a significant number of methods were advanced to aggregate rare variants and test for association at the level of genes (“gene-level analysis”) [29]. The most basic methods collapse variants of similar molecular effect and test for different frequencies of variation between disease cases and controls (“burden tests”). While simple and easy to interpret, burden tests rely on judicious selection of variants included in the test: their application has therefore been aided by bioinformatic algorithms for predicting protein-damaging variants and theoretical frameworks for understanding how different variant selection strategies impact power to detect association [30, 31]. Alternatively, statistical tests can be made more robust to the inclusion of benign variants in gene-level analysis, a strategy that motivated the design of tests such as SKAT [32] and SKAT-O [33] that test for an overdispersion of variant associations within a gene, rather than simply a directional excess of variation in cases or controls. Most rare variant studies therefore apply multiple methods for gene-level analysis, which increases the number of tests performed, but because most studies have far fewer genes than variants, the study-wide multiple testing burden is ultimately reduced relative to GWAS.
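The simplest form of the burden test described above can be sketched in a few lines. This is a minimal illustration, not the procedure of any cited study: it collapses a pre-filtered set of "qualifying" variants in one gene into per-sample carrier status and compares carrier counts between cases and controls with a two-sided Fisher's exact test. The function names, the variant identifiers (v1, v2, v9), and the toy genotypes are all hypothetical; real analyses typically use regression-based tests with covariates.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers among cases, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over every table that is no more probable than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def burden_test(genotypes, is_case, qualifying):
    """Collapse qualifying variants in one gene into carrier status and test it.

    genotypes  : one dict per sample, variant id -> alternate allele count
    is_case    : one bool per sample (True = disease case)
    qualifying : set of variant ids judged likely damaging (the filtering step)
    """
    case_carriers = case_non = ctrl_carriers = ctrl_non = 0
    for sample, case in zip(genotypes, is_case):
        carrier = any(sample.get(v, 0) > 0 for v in qualifying)
        if case and carrier:
            case_carriers += 1
        elif case:
            case_non += 1
        elif carrier:
            ctrl_carriers += 1
        else:
            ctrl_non += 1
    table = (case_carriers, case_non, ctrl_carriers, ctrl_non)
    return table, fisher_exact_two_sided(*table)


# Hypothetical toy data: 10 cases followed by 10 controls.
genotypes = (
    [{"v1": 1}] * 4 + [{"v2": 1}] * 2 + [{}] * 3 + [{"v9": 1}]  # cases
    + [{"v1": 1}] + [{}] * 8 + [{"v9": 1}]                      # controls
)
is_case = [True] * 10 + [False] * 10

# v9 is treated as benign here and excluded by the (assumed) variant filter.
table, p_value = burden_test(genotypes, is_case, qualifying={"v1", "v2"})
print(table, p_value)
```

Adding v9 to the qualifying set puts an equal number of extra carriers in cases and controls, which is the dilution by benign variants that motivates both careful variant selection and dispersion-based tests such as SKAT.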

By the early part of the 2010s, therefore, sequencing technologies and rare variant analysis methodologies were sufficiently advanced to begin the first empirical assessments of the role of low-frequency and rare variation in the development of T2D (Table 1).
Table 1. Technologies for interrogating low-frequency and rare variants

| | Exome array | GWAS imputation | Whole-exome sequencing | Whole-genome sequencing |
|---|---|---|---|---|
| Properties | | | | |
| Ascertainment | 70–80% of MAF > 0.5% variants | Statistical inference of MAF > 0.1% variants (Europeans) or MAF > 1% variants (other populations) | All coding variants | All variants |
| Analysis | Single-variant | Single-variant | Gene-level | Single-variant, gene-level |
| Current T2D sample size | ~ 500K | ~ 1M | ~ 50K | ~ 3K |
| Contribution to testing rare and low-frequency variant hypotheses | | | | |
| Large effects | Medium | Medium | Medium | Low |
| Missing heritability | Medium | Medium | High | Low |
| Synthetic associations | Low | Low | Medium | High |

Four genotyping technologies and/or study designs (columns) have been used to identify low-frequency and rare variants associated with T2D. Each ascertains different variants (Ascertainment), enables different association analysis methodologies (Analysis), and has been applied to different sample sizes for T2D (Current T2D sample size). The bottom half of the table summarizes the historical contribution of each study design toward evaluating the three hypotheses about the role of rare variation in T2D susceptibility.

The First Studies of Low-Frequency and Rare Coding Variants

The first searches for low-frequency or rare variant T2D associations began even before the GWAS era. Following paradigms for Mendelian disease genetic mapping, many linkage and candidate gene studies were conducted for T2D in the late 1990s and early 2000s [34]. Other than an association near TCF7L2 [35] (still the largest genetic contributor to T2D risk), these studies produced few replicable associations [34, 36, 37]; today they are cited less for their discoveries than as cautionary contrasts to the statistical rigor of GWAS [1, 13].

The first large-scale sequencing studies of T2D focused on genes with prior genetic links to T2D. One class of study focused on genes within GWAS regions, showing MTNR1B [38], SLC30A8 [39, 40], PPARG [41], and HNF1A [42] to harbor collections of rare variants with moderate (OR 2–7) effects on T2D risk. Notably, in each case, stringent filtering of variants was necessary to reveal an association: the SLC30A8 association was detected with the small fraction of variants predicted to truncate SLC30A8-encoded protein, while systematic characterizations of rare variants in (gene-specific) assays were needed to identify associations for MTNR1B, PPARG, and HNF1A. Furthermore, each of these genes was already widely believed (prior to the sequencing studies) to mediate the original GWAS association; early sequencing studies were less successful at identifying truly novel GWAS effector genes [40, 43].

A second class of targeted sequencing studies focused on genes for Maturity Onset Diabetes of the Young (MODY) or other Mendelian diseases with clinical similarities to T2D. Beginning with small studies that showed rare variants in MODY genes to have effects on T2D risk in the general population [44]—albeit with penetrances much lower than might have been expected—and continuing with larger studies that provided stronger statistical evidence of association [27, 44, 45••, 46, 47], MODY genes have been consistently shown to harbor not only rare variants that cause early onset Mendelian diseases but also a broader “allelic series” of variants that predispose to the later onset form of T2D. These findings are now widely interpreted as evidence that MODY and T2D are not distinct conditions but rather opposite extremes of a continuum of diabetes subtypes [15, 48, 49].

As exome sequencing, the exome array, and sequence-based imputation reference panels began to mature, the first genome-wide scans for rare and low-frequency variant T2D associations began to appear. The earliest exome array studies relevant to T2D were focused on glycemic traits; while some coding variants of moderate effect emerged from these studies (e.g., PAM for insulinogenic index [50], G6PC2 for fasting glucose [51, 52], and AKT2 for fasting insulin [53]), the number of significant associations was much smaller than would be expected from the hypotheses positing large contributions of rare variants to complex trait heritability. Early T2D sequencing studies [46, 47, 54, 55] (each in a few thousand individuals) similarly were successful at identifying some, but not many, novel rare or low-frequency coding variant T2D associations (e.g., in PAM [47], PDX1 [47], HNF1A [46], and ADCY3 [56]). Perhaps the biggest lesson to emerge from these investigations is the value offered by studies of populations either isolated or subject to historical bottlenecks (e.g., Iceland [47], Mexico [46], Finland [53], Greenland [56, 57]) as a means to identify strong-effect T2D variants that have, by chance (e.g., genetic drift) or perhaps even positive selection, risen to moderate (or even high) frequency.

Early studies of low-frequency and rare variation thus suggested that rare variant hypotheses might have been overly optimistic in their predictions about the contribution of rare variation to T2D. However, a definitive assessment of these hypotheses would require global and systematic analyses of larger datasets.

An Emerging Picture of T2D Genetic Architecture

While a study of a few thousand sequenced individuals might be expected to detect a substantial number of rare variant associations under optimistic models, more measured early-stage simulations [58, 59] and analytical calculations [60] had predicted that tens of thousands of sequenced individuals would be required for reasonable power to interrogate the rare variant hypothesis for most complex diseases. An early sequencing study of 1000 T2D cases and 1000 controls, for example, had power to exclude only extreme models in which rare variants in < 20 genes explained the majority of T2D risk [55]. The lack of rare variant associations from early studies did not, therefore, rule out rare variant models for T2D, and a systematic simulation study showed that, prior to large-scale sequencing studies, rare and common variant models could each be constructed as consistent with empirical T2D genetic associations [61•].

The study that took the largest step toward constraining potential T2D genetic architectures was published in 2016 [45••], analyzing ~ 13,000 multi-ethnic exomes, ~ 2700 European ancestry whole genomes, ~ 80,000 samples genotyped on the exome array, and ~ 44,000 samples with genotypes imputed from a whole-genome sequence reference panel enriched for T2D cases. Using these data collectively, the study provided insights into all three major hypotheses about the role of rare and low-frequency variants in T2D genetic susceptibility. First, despite near-complete variant ascertainment in a modest-size European ancestry sample, only one low-frequency variant (a previously reported noncoding variant in CCND2 [47]) achieved genome-wide significance. This enabled quantitative bounds on the T2D effect sizes of low-frequency variants, which, in short, rejected models proposing a significant number of low-frequency strong-effect T2D variants. Second, simulated rare variant models predicted far more rare and low-frequency variant associations than were observed empirically; the data instead supported a T2D genetic architecture characterized by many modest-effect common variants. Third, no rare variants could plausibly explain any significant T2D GWAS signals, rejecting synthetic associations as a common phenomenon for T2D. A fourth finding of the study was that no gene-level coding variant associations reached exome-wide significance, although the implications of this finding for the validity of rare variant models were not pursued in detail.

Since the publication of the 2016 study, four other large-scale studies have further constrained the contribution of rare and low-frequency variants to T2D susceptibility. The first study [62] performed deep whole-genome sequencing of 20 large Hispanic pedigrees (spanning ~ 1000 individuals), providing the opportunity to observe and analyze multiple copies of extremely rare variants (e.g., those private to a family). Although the power of the study design was validated by the identification of several rare variant associations with gene expression (cis-expression quantitative trait loci), no evidence of large-effect rare variant associations was observed for T2D in these families.

The second study [63••] applied the exome array to ~ 450,000 samples (~ 80,000 with T2D), significantly increasing power to interrogate MAF > 0.5% coding variation for T2D association. Although the study identified 40 coding variant associations, only five had observed MAF < 5% and none had observed OR > 1.4, strongly suggesting that searches for low-frequency or common coding variants with even moderate effects on T2D risk are unlikely to be fruitful. Furthermore, through fine mapping with densely imputed GWAS data, < 50% of the 40 coding variants identified in the study were shown to be causally linked to T2D risk, with the remainder likely proxies for nearby noncoding causal variants. Coding variant associations therefore cannot be immediately assumed to implicate specific variants or genes, although (because most of the 40 associations analyzed in the study were observed with common variants) the proportion of rare coding variant T2D associations that are causal may well be significantly higher.

The third study [64••] used HRC-based imputation to analyze ~ 900,000 European samples (~ 75,000 with T2D), providing even greater power to detect T2D associations with variants as rare as MAF ~ 0.1% (although imputation quality, and therefore effective sample size, is lower for rarer variants). This study produced by far the largest catalog of low-frequency and rare variant associations to date for T2D, identifying associations with 56 low-frequency (0.5% < MAF < 5%) and 14 rare (MAF < 0.5%) variants across 60 loci; many of these variants are nearby but independent from common variants identified by earlier GWAS. Although variant OR estimates were not independently validated (and may be overestimates), some of the identified low-frequency variants had moderate to high estimated effects on T2D risk, with 14 having observed OR > 2 and two having observed OR ~ 8. However, only seven of the 56 low-frequency variant associations lie within coding regions, and all of these had estimated OR < 2. Collectively, low-frequency variant associations in the study were estimated to explain 15× less T2D heritability than were common variant associations in the same study (1.13% vs. 16.3%), implying the heritability explained by low-frequency coding variants to be even lower (by perhaps an order of magnitude).
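The arithmetic behind these heritability comparisons can be made explicit. The first expression uses only the two percentages quoted above. The second is a back-of-the-envelope illustration, not an estimate from the study: it assumes that the seven coding variants contribute in proportion to their share of the 56 low-frequency associations. The symbols h² with subscripts are notation introduced here for convenience.

```latex
\[
\frac{h^2_{\text{common}}}{h^2_{\text{low-freq}}}
  = \frac{16.3\%}{1.13\%} \approx 14.4
\qquad\text{(the roughly 15-fold difference quoted above)}
\]
\[
h^2_{\text{low-freq, coding}}
  \approx 1.13\% \times \frac{7}{56} \approx 0.14\%
\qquad\text{(assuming equal contribution per variant)}
\]
```

Under that crude assumption, low-frequency coding variants would explain about one eighth of the low-frequency total, in line with the "perhaps an order of magnitude" reduction. Because all seven coding variants had estimated OR < 2, their true share could be smaller still.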

These three studies progressively limited the role of low-frequency and rare coding variants in T2D susceptibility; however, they collectively ascertained at most only a small fraction of rare coding variation in the population. The final, and most recent, large-scale genetic study of T2D [65••] used exome sequencing in five major ethnic groups to analyze ~ 45,000 samples (~ 20,000 with T2D) across ~ 3M coding variants, ~ 95% of which are rare and ~ 90% of which are absent from exome array or HRC-based imputation studies. This study, as expected, identified essentially no novel coding single-variant associations (only one low-frequency variant in the known obesity and T2D gene MC4R). However, it did demonstrate, for the first time, exome-wide significant gene-level associations (PAM, MC4R, and SLC30A8). Notably, these three genes had been previously implicated in T2D via GWAS, and the rare variants contributing to the gene-level signals explain significantly less T2D heritability than do the nearby independently associated common variants.

Nonetheless, exome sequencing at this scale did reveal evidence for pervasive rare variant associations across T2D-relevant genes. Twelve gene sets, defined based on prior evidence from mouse models, T2D drug targets, or monogenic diabetes, each exhibited significantly more rare variant gene-level associations than expected by chance. Individually, the gene-level associations were modest at best, requiring (in the case of T2D drug targets) perhaps 500K–1M samples to be detected at exome-wide statistical significance. However, these results suggest that, once a gene is established as relevant to T2D, it might reasonably be expected to carry a rare coding variant allelic series that could be mined for more insight into its function.

In the last 3 years, a clearer picture of T2D genetic architecture has therefore begun to emerge. The set of rare and low-frequency coding variants with effect sizes large enough to be detected via single-variant analysis of even 1 million samples appears quite limited, even though ~ 350 independent common variants (most of them noncoding) have been identified from those same studies. This does not imply that rare variants play no role in T2D susceptibility; indeed, the most recent T2D exome sequencing study suggests that rare variant gene-level signals may be more of a norm than a surprise in T2D-relevant genes [65••], supporting other studies that have shown an excess of associations (genome-wide) in coding exons [45••, 66, 67]. However, rare variant signals have effect sizes significantly lower than might have been expected by optimistic early hypotheses and are often detectable only after the disease relevance of a gene has been established (e.g., at a relaxed significance threshold justified by the gene’s “prior” evidence). For this reason, in the past and likely in the near future, most T2D rare variant signals have been or will be identified within genes harboring additional variants that, by chance, either (a) cause an extreme form of diabetes (e.g., MODY) or (b) have risen to sufficient frequency to be detectable via GWAS (Fig. 1).
Fig. 1. Profiles of rare coding variation in T2D-relevant genes. Based on recent empirical evidence, many T2D-relevant genes seem likely to harbor a series of rare coding variant associations. However, based on empirical aggregate effect sizes, typical rare variant associations may require an order of magnitude more exome sequences to detect than are available today. (a) Some genes, by chance, will lie near or contain a low-frequency or common variant T2D association, as is the case for the three exome-wide significant T2D gene-level associations identified to date. Such genes will likely be detected by GWAS or exome array single-variant analysis long before they are detected by exome sequence gene-level analysis. (b) Some genes will harbor extremely rare, severe mutations associated with a monogenic form of diabetes. Given evidence of a genetic overlap between monogenic forms of diabetes and T2D, these genes are strong candidates to harbor an allelic series of variants associated with T2D. (c) Many genes will only harbor a rare variant T2D gene-level association. Based on empirically observed aggregate effect sizes, these genes will likely be very hard to identify for the foreseeable future. For each example gene, variants are shown as ticks on the transcript map; red and blue bars indicate variant case and control frequencies, respectively, and black boxes indicate the variant’s “prominence” (e.g., detectability via GWAS or a Mendelian gene mapping study).

The Use of Coding Variants to Understand Disease Biology

Although it seems likely that rare and low-frequency variants contribute modestly to T2D heritability, they may still play a significant role in efforts to understand disease biology or design new therapies. It has long been appreciated that a variant need not explain much heritability to offer valuable disease insight [12, 13, 15], as a modest-effect variant could point to a gene of which larger therapeutic perturbations may have a significant impact [68] or to a pathway that might suggest an important new disease mechanism [69]. Translating genetic associations to biological function—and then investigating how natural or potential therapeutic alteration of these functions affects disease risk or progression—represents the most pressing current and future challenge in complex disease research [70].

Rare coding variants offer unique value in this endeavor. Because they localize to protein sequence, and because they are less likely to be highly correlated to hidden causal variants via linkage disequilibrium than are common variants, they can directly implicate genes in disease pathogenesis. Furthermore, since their effects are easier to interpret than those of noncoding variants, and because they can be introduced into model systems and evaluated for effects on a variety of molecular or cellular processes, they can help basic researchers to experimentally evaluate the downstream effects of a gene on a biological process. The empirical difficulty of identifying T2D rare variant associations limits their ability to identify novel disease loci or genes. However, once a gene is hypothesized to be involved in disease, rare coding variants can provide valuable “handles” on the gene to probe the relationship between its function and disease (Table 2).
Table 2. The role of common and rare variation in future T2D genetic studies

| Goal | Common variants | Rare coding variants |
|---|---|---|
| Locus discovery | GWAS in large sample sizes, imputed from progressively larger whole-genome sequence reference panels | Limited role for the foreseeable future (and possibly longer) due to significantly greater efficiency of GWAS |
| Reverse genetics | Limited role due to difficulties in identifying variants with clear molecular function | Identify individuals with loss of function mutations (“human knockouts” or haploinsufficiency) or severe missense mutations, and analyze deep phenotypes |
| Biological function | Determine mechanism of original common variant association through functional genomic predictions, genome editing, and readouts from cellular/animal models | Characterize an “allelic series” of missense mutations to assess the molecular and cellular consequences of varied gene perturbations |
| Therapeutic translation | Of potential clinical utility to define subgroups or stratify populations through common variant polygenic risk scores | Use coding variants to link molecular and cellular readouts (effects of variants on an assay) to physiological phenotypes (genetic associations of the same variants) and potentially identify putative drug targets |

Future studies to understand the biology of T2D and identify potential new therapies will require a combination of approaches to identify new genetic associations (Locus discovery), evaluate the role candidate genes play in human disease (Reverse genetics), translate associations to biological insights (Biological function), and suggest new therapies (Therapeutic translation). The table summarizes potential strategies to use common and rare variants toward each goal.

The identification and subsequent characterization of SLC30A8 provides an illustrative example of this approach. A coding variant in SLC30A8 was one of the earliest findings of T2D GWAS [71], and the gene immediately piqued research and therapeutic interest: its protein product ZnT8 is expressed mainly in pancreatic islets and transports zinc into insulin-containing granules, thus contributing to insulin processing and storage. Because of its known function, the initial hypothesis was that reduced ZnT8 activity would increase risk of T2D [72, 73]; however, when numerous Slc30a8 knockout mice were subsequently created and tested for a variety of glycemic phenotypes, no consistent effects on hyperglycemia emerged [73, 74, 75, 76, 77].

It was therefore surprising when 12 rare protein-truncating variants in SLC30A8 were shown to associate, in aggregate, with protection from T2D [40]. While these data offered no insight into a potential protective mechanism, their clear predicted molecular effects (reduced ZnT8 activity) and expression within the full human system (rather than mice) prompted re-evaluation of the link between ZnT8 and T2D. After several years of further research, a recent mouse model of the most common SLC30A8 rare protein-truncating variant offered a potential mechanism for T2D protection, through increased first-phase insulin secretion [78•].

Much still remains to be determined about the mechanism linking ZnT8 loss of function to increased insulin secretion, not only in animal models but also in humans. One limitation of using the 12 SLC30A8 protein-truncating variants to explore this relationship is that all are expected to result in human haploinsufficiency, and they thus do not inform on potential effects of either full loss of function or more measured reductions in ZnT8 activity. Coding variants could help address this knowledge gap in two respects. First, if SLC30A8 “human knockouts”—or individuals with homozygous or compound heterozygous loss of function mutations—were identified [18, 79], they could be deeply characterized for various phenotypes to better understand the intermediate human physiological processes responsible for T2D protection. Second, the exome-wide significant series of > 100 SLC30A8 protective missense alleles from the most recent T2D exome sequencing study [65••] could be used to probe the SLC30A8 “dose-response” curve: each allele could be introduced into a molecular or cellular assay (such as zinc transport or insulin secretion), and their effects on these assays could then be compared to their T2D protective effects to calibrate the relationship between T2D risk and the biological process measured by the assay. Particularly interesting variants could be subsequently introduced into an Slc30a8 mouse model for further characterization. This paradigm has been previously demonstrated for T2D in the context of PPARG and insulin resistance, in which simultaneous effects of rare coding variants on T2D risk (when analyzed in a genetic study) and adipogenesis (when introduced into a cellular assay) validated the gene’s mechanism of action [41]. Furthermore, once an assay has been established as a proxy for human disease risk, it represents an attractive asset for future therapeutic screens [3].
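The “dose-response” calibration described above for an allelic series can be sketched as a simple regression of each allele's T2D effect (log odds ratio) on its measured assay activity. This is only an illustration under stated assumptions: the activity values and odds ratios below are invented placeholders, not SLC30A8 measurements, and the function names are hypothetical. A real analysis would also weight each allele by the uncertainty in both estimates.

```python
from math import exp, log


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predicted_odds_ratio(activity_level, slope, intercept):
    """Odds ratio implied by the fitted line at a given assay activity."""
    return exp(intercept + slope * activity_level)


# Placeholder allelic series: assay activity relative to wild type (1.0 = normal)
# and the odds ratio for T2D estimated for the same allele in a genetic study.
activity = [1.0, 0.8, 0.6, 0.4, 0.2]
odds_ratios = [1.00, 0.95, 0.85, 0.80, 0.70]
log_or = [log(o) for o in odds_ratios]

slope, intercept = fit_line(activity, log_or)

# A positive slope means that lower activity goes with lower odds of disease,
# i.e. partial loss of function is protective in this made-up series.
print(slope, predicted_odds_ratio(0.5, slope, intercept))
```

Once such a curve is calibrated, the assay value of an uncharacterized allele, or of a candidate compound, gives a rough expectation for its effect on disease risk. That is the sense in which a validated assay can act as a proxy in later screens.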

Conclusion

The past 5 years of T2D genetic research have addressed the goal of understanding T2D genetic architecture: it is now highly likely that T2D genetic risk is mostly determined by many common, modest-effect regulatory variants rather than a few rare, large-effect coding variants. The jury is still out on the extent to which rare and low-frequency variants of individually large effect might aid disease risk prediction in selected groups: only a few examples of low-frequency variants with large effects on T2D risk have been identified [57, 80], and these are typically highly population-specific (and in fact often common in the particular population). Collections of large-effect rare variants may reach sufficient aggregate frequency for useful risk prediction in some genes, but to provide clinical utility, these variants will likely require functional characterization [42, 81•, 82, 83], a significant limitation for variants in novel genes. By contrast, polygenic risk scores constructed from many common variants have steadily increased in predictive power: those constructed from the latest T2D GWAS show a ninefold difference in disease prevalence for individuals at the high and low extremes of risk [64••].

The most effective path toward the goal of T2D biological discovery seems likely to include both traditional GWAS approaches and novel methods incorporating rare coding variant analyses (Table 2). GWAS approaches, which will only increase in power to interrogate low-frequency and rare variants as imputation reference panels expand, will likely persist for the foreseeable future as the most efficient means to identify genomic loci associated with disease. Coding variants seem better placed to help understand the functional consequences of an association by providing a variety of gene perturbations that can be both experimentally and phenotypically characterized, an area in which improved methods for conducting and dissecting gene-level tests could have a significant impact. Additionally, it seems plausible that some (perhaps many) disease-relevant genes will prove undetectable by GWAS, as an association depends not only on the disease relevance of a gene but also on the historical emergence of a (reasonably common) genetic variant that sufficiently perturbs its function [17]. For T2D-relevant genes lacking a GWAS association (which might be identified as therapeutic or biological candidates from high-throughput functional screens or because they participate in a hypothesized disease-relevant pathway), coding variants offer the opportunity to conduct “reverse genetic” analyses in which variants that alter the function of the gene are identified and then characterized for their effects on human phenotypes.

Although rare and low-frequency coding variants are not the panacea that some optimistic genetic models proposed after the first T2D GWAS, they are not as irrelevant to T2D as might be naively inferred from the now-appreciated dominant contribution of common noncoding variants to disease heritability. Instead, they provide one of several genetic tools necessary to understand, and ultimately develop better treatments for, complex diseases such as T2D: in particular, a tool for probing and refining gene function. The true revolution enabled by rare coding variation may therefore lie not in the previous phase of complex disease research (the discovery of associations) but in the next phase (the functional and clinical translation of those associations).

Notes

Compliance with Ethical Standards

Conflict of Interest

Jason Flannick reports personal fees from Decibel Therapeutics.

Human and Animal Rights and Informed Consent

This article does not contain any studies with human or animal subjects performed by any of the authors.

References

Papers of particular interest, published recently, have been highlighted as: • Of importance •• Of major importance

  1. Altshuler D, Daly MJ, Lander ES. Genetic mapping in human disease. Science. 2008;322(5903):881–8. https://doi.org/10.1126/science.1156409.
  2. Manolio TA, Collins FS, Cox NJ, Goldstein DB, Hindorff LA, Hunter DJ, et al. Finding the missing heritability of complex diseases. Nature. 2009;461(7265):747–53.
  3. Plenge RM, Scolnick EM, Altshuler D. Validating therapeutic targets through human genetics. Nat Rev Drug Discov. 2013;12(8):581–94. https://doi.org/10.1038/nrd4051.
  4. Botstein D, Risch N. Discovering genotypes underlying human phenotypes: past successes for mendelian disease, future approaches for complex disease. Nat Genet. 2003;33(Suppl):228–37. https://doi.org/10.1038/ng1090.
  5. Cirulli ET, Goldstein DB. Uncovering the roles of rare variants in common disease through whole-genome sequencing. Nat Rev Genet. 2010;11(6):415–25. https://doi.org/10.1038/nrg2779.
  6. Cooper GM, Shendure J. Needles in stacks of needles: finding disease-causal variants in a wealth of genomic data. Nat Rev Genet. 2011;12(9):628–40. https://doi.org/10.1038/nrg3046.
  7. McClellan J, King MC. Genetic heterogeneity in human disease. Cell. 2010;141(2):210–7. https://doi.org/10.1016/j.cell.2010.03.032.
  8. Cohen J, Pertsemlidis A, Kotowski IK, Graham R, Garcia CK, Hobbs HH. Low LDL cholesterol in individuals of African descent resulting from frequent nonsense mutations in PCSK9. Nat Genet. 2005;37(2):161–5. https://doi.org/10.1038/ng1509.
  9. Cohen JC, Boerwinkle E, Mosley TH Jr, Hobbs HH. Sequence variations in PCSK9, low LDL, and protection against coronary heart disease. N Engl J Med. 2006;354(12):1264–72. https://doi.org/10.1056/NEJMoa054013.
  10. Stein EA, Mellis S, Yancopoulos GD, Stahl N, Logan D, Smith WB, et al. Effect of a monoclonal antibody to PCSK9 on LDL cholesterol. N Engl J Med. 2012;366(12):1108–18. https://doi.org/10.1056/NEJMoa1105803.
  11. Hindorff LA, Sethupathy P, Junkins HA, Ramos EM, Mehta JP, Collins FS, et al. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc Natl Acad Sci U S A. 2009;106(23):9362–7. https://doi.org/10.1073/pnas.0903103106.
  12. Hirschhorn JN. Genomewide association studies--illuminating biologic pathways. N Engl J Med. 2009;360(17):1699–701. https://doi.org/10.1056/NEJMp0808934.
  13. Goldstein DB. Common genetic variation and human traits. N Engl J Med. 2009;360(17):1696–8. https://doi.org/10.1056/NEJMp0806284.
  14. Bodmer W, Bonilla C. Common and rare variants in multifactorial susceptibility to common diseases. Nat Genet. 2008;40(6):695–701. https://doi.org/10.1038/ng.f.136.
  15. Lupski JR, Belmont JW, Boerwinkle E, Gibbs RA. Clan genomics and the complex architecture of human disease. Cell. 2011;147(1):32–43. https://doi.org/10.1016/j.cell.2011.09.008.
  16. Kryukov GV, Pennacchio LA, Sunyaev SR. Most rare missense alleles are deleterious in humans: implications for complex disease and association studies. Am J Hum Genet. 2007;80(4):727–39. https://doi.org/10.1086/513473.
  17. Pritchard JK. Are rare variants responsible for susceptibility to complex diseases? Am J Hum Genet. 2001;69(1):124–37. https://doi.org/10.1086/321272.
  18. Saleheen D, Natarajan P, Armean IM, Zhao W, Rasheed A, Khetarpal SA, et al. Human knockouts and phenotypic analysis in a cohort with a high rate of consanguinity. Nature. 2017;544(7649):235–9. https://doi.org/10.1038/nature22034.
  19. Tennessen JA, Bigham AW, O'Connor TD, Fu W, Kenny EE, Gravel S, et al. Evolution and functional impact of rare coding variation from deep sequencing of human exomes. Science. 2012;337(6090):64–9. https://doi.org/10.1126/science.1219240.
  20. Keinan A, Clark AG. Recent explosive human population growth has resulted in an excess of rare genetic variants. Science. 2012;336(6082):740–3. https://doi.org/10.1126/science.1217283.
  21. Dickson SP, Wang K, Krantz I, Hakonarson H, Goldstein DB. Rare variants create synthetic genome-wide associations. PLoS Biol. 2010;8(1):e1000294. https://doi.org/10.1371/journal.pbio.1000294.
  22. Metzker ML. Sequencing technologies - the next generation. Nat Rev Genet. 2010;11(1):31–46. https://doi.org/10.1038/nrg2626.
  23. Nejentsev S, Walker N, Riches D, Egholm M, Todd JA. Rare variants of IFIH1, a gene implicated in antiviral responses, protect against type 1 diabetes. Science. 2009;324(5925):387–9. https://doi.org/10.1126/science.1167728.
  24. Rivas MA, Beaudoin M, Gardet A, Stevens C, Sharma Y, Zhang CK, et al. Deep resequencing of GWAS loci identifies independent rare variants associated with inflammatory bowel disease. Nat Genet. 2011;43(11):1066–73. https://doi.org/10.1038/ng.952.
  25. Kiezun A, Garimella K, Do R, Stitziel NO, Neale BM, McLaren PJ, et al. Exome sequencing and the genetic basis of complex traits. Nat Genet. 2012;44(6):623–30. https://doi.org/10.1038/ng.2303.
  26. Ng SB, Turner EH, Robertson PD, Flygare SD, Bigham AW, Lee C, et al. Targeted capture and massively parallel sequencing of 12 human exomes. Nature. 2009;461(7261):272–6. https://doi.org/10.1038/nature08250.
  27. Bansal V, Gassenhuber J, Phillips T, Oliveira G, Harbaugh R, Villarasa N, et al. Spectrum of mutations in monogenic diabetes genes identified from high-throughput DNA sequencing of 6888 individuals. BMC Med. 2017;15(1):213. https://doi.org/10.1186/s12916-017-0977-3.
  28. McCarthy S, Das S, Kretzschmar W, Delaneau O, Wood AR, Teumer A, et al. A reference panel of 64,976 haplotypes for genotype imputation. Nat Genet. 2016;48(10):1279–83. https://doi.org/10.1038/ng.3643.
  29. Lee S, Abecasis GR, Boehnke M, Lin X. Rare-variant association analysis: study designs and statistical tests. Am J Hum Genet. 2014;95(1):5–23. https://doi.org/10.1016/j.ajhg.2014.06.009.
  30. Adzhubei I, Jordan DM, Sunyaev SR. Predicting functional effect of human missense mutations using PolyPhen-2. Curr Protoc Hum Genet. 2013;Chapter 7:Unit7.20. https://doi.org/10.1002/0471142905.hg0720s76.
  31. Liu X, Wu C, Li C, Boerwinkle E. dbNSFP v3.0: a one-stop database of functional predictions and annotations for human nonsynonymous and splice-site SNVs. Hum Mutat. 2016;37(3):235–41. https://doi.org/10.1002/humu.22932.
  32. Wu MC, Lee S, Cai T, Li Y, Boehnke M, Lin X. Rare-variant association testing for sequencing data with the sequence kernel association test. Am J Hum Genet. 2011;89(1):82–93. https://doi.org/10.1016/j.ajhg.2011.05.029.
  33. Lee S, Emond MJ, Bamshad MJ, Barnes KC, Rieder MJ, Nickerson DA, et al. Optimal unified approach for rare-variant association testing with application to small-sample case-control whole-exome sequencing studies. Am J Hum Genet. 2012;91(2):224–37. https://doi.org/10.1016/j.ajhg.2012.06.007.
  34. Bonnefond A, Froguel P. Rare and common genetic events in type 2 diabetes: what should biologists know? Cell Metab. 2015;21(3):357–68. https://doi.org/10.1016/j.cmet.2014.12.020.
  35. Grant SF, Thorleifsson G, Reynisdottir I, Benediktsson R, Manolescu A, Sainz J, et al. Variant of transcription factor 7-like 2 (TCF7L2) gene confers risk of type 2 diabetes. Nat Genet. 2006;38(3):320–3. https://doi.org/10.1038/ng1732.
  36. Guan W, Pluzhnikov A, Cox NJ, Boehnke M, International Type 2 Diabetes Linkage Analysis Consortium. Meta-analysis of 23 type 2 diabetes linkage studies from the International Type 2 Diabetes Linkage Analysis Consortium. Hum Hered. 2008;66(1):35–49. https://doi.org/10.1159/000114164.
  37. Hirschhorn JN, Lohmueller K, Byrne E, Hirschhorn K. A comprehensive review of genetic association studies. Genet Med. 2002;4(2):45–61. https://doi.org/10.1097/00125817-200203000-00002.
  38. Bonnefond A, Clement N, Fawcett K, Yengo L, Vaillant E, Guillaume JL, et al. Rare MTNR1B variants impairing melatonin receptor 1B function contribute to type 2 diabetes. Nat Genet. 2012;44(3):297–301. https://doi.org/10.1038/ng.1053.
  39. Billings LK, Jablonski KA, Ackerman RJ, Taylor A, Fanelli RR, McAteer JB, et al. The influence of rare genetic variation in SLC30A8 on diabetes incidence and beta-cell function. J Clin Endocrinol Metab. 2014;99(5):E926–30. https://doi.org/10.1210/jc.2013-2378.
  40. Flannick J, Thorleifsson G, Beer NL, Jacobs SB, Grarup N, Burtt NP, et al. Loss-of-function mutations in SLC30A8 protect against type 2 diabetes. Nat Genet. 2014;46(4):357–63. https://doi.org/10.1038/ng.2915.
  41. Majithia AR, Flannick J, Shahinian P, Guo M, Bray MA, Fontanillas P, et al. Rare variants in PPARG with decreased activity in adipocyte differentiation are associated with increased risk of type 2 diabetes. Proc Natl Acad Sci U S A. 2014;111(36):13127–32. https://doi.org/10.1073/pnas.1410428111.
  42. Najmi LA, Aukrust I, Flannick J, Molnes J, Burtt N, Molven A, et al. Functional investigations of HNF1A identify rare variants as risk factors for type 2 diabetes in the general population. Diabetes. 2017;66(2):335–46. https://doi.org/10.2337/db16-0460.
  43. Wellcome Trust Case Control Consortium, Maller JB, McVean G, Byrnes J, Vukcevic D, Palin K, et al. Bayesian refinement of association signals for 14 loci in 3 common diseases. Nat Genet. 2012;44(12):1294–301. https://doi.org/10.1038/ng.2435.
  44. Flannick J, Beer NL, Bick AG, Agarwala V, Molnes J, Gupta N, et al. Assessing the phenotypic effects in the general population of rare variants in genes for a dominant Mendelian form of diabetes. Nat Genet. 2013;45(11):1380–5. https://doi.org/10.1038/ng.2794.
  45. •• Fuchsberger C, Flannick J, Teslovich TM, Mahajan A, Agarwala V, Gaulton KJ, et al. The genetic architecture of type 2 diabetes. Nature. 2016;536(7614):41–7. https://doi.org/10.1038/nature18642. This paper used a combination of next-generation sequencing technologies and novel analytical approaches to provide the most comprehensive characterization of T2D genetic architecture to date.
  46. SIGMA Type 2 Diabetes Consortium, Estrada K, Aukrust I, Bjorkhaug L, Burtt NP, Mercader JM, et al. Association of a low-frequency variant in HNF1A with type 2 diabetes in a Latino population. JAMA. 2014;311(22):2305–14. https://doi.org/10.1001/jama.2014.6511.
  47. Steinthorsdottir V, Thorleifsson G, Sulem P, Helgason H, Grarup N, Sigurdsson A, et al. Identification of low-frequency and rare sequence variants associated with elevated or reduced risk of type 2 diabetes. Nat Genet. 2014;46(3):294–8. https://doi.org/10.1038/ng.2882.
  48. Flannick J, Johansson S, Njolstad PR. Common and rare forms of diabetes mellitus: towards a continuum of diabetes subtypes. Nat Rev Endocrinol. 2016;12(7):394–406. https://doi.org/10.1038/nrendo.2016.50.
  49. Katsanis N. The continuum of causality in human genetic disorders. Genome Biol. 2016;17(1):233. https://doi.org/10.1186/s13059-016-1107-9.
  50. Huyghe JR, Jackson AU, Fogarty MP, Buchkovich ML, Stancakova A, Stringham HM, et al. Exome array analysis identifies new loci and low-frequency variants influencing insulin processing and secretion. Nat Genet. 2013;45(2):197–201. https://doi.org/10.1038/ng.2507.
  51. Mahajan A, Sim X, Ng HJ, Manning A, Rivas MA, Highland HM, et al. Identification and functional characterization of G6PC2 coding variants influencing glycemic traits define an effector transcript at the G6PC2-ABCB11 locus. PLoS Genet. 2015;11(1):e1004876. https://doi.org/10.1371/journal.pgen.1004876.
  52. Wessel J, Chu AY, Willems SM, Wang S, Yaghootkar H, Brody JA, et al. Low-frequency and rare exome chip variants associate with fasting glucose and type 2 diabetes susceptibility. Nat Commun. 2015;6:5897. https://doi.org/10.1038/ncomms6897.
  53. Manning A, Highland HM, Gasser J, Sim X, Tukiainen T, Fontanillas P, et al. A low-frequency inactivating AKT2 variant enriched in the Finnish population is associated with fasting insulin levels and type 2 diabetes risk. Diabetes. 2017;66(7):2019–32. https://doi.org/10.2337/db16-1329.
  54. Albrechtsen A, Grarup N, Li Y, Sparso T, Tian G, Cao H, et al. Exome sequencing-driven discovery of coding polymorphisms associated with common metabolic phenotypes. Diabetologia. 2013;56(2):298–310. https://doi.org/10.1007/s00125-012-2756-1.
  55. Lohmueller KE, Sparso T, Li Q, Andersson E, Korneliussen T, Albrechtsen A, et al. Whole-exome sequencing of 2,000 Danish individuals and the role of rare coding variants in type 2 diabetes. Am J Hum Genet. 2013;93(6):1072–86. https://doi.org/10.1016/j.ajhg.2013.11.005.
  56. Grarup N, Moltke I, Andersen MK, Dalby M, Vitting-Seerup K, Kern T, et al. Loss-of-function variants in ADCY3 increase risk of obesity and type 2 diabetes. Nat Genet. 2018;50(2):172–4. https://doi.org/10.1038/s41588-017-0022-7.
  57. Moltke I, Grarup N, Jorgensen ME, Bjerregaard P, Treebak JT, Fumagalli M, et al. A common Greenlandic TBC1D4 variant confers muscle insulin resistance and type 2 diabetes. Nature. 2014;512(7513):190–3. https://doi.org/10.1038/nature13425.
  58. Kryukov GV, Shpunt A, Stamatoyannopoulos JA, Sunyaev SR. Power of deep, all-exon resequencing for discovery of human trait genes. Proc Natl Acad Sci U S A. 2009;106(10):3871–6. https://doi.org/10.1073/pnas.0812824106.
  59. Moutsianas L, Agarwala V, Fuchsberger C, Flannick J, Rivas MA, Gaulton KJ, et al. The power of gene-based rare variant methods to detect disease-associated variation and test hypotheses about complex disease. PLoS Genet. 2015;11(4):e1005165. https://doi.org/10.1371/journal.pgen.1005165.
  60. Zuk O, Schaffner SF, Samocha K, Do R, Hechter E, Kathiresan S, et al. Searching for missing heritability: designing rare variant association studies. Proc Natl Acad Sci U S A. 2014;111(4):E455–64. https://doi.org/10.1073/pnas.1322563111.
  61. • Agarwala V, Flannick J, Sunyaev S, GoT2D Consortium, Altshuler D. Evaluating empirical bounds on complex disease genetic architecture. Nat Genet. 2013;45(12):1418–27. https://doi.org/10.1038/ng.2804.
  62. • Jun G, Manning A, Almeida M, Zawistowski M, Wood AR, Teslovich TM, et al. Evaluating the contribution of rare variants to type 2 diabetes and related traits using pedigrees. Proc Natl Acad Sci U S A. 2017;115:379–84. https://doi.org/10.1073/pnas.1705859115. This paper employed a novel pedigree strategy to characterize ultra-rare variants (private to a family) for effects on T2D, showing that they contribute minimally to T2D.
  63. •• Mahajan A, Wessel J, Willems SM, Zhao W, Robertson NR, Chu AY, et al. Refining the accuracy of validated target identification through coding variant fine-mapping in type 2 diabetes. Nat Genet. 2018;50(4):559–71. https://doi.org/10.1038/s41588-018-0084-1. This paper is the largest exome array study of T2D to date, further limiting the contribution to T2D risk from low-frequency coding variants.
  64. •• Mahajan A, Taliun D, Thurner M, Robertson NR, Torres JM, Rayner NW, et al. Fine-mapping type 2 diabetes loci to single-variant resolution using high-density imputation and islet-specific epigenome maps. Nat Genet. 2018;50(11):1505–13. https://doi.org/10.1038/s41588-018-0241-6. This paper is the largest T2D GWAS to date, significantly expanding the number of common and low-frequency variants associated with T2D but finding no new evidence for moderate effect coding variants.
  65. •• Flannick J, Mercader JM, Fuchsberger C, Udler MS, Mahajan A, Wessel J, et al. Genetic discovery and translational decision support from exome sequencing of 20,791 type 2 diabetes cases and 24,440 controls from five ancestries. bioRxiv. 2018. https://doi.org/10.1101/371450. This paper is the largest T2D exome sequencing study to date, demonstrating evidence for pervasive rare variant T2D gene-level signals but showing them to contribute minimally to T2D heritability.
  66. Finucane HK, Bulik-Sullivan B, Gusev A, Trynka G, Reshef Y, Loh PR, et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat Genet. 2015;47(11):1228–35. https://doi.org/10.1038/ng.3404.
  67. Sveinbjornsson G, Albrechtsen A, Zink F, Gudjonsson SA, Oddson A, Masson G, et al. Weighting sequence variants based on their annotation increases power of whole-genome association studies. Nat Genet. 2016;48(3):314–7. https://doi.org/10.1038/ng.3507.
  68. Altshuler D, Hirschhorn JN, Klannemark M, Lindgren CM, Vohl MC, Nemesh J, et al. The common PPARgamma Pro12Ala polymorphism is associated with decreased risk of type 2 diabetes. Nat Genet. 2000;26(1):76–80. https://doi.org/10.1038/79216.
  69. Claussnitzer M, Dankel SN, Kim KH, Quon G, Meuleman W, Haugen C, et al. FTO obesity variant circuitry and adipocyte browning in humans. N Engl J Med. 2015;373(10):895–907. https://doi.org/10.1056/NEJMoa1502214.
  70. Flannick J, Florez JC. Type 2 diabetes: genetic data sharing to advance complex disease research. Nat Rev Genet. 2016;17(9):535–49. https://doi.org/10.1038/nrg.2016.56.
  71. Sladek R, Rocheleau G, Rung J, Dina C, Shen L, Serre D, et al. A genome-wide association study identifies novel risk loci for type 2 diabetes. Nature. 2007;445(7130):881–5. https://doi.org/10.1038/nature05616.
  72. Rutter GA. Think zinc: new roles for zinc in the control of insulin secretion. Islets. 2010;2(1):49–50. https://doi.org/10.4161/isl.2.1.10259.
  73. Nicolson TJ, Bellomo EA, Wijesekara N, Loder MK, Baldwin JM, Gyulkhandanyan AV, et al. Insulin storage and glucose homeostasis in mice null for the granule zinc transporter ZnT8 and studies of the type 2 diabetes-associated variants. Diabetes. 2009;58(9):2070–83. https://doi.org/10.2337/db09-0551.
  74. Pound LD, Sarkar SA, Ustione A, Dadi PK, Shadoan MK, Lee CE, et al. The physiological effects of deleting the mouse SLC30A8 gene encoding zinc transporter-8 are influenced by gender and genetic background. PLoS One. 2012;7(7):e40972. https://doi.org/10.1371/journal.pone.0040972.
  75. Pound LD, Sarkar SA, Benninger RK, Wang Y, Suwanichkul A, Shadoan MK, et al. Deletion of the mouse Slc30a8 gene encoding zinc transporter-8 results in impaired insulin secretion. Biochem J. 2009;421(3):371–6. https://doi.org/10.1042/BJ20090530.
  76. Wijesekara N, Dai FF, Hardy AB, Giglou PR, Bhattacharjee A, Koshkin V, et al. Beta cell-specific Znt8 deletion in mice causes marked defects in insulin processing, crystallisation and secretion. Diabetologia. 2010;53(8):1656–68. https://doi.org/10.1007/s00125-010-1733-9.
  77. Lemaire K, Ravier MA, Schraenen A, Creemers JW, Van de Plas R, Granvik M, et al. Insulin crystallization depends on zinc transporter ZnT8 expression, but is not required for normal glucose homeostasis in mice. Proc Natl Acad Sci U S A. 2009;106(35):14872–7. https://doi.org/10.1073/pnas.0906587106.
  78. • Kleiner S, Gomez D, Megra B, Na E, Bhavsar R, Cavino K, et al. Mice harboring the human SLC30A8 R138X loss-of-function mutation have increased insulin secretory capacity. Proc Natl Acad Sci U S A. 2018;115(32):E7642–E9. https://doi.org/10.1073/pnas.1721418115. This paper provides experimental evidence that the SLC30A8 loss of function protects from T2D, confirming one of the earliest predictions from a rare variant T2D association.
  79. Pal A, Barber TM, Van de Bunt M, Rudge SA, Zhang Q, Lachlan KL, et al. PTEN mutations as a cause of constitutive insulin sensitivity and obesity. N Engl J Med. 2012;367(11):1002–11. https://doi.org/10.1056/NEJMoa1113966.
  80. SIGMA Type 2 Diabetes Consortium, Estrada K, Aukrust I, Bjorkhaug L, Burtt NP, Mercader JM, et al. Association of a low-frequency variant in HNF1A with type 2 diabetes in a Latino population. JAMA. 2014;311(22):2305–14. https://doi.org/10.1001/jama.2014.6511.
  81. • Majithia AR, Tsuda B, Agostini M, Gnanapradeepan K, Rice R, Peloso G, et al. Prospective functional classification of all possible missense variants in PPARG. Nat Genet. 2016;48(12):1570–5. https://doi.org/10.1038/ng.3700. This paper introduces a paradigm for systematically characterizing coding variants in a T2D-relevant functional assay, of potential import for the future clinical and biological utility of coding variants in T2D.
  82. Starita LM, Ahituv N, Dunham MJ, Kitzman JO, Roth FP, Seelig G, et al. Variant interpretation: functional assays to the rescue. Am J Hum Genet. 2017;101(3):315–25. https://doi.org/10.1016/j.ajhg.2017.07.014.
  83. Findlay GM, Daza RM, Martin B, Zhang MD, Leith AP, Gasperini M, et al. Accurate classification of BRCA1 variants with saturation genome editing. Nature. 2018;562(7726):217–22. https://doi.org/10.1038/s41586-018-0461-z.

Copyright information

© The Author(s) 2019

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors and Affiliations

  1. Division of Genetics and Genomics, Boston Children’s Hospital, Boston, USA
  2. Department of Pediatrics, Harvard Medical School, Boston, USA
  3. Programs in Medical and Population Genetics and Metabolism, Broad Institute of Harvard and MIT, Cambridge, USA
