Junk or functional DNA? ENCODE and the function controversy

Abstract

In its last round of publications in September 2012, the Encyclopedia of DNA Elements (ENCODE) assigned a biochemical function to most of the human genome, which was taken up by the media as meaning the end of ‘Junk DNA’. This provoked a heated reaction from evolutionary biologists, who among other things claimed that ENCODE adopted a wrong and much too inclusive notion of function, making its dismissal of junk DNA merely rhetorical. We argue that this criticism rests on misunderstandings concerning the nature of the ENCODE project, the relevant notion of function and the claim that most of our genome is junk. We argue that evolutionary accounts of function presuppose functions as ‘causal roles’, and that selection is but a useful proxy for relevant functions, which might well be unsuitable to biomedical research. Taking a closer look at the discovery process in which ENCODE participates, we argue that ENCODE’s strategy of biochemical signatures successfully identified activities of DNA elements with an eye towards causal roles of interest to biomedical research. We argue that ENCODE’s controversial claim of functionality should be interpreted as saying that 80 % of the genome is engaging in relevant biochemical activities and is very likely to have a causal role in phenomena deemed relevant to biomedical research. Finally, we discuss ambiguities in the meaning of junk DNA and in one of the main arguments raised for its prevalence, and we evaluate the impact of ENCODE’s results on the claim that most of our genome is junk.

Notes

  1. Because natural selection tends to remove deleterious mutations from the pool, we can expect to observe fewer mutations in DNA sequences important for the survival and reproduction of the organism. As there are a number of technical hurdles in the detection of such selection, estimates vary (some going up to 15 %). For a discussion, see Ponting and Hardison (2011).

  2. See for example http://genomeinformatician.blogspot.ch/2012/09/encode-my-own-thoughts.html and http://www.homolog.us/blogs/blog/2013/04/09/homolog-us-blog-calls-for-sean-eddy-be-fired-for-the-sake-of-good-science/.

  3. In fact, it must be noted that most criticisms are not directly aimed at what is written in ENCODE's scientific publications, which are careful in their formulations, but instead at its interpretation. This applies not only to the mass media, but also to the coverage the results were given in prestigious scientific journals such as Science (Pennisi 2012).

  4. On the day the embargo was lifted on the last round of ENCODE's publications (and therefore long before the publication of the criticisms of ENCODE), Ewan Birney, ENCODE's lead analysis coordinator, published a post on his personal blog providing his personal perspective on the project (Birney 2012a, b). In this post, Birney acknowledges that the proportion of the genome that is “functional” depends on how stringent one is, and preempts some of the most important technical criticisms directed at the project.

  5. Doolittle for instance writes: “Those of us who speak of excess DNA as informationally junk mean that its presence is not to be explained by past and/or current selection at the level of organisms—that it has no informational function construable historically as an SE [selected effect]. Those who say that almost the whole of the human genome is functional informationally do so on the basis of an operational diagnosis embracing a non-historical CR [causal role] definition of function.” (Doolittle 2013, p. 5299).

  6. In a similar way, Weber (2005) has argued that “elucidating the evolutionary history of some system or subsystem is supplementary to analyzing its function; it is not part of it.” (Weber 2005, p. 40) This is easily shown by the fact that exaptations (features which start being used in a way for which they have not been selected) are generally considered functional.

  7. Furthermore, the reader should be aware that different lines of research suggest that the modern synthesis is insufficient to understand evolution. See for instance the work edited by Pigliucci and Müller (2010) on the need to extend the standard evolutionary paradigm. Other directions and suggestions have been explored by Shapiro (2011), Gissis and Jablonka (eds., 2011), and Kauffman (1993, 1996).

  8. Ernst Mayr (1961) proposed a distinction between two research projects within biology, which he labeled functional biology and evolutionary biology. According to Mayr, while functional biology seeks proximate causes and therefore investigates how certain phenomena occur, evolutionary biology is devoted to understanding why these same phenomena are present, i.e. the evolutionary reason for them. Our point is that given traits can be involved in the explanations of functional biology independently of whether they have been selected for. As Cummins puts it: “Flight is a capacity that cries out for explanation in terms of anatomical functions regardless of its contribution to the capacity to maintain the species.” (Cummins 1975, p. 756). We obviously do not claim that Mayr's two domains of biology are insulated from each other, but rather that they pursue two legitimate aims which share many, though not all, of their means (Laland et al. 2011). Each domain is important in the investigation of the other. However, a reduction of all the functional relevance of the genome to its evolutionary dimension (i.e. function as selected effect or biological advantage) fails to give enough attention to the different research projects of the life sciences.

  9. Note that the two-step strategy we propose is not in conflict with Bechtel and Richardson's (2010) strategy of decomposition and localization. According to the latter, scientists decompose a complex phenomenon into less complex subsystems or contributions, and attempt to localize these functions to physical components of the system (e.g. organelles). Our point is that in the step of localization, the physical components are not entirely uncharacterized, and the earlier characterization of their activities provides important hints as to which function localizes where. Our two-step strategy is however to be distinguished from another very common strategy in biology. Mutants identified through reverse genetics, for instance, can establish the relevance of a part in a given phenomenon before identifying any of its activities. Obviously, the strategy we describe is but one of the many general strategies available to biologists.

  10. This has to be understood as making a relevant difference, for any change to DNA makes a phenotypic difference at least insofar as the genome is also part of the structure of the organism. Even a transcription factor binding site in the middle of nowhere, not leading to any transcription, has an effect on relevant gene functions, at least insofar as it sequesters the transcription factor and hence reduces the amount of the protein available for important binding sites. In the same way, non-coding RNAs can influence the expression of coding genes because they are bound by miRNAs which would normally regulate the coding genes (Salmena et al. 2011). However, such impacts may be so small as to be imperceptible.

  11. This is well illustrated by Brenner's (1998) distinction between “junk” and “garbage”: “Some years ago I noticed that there are two kinds of rubbish in the world and that most languages have different words to distinguish them. There is the rubbish we keep, which is junk, and the rubbish we throw away, which is garbage. The excess DNA in our genomes is junk, and it is there because it is harmless, as well as being useless, and because the molecular processes generating extra DNA outpace those getting rid of it. Were the extra DNA to become disadvantageous, it would become subject to selection, just as junk that takes up too much space, or is beginning to smell, is instantly converted to garbage…” (Brenner S, Refuge of spandrels. Curr. Biol. 8: R669, 1998, quoted in Graur et al. 2013, p. 586).

  12. E.g. “Lean Gene Machine”, Scientific American, accessed at http://www.scientificamerican.com/article.cfm?id=lean-gene-machine.

  13. If evidence is required to support this claim, the reader may consider as an example the effect of budget and workforce cuts on the Greek National Health System (Kentikelenis et al. 2011).

  14. This problematic move is also present in another critique of ENCODE's claims, which contrasts the question of “How much DNA does it take to design a human?” with that of “How much DNA does it take to evolve a human?” (Eddy 2013, p. R260), relating the former to function and the latter to junk (see also the interview with Eddy in Diep 2013). Function, however, does not mean ideal design.

  15. Chris Ponting (personal communication) for instance made this claim, but also emphasized the immense difficulty of identifying the remaining 10 % scattered across the genome.

  16. Perhaps the most interesting study regarding this question is that of Nobrega et al. (2004), who deleted two megabase-long non-coding regions of the mouse genome and failed to detect any relevant phenotypic difference.

  17. According to Eddy (2013), “[t]here are three categories of big science: the big experiment, the map, and the leading wedge. A big experiment is driven by a single question or hypothesis test, but requires a large scale community investment. […] A map is a data resource—comprehensive, complete, closed ended—to be used by multiple groups, over a long time, for multiple purposes. […] A leading wedge is a massed technology development effort, in an area where we need radically better methods.” (Eddy 2013, p. R261) While the success of “big experiments” is generally easy to appraise, Eddy deplores that “[w]e have been too shy to defend maps and leading wedges in biology” (Eddy 2013, p. R261).

Acknowledgments

We wish to acknowledge Fridolin Groß, who took part in the many discussions at the origin of this paper and carefully commented on several versions of it. In addition, we wish to thank all those who have read drafts of this paper: Michel Morange, Michael Weisberg, Iros Barozzi, Lorenzo Del Savio, Marcel Weber and the lgBIG group in Geneva (in which the paper was discussed), Alkistis Elliot-Graves and Vera Pendino. We are also thankful to our colleagues of the FOLSATEC programme. Finally, we wish to acknowledge the two anonymous reviewers for their help in improving the text.

Author information

Corresponding author

Correspondence to Pierre-Luc Germain.

About this article

Cite this article

Germain, PL., Ratti, E. & Boem, F. Junk or functional DNA? ENCODE and the function controversy. Biol Philos 29, 807–831 (2014). https://doi.org/10.1007/s10539-014-9441-3

  • DOI: https://doi.org/10.1007/s10539-014-9441-3

Keywords

  • Biological function
  • Causal role
  • Selected effect
  • ENCODE
  • Junk DNA