Abstract
The general goal of the project is to find and verify new overlapping protein-coding DNA sequences in prokaryotes and to understand the underlying mechanisms with the help of models from information and communication theory. Reaching these goals requires the cooperation of three groups: a group performing in vivo and in vitro molecular biology experiments, an informatics group that can handle the huge amount of widely distributed gene sequence data, and a group working in information and communication theory. With methods from information theory, especially from error-correcting codes, the process of coding proteins via embedded genes will be studied using new distance measures. Further, the concept of random coding will be used to obtain bounds. Embedded genes will be analyzed using a coding-theoretic approach. Communication theory provides models and mechanisms for transmitting information reliably over channels that introduce errors. Evolution, as well as the process of coding proteins by overlapping genes, can be viewed as such a communication system. Both will be described and analyzed using the theory of communication systems, including synchronization mechanisms. The parameters of the models need to be verified or determined, so aspects of bioinformatics and molecular biology are essential. Algorithms will be developed that efficiently search databases at large scale for new protein-coding DNA sequences in prokaryotes that are embedded in annotated genes in overlapping alternative reading frames. Based on these results, selected candidate embedded genes will be evaluated experimentally with molecular biology tools to determine their function.
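To make the idea of a coding sequence embedded in an alternative reading frame concrete, the following is a minimal Python sketch of a naive six-frame scan over one annotated gene. It is not the project's search algorithm, which the abstract does not specify. The function names, the restriction to ATG as start codon, and the `min_codons` threshold are all assumptions chosen for illustration; real prokaryotic gene finding also considers alternative start codons such as GTG and TTG, as well as sequence context.

```python
# Illustrative sketch only: naive scan for open reading frames (ORFs) that
# lie in a frame other than the annotated one. All names are hypothetical.

STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODONS = {"ATG"}  # assumption: ATG only, for simplicity
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def orfs_in_frame(seq, offset, min_codons):
    """Yield (start, end) for each ORF in one frame of seq.

    start is the index of the first start codon after the previous stop,
    end is exclusive and includes the stop codon.
    """
    start = None
    for i in range(offset, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon in START_CODONS:
            start = i
        elif start is not None and codon in STOP_CODONS:
            if (i - start) // 3 >= min_codons:
                yield start, i + 3
            start = None


def find_orfs(seq, min_codons=30):
    """Return (strand, frame, start, end) tuples in forward-strand coordinates."""
    seq = seq.upper()
    n = len(seq)
    hits = []
    for offset in range(3):
        for s, e in orfs_in_frame(seq, offset, min_codons):
            hits.append(("+", offset, s, e))
    rc = reverse_complement(seq)
    for offset in range(3):
        for s, e in orfs_in_frame(rc, offset, min_codons):
            # Map reverse-strand indices back to forward coordinates.
            hits.append(("-", offset, n - e, n - s))
    return hits


def alternative_frame_orfs(gene_seq, min_codons=30):
    """ORFs inside an annotated gene, excluding the gene's own frame (+, 0)."""
    return [h for h in find_orfs(gene_seq, min_codons)
            if (h[0], h[1]) != ("+", 0)]
```

For the toy sequence `ATGCATGCCTAAGTGA` with `min_codons=3`, the scan reports one ORF in the annotated frame and a second one shifted by one nucleotide that overlaps it. Candidates found this way are only hypotheses; as the abstract notes, they still require experimental evaluation.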
Copyright information
© 2018 Springer International Publishing AG
About this chapter
Cite this chapter
Scherer, S., Neuhaus, K., Bossert, M., Mir, K., Keim, D., Simon, S. (2018). Finding New Overlapping Genes and Their Theory (FOG Theory). In: Bossert, M. (eds) Information- and Communication Theory in Molecular Biology. Lecture Notes in Bioengineering. Springer, Cham. https://doi.org/10.1007/978-3-319-54729-9_5
DOI: https://doi.org/10.1007/978-3-319-54729-9_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-54728-2
Online ISBN: 978-3-319-54729-9
eBook Packages: Engineering, Engineering (R0)