Introduction

Comparative Gene Finding

Part of the book series: Computational Biology (COBO, volume 11)

Abstract

This book is meant to serve as an introduction to the new and very exciting field of comparative gene finding. We introduce the field in its current state and walk through the construction of a comparative gene finder by breaking it down into its separate building blocks. Before diving into the algorithmic details, we give a brief introduction to the underlying biological theory. In this chapter we introduce the basic concepts of genetics needed for this book and define the gene finding problem we have set out to solve. We then give a brief account of how approaches to the gene finding problem have developed historically, up to where the field stands today. In the last section we split the process of building a gene finder into its smaller parts; the rest of the book is structured in the same manner.



Author information

Corresponding author

Correspondence to Marina Axelson-Fisk.

Copyright information

© 2010 Springer-Verlag London

About this chapter

Cite this chapter

Axelson-Fisk, M. (2010). Introduction. In: Comparative Gene Finding. Computational Biology, vol 11. Springer, London. https://doi.org/10.1007/978-1-84996-104-2_1

  • DOI: https://doi.org/10.1007/978-1-84996-104-2_1

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-84996-103-5

  • Online ISBN: 978-1-84996-104-2

  • eBook Packages: Computer Science (R0)
