Comparative Gene Finding

Part of the book series: Computational Biology (COBO, volume 20)

Abstract

Comparative gene finding is a relatively new field within computational biology. Comparative gene finders have considerable advantages over their single-species predecessors, including higher prediction accuracy and the ability to annotate a wider variety of gene features that previously eluded computational approaches. In Chap. 2 we described some of the most common algorithms underlying single-species gene finding. In this chapter we present some of the corresponding algorithms for comparative gene finding, ranging from similarity-based techniques, through pair hidden Markov models and generalized pair hidden Markov models, to gene mapping. Lastly, we present some of the first attempts to extend the pairwise approaches to multiple-sequence gene finding. Each section concludes with an example of gene finding software that uses the method in question.

Author information

Correspondence to Marina Axelson-Fisk.

Copyright information

© 2015 Springer-Verlag London

About this chapter

Cite this chapter

Axelson-Fisk, M. (2015). Comparative Gene Finding. In: Comparative Gene Finding. Computational Biology, vol 20. Springer, London. https://doi.org/10.1007/978-1-4471-6693-1_4

  • DOI: https://doi.org/10.1007/978-1-4471-6693-1_4

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-6692-4

  • Online ISBN: 978-1-4471-6693-1

  • eBook Packages: Computer Science, Computer Science (R0)
