Applying Logic Programming to Derive Novel Functional Information of Genomes

  • Arvind K. Bansal
  • Peer Bork
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1551)


This paper describes an application of the logic programming paradigm to the large-scale comparison of complete microbial genomes, each containing four million amino acid characters and approximately two thousand genes. We present algorithms and a SICStus Prolog-based implementation that model genome comparison as bipartite graph matching in order to identify orthologs (genes in different genomes that have the same function), groups of orthologous genes (orthologous genes in close proximity), and gene duplications. The application is modular and integrates logic programming with Unix-based programming and a Prolog-based text-processing library developed during this project. The scheme has been successfully applied to compare eukaryotes such as yeast. The data generated by the software are being used by microbiologists and computational biologists to understand the regulation mechanisms and metabolic pathways of microbial genomes.
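To make the bipartite-graph view of ortholog identification concrete, the following is a minimal sketch in Python rather than the Prolog used in the paper. It does not reproduce the authors' algorithm, which the abstract does not spell out. Instead it uses a simple stand-in heuristic, reciprocal best-hit selection: genes of two genomes form the two sides of a weighted bipartite graph, and a pair is reported only when each gene is the other's highest-scoring partner. The function name `reciprocal_best_hits`, the gene labels, and the similarity scores are all hypothetical and chosen for illustration only.

```python
def reciprocal_best_hits(scores):
    """Select mutually best-scoring gene pairs from a weighted bipartite graph.

    scores maps (gene_in_genome_a, gene_in_genome_b) -> similarity score,
    where a higher score means a more similar pair. Ties are resolved in
    favour of the edge seen first (an arbitrary choice made for this sketch).
    """
    best_for_a = {}  # gene in genome A -> (best partner in B, score)
    best_for_b = {}  # gene in genome B -> (best partner in A, score)

    for (a, b), score in scores.items():
        if a not in best_for_a or score > best_for_a[a][1]:
            best_for_a[a] = (b, score)
        if b not in best_for_b or score > best_for_b[b][1]:
            best_for_b[b] = (a, score)

    # Keep only the edges on which both endpoints agree.
    pairs = [
        (a, b, score)
        for a, (b, score) in best_for_a.items()
        if best_for_b[b][0] == a
    ]
    return sorted(pairs)


# Hypothetical similarity scores between genes of two toy genomes.
toy_scores = {
    ("a1", "b1"): 90.0,
    ("a1", "b2"): 40.0,
    ("a2", "b1"): 85.0,
    ("a2", "b3"): 30.0,
    ("a3", "b3"): 70.0,
}

candidate_orthologs = reciprocal_best_hits(toy_scores)
print(candidate_orthologs)
```

In this toy input, gene `a2` scores highest against `b1`, but `b1` scores higher against `a1`, so `a2` is left unmatched. A gene left over in this way, yet still strongly similar to an already matched gene, is the kind of case a fuller analysis might examine as a possible duplication.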


declarative programming, gene groups, genome comparison, logic programming, metabolic pathway, microbes, Prolog application, operons, orthologs





Copyright information

© Springer-Verlag Berlin Heidelberg 1998

Authors and Affiliations

  1. Department of Mathematics and Computer Science, Kent State University, Kent, USA
  2. Computer Informatics Division, European Molecular Biology Laboratory, Heidelberg, Germany
