Determining DNA sequence similarity using maximum independent set algorithms for interval graphs

  • Deborah Joseph
  • Joao Meidanis
  • Prasoon Tiwari
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 621)


Motivated by the problem of finding similarities in DNA and amino acid sequences, we study a particular class of two-dimensional interval graphs and present an algorithm that finds a maximum weight “increasing” independent set for this class. Our class of interval graphs is a subclass of the graphs with interval number 2. The algorithm we present runs in O(n log n) time, where n is the number of nodes, and its implementation provides a practical solution to a common problem in genetic sequence comparison.
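To make the problem statement concrete, here is a minimal sketch of one standard way to reach an O(n log n) bound for a problem of this shape. It is not taken from the paper and may differ from the authors' algorithm. It rests on several assumptions: each node is a pair of intervals, one on each sequence; "increasing" is read as meaning the chosen nodes can be ordered so that each ends strictly before the next begins in both sequences; and weights are positive. The function name `max_weight_increasing_chain` and the tuple layout are hypothetical, chosen only for illustration.

```python
from bisect import bisect_left


def max_weight_increasing_chain(nodes):
    """Return the best total weight of an 'increasing' set of nodes.

    Assumed input (hypothetical layout): a list of tuples
    (a_start, a_end, b_start, b_end, weight), where [a_start, a_end] lies on
    one sequence and [b_start, b_end] on the other. Node u may precede node v
    only if u.a_end < v.a_start and u.b_end < v.b_start. Weights are assumed
    positive, so the empty set scores 0.
    """
    # Coordinate-compress the b_end values for the Fenwick tree.
    b_ends = sorted({n[3] for n in nodes})
    tree = [0] * (len(b_ends) + 1)  # Fenwick tree storing prefix maxima

    def update(pos, value):  # pos is 1-based
        while pos < len(tree):
            if tree[pos] < value:
                tree[pos] = value
            pos += pos & -pos

    def query(pos):  # maximum over positions 1..pos (0 if pos == 0)
        best = 0
        while pos > 0:
            if tree[pos] > best:
                best = tree[pos]
            pos -= pos & -pos
        return best

    # Sweep along the first sequence. Kind 0 = interval starts, kind 1 = ends.
    # Sorting puts starts before ends at equal coordinates, which enforces
    # the strict inequality u.a_end < v.a_start.
    events = []
    for i, (a_start, a_end, _, _, _) in enumerate(nodes):
        events.append((a_start, 0, i))
        events.append((a_end, 1, i))
    events.sort()

    best_ending_at = [0] * len(nodes)
    answer = 0
    for _, kind, i in events:
        _, _, b_start, b_end, weight = nodes[i]
        if kind == 0:
            # Best chain among already-closed nodes with b_end < b_start.
            rank = bisect_left(b_ends, b_start)
            best_ending_at[i] = weight + query(rank)
            answer = max(answer, best_ending_at[i])
        else:
            # Node i is now closed and can serve as a predecessor.
            update(bisect_left(b_ends, b_end) + 1, best_ending_at[i])
    return answer
```

Sorting the 2n events and performing one logarithmic tree operation per event gives O(n log n) time overall, which matches the bound stated in the abstract under the assumptions above. Recovering the chosen nodes, rather than only the total weight, would additionally require storing a predecessor pointer alongside each tree entry.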


Maximum Weight, Query Sequence, Interval Graph, Interval Number, Response Sequence
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 1992

Authors and Affiliations

  • Deborah Joseph (1)
  • Joao Meidanis (1)
  • Prasoon Tiwari (1)
  1. Computer Sciences Department, University of Wisconsin-Madison, Madison, USA
