Determining DNA sequence similarity using maximum independent set algorithms for interval graphs

Joseph, Deborah; Meidanis, Joao; Tiwari, Prasoon

doi:10.1007/3-540-55706-7_29

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 621))

Included in the following conference series:

Scandinavian Workshop on Algorithm Theory

Abstract

Motivated by the problem of finding similarities in DNA and amino acid sequences, we study a particular class of two-dimensional interval graphs and present an algorithm that finds a maximum weight “increasing” independent set for this class. Our class of interval graphs is a subclass of the graphs with interval number 2. The algorithm we present runs in O(n log n) time, where n is the number of nodes, and its implementation provides a practical solution to a common problem in genetic sequence comparison.
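
The abstract does not spell out the algorithm, so the following is only a minimal sketch of a closely related, standard technique (weighted fragment chaining with a sweep and a Fenwick tree), not the authors' method. It assumes each node is a matching fragment given as a pair of closed intervals, one on each sequence, with a non-negative weight, and that two fragments may appear together in an "increasing" set only if one ends strictly before the other begins on both sequences. The function name and the tuple layout are hypothetical.

```python
from bisect import bisect_left


def max_weight_increasing_chain(fragments):
    """Return the maximum total weight of an increasing chain of fragments.

    Each fragment is (a_start, a_end, b_start, b_end, weight): a closed
    interval on sequence A, a closed interval on sequence B, and a weight.
    Assumption (not from the paper): fragment p may precede fragment q only
    if p.a_end < q.a_start and p.b_end < q.b_start.
    Runs in O(n log n) time for n fragments.
    """
    if not fragments:
        return 0

    # Coordinate-compress the B end points so they can index a Fenwick tree.
    b_ends = sorted({f[3] for f in fragments})
    size = len(b_ends)
    tree = [0] * (size + 1)  # Fenwick tree storing prefix maxima

    def update(i, value):
        # Record `value` at 1-based position i.
        while i <= size:
            if tree[i] < value:
                tree[i] = value
            i += i & -i

    def query(i):
        # Maximum over positions 1..i (0 if i == 0).
        best = 0
        while i > 0:
            if tree[i] > best:
                best = tree[i]
            i -= i & -i
        return best

    # Sweep along sequence A. At equal coordinates, start events (kind 0)
    # come before end events (kind 1), which enforces the strict inequality.
    events = []
    for idx, (a_start, a_end, _, _, _) in enumerate(fragments):
        events.append((a_start, 0, idx))
        events.append((a_end, 1, idx))
    events.sort()

    score = [0] * len(fragments)
    best = 0
    for _, kind, idx in events:
        _, _, b_start, b_end, weight = fragments[idx]
        if kind == 0:
            # Best chain among fragments already finished on A whose
            # B interval ends strictly before b_start.
            k = bisect_left(b_ends, b_start)
            score[idx] = weight + query(k)
        else:
            # The fragment is now finished on A; make it available.
            update(bisect_left(b_ends, b_end) + 1, score[idx])
            best = max(best, score[idx])
    return best
```

For example, given the fragments `(0, 2, 0, 2, 3)`, `(3, 5, 3, 5, 4)`, `(6, 8, 6, 8, 2)` and `(0, 8, 0, 8, 8)`, the three short fragments form an increasing chain of total weight 9, which beats the single long fragment of weight 8. The Fenwick tree is one possible way to reach the O(n log n) bound under these assumptions; a balanced search tree would serve equally well.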

Supported by NSF Presidential Young Investigator Grant DCR-8451387.

Partially supported by FAPESP, Brazil, under grant 87/0197-2.

Supported by the Wisconsin Alumni Research Foundation and by the National Science Foundation under grant CCR-9024516.

This article was processed using the LaTeX macro package with LLNCS style.

Author information

Joao Meidanis
Present address: Computer Science Dept., State University of Campinas, Cx. Postal 6065, 13081, Campinas-SP, Brazil

Authors and Affiliations

Computer Sciences Department, University of Wisconsin-Madison, 53705, Madison, WI, USA
Deborah Joseph, Joao Meidanis & Prasoon Tiwari

Editor information

Otto Nurmi, Esko Ukkonen

About this paper

Cite this paper

Joseph, D., Meidanis, J., Tiwari, P. (1992). Determining DNA sequence similarity using maximum independent set algorithms for interval graphs. In: Nurmi, O., Ukkonen, E. (eds) Algorithm Theory — SWAT '92. SWAT 1992. Lecture Notes in Computer Science, vol 621. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-55706-7_29

DOI: https://doi.org/10.1007/3-540-55706-7_29
Published: 02 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-55706-7
Online ISBN: 978-3-540-47275-9
eBook Packages: Springer Book Archive
