Permutation Editing and Matching via Embeddings

  • Graham Cormode
  • S. Muthukrishnan
  • Süleyman Cenk Sahinalp
Conference paper

DOI: 10.1007/3-540-48224-5_40

Volume 2076 of the book series Lecture Notes in Computer Science (LNCS)
Cite this paper as:
Cormode G., Muthukrishnan S., Sahinalp S.C. (2001) Permutation Editing and Matching via Embeddings. In: Orejas F., Spirakis P.G., van Leeuwen J. (eds) Automata, Languages and Programming. ICALP 2001. Lecture Notes in Computer Science, vol 2076. Springer, Berlin, Heidelberg

Abstract

If the genetic maps of two species are modelled as permutations of (homologous) genes, the number of chromosomal rearrangements (deletions, block moves, inversions, etc.) needed to transform one such permutation into another can be used as a measure of their evolutionary distance. Motivated by such scenarios, we study the problems of computing distances between permutations, matching permutations in sequences, and finding the most similar permutation in a collection (“nearest neighbor”).

We adopt a general approach: we embed the relevant permutation distances into well-known vector spaces in an approximately distance-preserving manner, and then solve the resulting problems in those spaces. Our results are as follows:
  1. We present the first known approximately distance-preserving embeddings of these permutation distances into well-known spaces.

  2. Using these embeddings, we obtain several results, including the first known efficient solution for approximately solving nearest neighbor problems with permutations and the first known algorithms for finding permutation distances in the “data stream” model.

  3. We consider a novel class of problems called permutation matching problems, which are similar to string matching problems except that the pattern is a permutation (rather than a string), and present linear or near-linear time algorithms for approximately solving them; in contrast, the corresponding string problems take significantly longer.

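The embedding approach described above can be illustrated with a minimal sketch. It is not the construction from the paper, whose details the abstract does not give. It assumes unsigned permutations of 1..n, pads each with the sentinels 0 and n+1, and represents a permutation by its set of unordered adjacent pairs, which is equivalent to a 0/1 indicator vector over all pairs. The Hamming distance between two such vectors is the size of the symmetric difference of the sets. A single reversal removes at most two adjacencies and creates at most two, so it changes this Hamming distance by at most 4. The function names `adjacency_set`, `hamming_distance`, and `reverse_segment` are hypothetical and chosen for illustration only.

```python
def adjacency_set(perm):
    """Return the unordered adjacent pairs of perm, padded with sentinels.

    Assumption: perm is a permutation of 1..n. The sentinels 0 and n+1
    are added so that the two ends of the permutation also count.
    """
    n = len(perm)
    extended = [0] + list(perm) + [n + 1]
    return {frozenset(pair) for pair in zip(extended, extended[1:])}


def hamming_distance(p, q):
    """Hamming distance between the adjacency indicator vectors of p and q.

    This equals the size of the symmetric difference of the adjacency sets.
    """
    return len(adjacency_set(p) ^ adjacency_set(q))


def reverse_segment(perm, i, j):
    """Return a copy of perm with the slice perm[i:j] reversed."""
    return perm[:i] + perm[i:j][::-1] + perm[j:]
```

For example, reversing the middle three elements of `[1, 2, 3, 4, 5]` gives `[1, 4, 3, 2, 5]`, and `hamming_distance` between the two is 4: the adjacencies {1,2} and {4,5} are lost, and {1,4} and {2,5} are gained. Once permutations are represented this way, a nearest-neighbor query can in principle be posed in Hamming space rather than on the permutations directly.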

Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Graham Cormode (1)
  • S. Muthukrishnan (2)
  • Süleyman Cenk Sahinalp (3)
  1. University of Warwick, Coventry, UK
  2. AT&T Research, USA
  3. EECS, Case Western Reserve University, Cleveland