Skip to main content

Towards Normalizing the Edit Distance Using a Genetic Algorithms–Based Scheme

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 7713))

Abstract

The normalized edit distance is one of the distances derived from the edit distance. It is useful in some applications because it takes into account the lengths of the two strings compared. The normalized edit distance is not defined in terms of edit operations but rather in terms of the edit path. In this paper we propose a new derivative of the edit distance that also takes into consideration the lengths of the two strings, but the new distance is related directly to the edit distance. The particularity of the new distance is that it uses the genetic algorithms to set the values of the parameters it uses. We conduct experiments to test the new distance and we obtain promising results.

This work was carried out during the tenure of an ERCIM “Alain Bensoussan” Fellowship Programme. This Programme is supported by the Marie-Curie Co-funding of Regional, National and International Programmes (COFUND) of the European Commission.

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD   54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Affenzeller, M., Winkler, S., Wagner, S., Beham, A.: Genetic Algorithms and Genetic Programming Modern Concepts and Practical Applications. Chapman and Hall/CRC (2009)

    Google Scholar 

  2. Haupt, R.L., Haupt, S.E.: Practical Genetic Algorithms with CD-ROM. Wiley-Interscience (2004)

    Google Scholar 

  3. Keogh, E., Chakrabarti, K., Pazzani, M., Mehrotra, S.: Dimensionality Reduction for Fast Similarity Search in Large Time Series Databases. J. of Know. and Inform. Sys (2000)

    Google Scholar 

  4. Keogh, E., Zhu, Q., Hu, B., Hao. Y., Xi, X., Wei, L., Ratanamahatana,C.A.: The UCR Time Series Classification/Clustering Homepage (2011), www.cs.ucr.edu/~eamonn/time_series_data/

  5. Kurtz, S. : Lecture Notes for Foundations of Sequence Analysis (2001)

    Google Scholar 

  6. Laguna, M., Marti, R.: Scatter Search: Methodology and Implementations in C. Springer, Heidelberg (2003)

    Book  Google Scholar 

  7. Lin, J., Keogh, E., Lonardi, S., Chiu, B.Y.: A Symbolic Representation of Time Series, with Implications for Streaming Algorithms. DMKD, 2–11 (2003)

    Google Scholar 

  8. Marzal, A., Vidal, E., Computation, E.: of Normalized Edit Distances and Applications. IEEE Transactions on Pattern Analysis and Machine Intelligence 15(9), 926–932 (1993)

    Article  Google Scholar 

  9. Mitchell, M.: An Introduction to Genetic Algorithms. MIT Press, Cambridge (1996)

    Google Scholar 

  10. Muhammad Fuad, M.M., Marteau, P.-F.: Extending the Edit Distance Using Frequencies of Common Characters. In: Bhowmick, S.S., Küng, J., Wagner, R. (eds.) DEXA 2008. LNCS, vol. 5181, pp. 150–157. Springer, Heidelberg (2008)

    Chapter  Google Scholar 

  11. Muhammad Fuad, M.M., Marteau, P.F.: The Extended Edit Distance Metric. In: Sixth International Workshop on Content-Based Multimedia Indexing (CBMI 2008), London, UK, June 18-20 (2008)

    Google Scholar 

  12. Muhammad Fuad, M.M., Marteau, P.F.: The Multi-resolution Extended Edit Distance. In: Third International ICST Conference on Scalable Information Systems, Infoscale, 2008, ACM Digital Library, Vico Equense, Italy, June 4-6 (2008)

    Google Scholar 

  13. Wagner, R.A., Fischer, M.J.: The String-to-String Correction Problem. Journal of the Association for Computing Machinery 21(I), 168–173 (1974)

    Article  MathSciNet  MATH  Google Scholar 

  14. Yi, B.K., Faloutsos, C.: Fast Time Sequence Indexing for Arbitrary Lp norms. In: Proceedings of the 26st International Conference on Very Large Databases, Cairo, Egypt (2000)

    Google Scholar 

  15. Yu, X., Gen, M.: Introduction to Evolutionary Algorithms. Springer (2010)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Muhammad Fuad, M.M. (2012). Towards Normalizing the Edit Distance Using a Genetic Algorithms–Based Scheme. In: Zhou, S., Zhang, S., Karypis, G. (eds) Advanced Data Mining and Applications. ADMA 2012. Lecture Notes in Computer Science(), vol 7713. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35527-1_40

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-35527-1_40

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35526-4

  • Online ISBN: 978-3-642-35527-1

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics