Phrase Similarity through the Edit Distance
This work aims to capture the notion of similarity between phrases. The algorithm follows a dynamic programming approach that integrates both the edit distance between parse trees and single-term similarity. We stress the use of the underlying grammatical structure, which guides the computation of semantic similarity between words. This proposal yields a more accurate notion of semantic proximity at the sentence level, without increasing the complexity of the pattern-matching algorithm on which it is based.
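To make the idea concrete, the following is a minimal sketch of how a tree edit distance can be combined with a single-term similarity. It is not the authors' algorithm: it uses a simplified top-down (Selkow-style) tree edit distance, in which only whole subtrees are inserted or deleted, and the relabelling cost of two nodes is taken to be one minus their similarity. The names `TOY_SIMILARITY`, `term_similarity`, `size` and `tree_distance`, the tuple encoding of parse trees, and the similarity values are all assumptions made for illustration; a real system would presumably draw word similarity from a lexical resource such as WordNet.

```python
from functools import lru_cache

# Hypothetical similarity scores standing in for a lexical-database measure.
# The values are invented for illustration only.
TOY_SIMILARITY = {
    frozenset(("dog", "cat")): 0.8,
    frozenset(("boy", "child")): 0.9,
}


def term_similarity(a, b):
    """Return an assumed similarity in [0, 1] between two node labels."""
    if a == b:
        return 1.0
    return TOY_SIMILARITY.get(frozenset((a, b)), 0.0)


def size(tree):
    """Count the nodes of a tree encoded as (label, (child, child, ...))."""
    _, children = tree
    return 1 + sum(size(child) for child in children)


@lru_cache(maxsize=None)
def tree_distance(t1, t2):
    """Top-down tree edit distance with similarity-weighted relabelling.

    Deleting or inserting a subtree costs its number of nodes; relabelling
    a node costs 1 - term_similarity(label1, label2).
    """
    label1, kids1 = t1
    label2, kids2 = t2
    relabel = 1.0 - term_similarity(label1, label2)

    m, n = len(kids1), len(kids2)
    # d[i][j]: cost of editing the first i children of t1
    # into the first j children of t2.
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + size(kids1[i - 1])
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + size(kids2[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j] + size(kids1[i - 1]),   # delete a subtree
                d[i][j - 1] + size(kids2[j - 1]),   # insert a subtree
                d[i - 1][j - 1] + tree_distance(kids1[i - 1], kids2[j - 1]),
            )
    return relabel + d[m][n]


def phrase(noun):
    """Build a toy parse tree for 'the <noun> sleeps'."""
    return ("S", (("NP", (("the", ()), (noun, ()))),
                  ("VP", (("sleeps", ()),))))


dog_tree, cat_tree, car_tree = phrase("dog"), phrase("cat"), phrase("car")
print(round(tree_distance(dog_tree, cat_tree), 3))
print(round(tree_distance(dog_tree, car_tree), 3))
```

In this sketch the two phrases that differ only in a pair of related nouns end up closer (distance 1 - 0.8, up to floating-point rounding) than two phrases that differ in unrelated nouns (distance 1), while the grammatical structure decides which words are compared with each other. The dynamic programming table over child sequences has the same shape as the one used for string edit distance, so weighting the relabelling cost does not change the number of table cells that are filled.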