Learning Good Edit Similarities with Generalization Guarantees
Similarity and distance functions are essential to many learning algorithms, and learning them has therefore attracted considerable interest. For structured data (e.g., strings or trees), edit similarities are widely used, and there exist a few methods for learning them. However, these methods offer no theoretical guarantee on the generalization performance and discriminative power of the resulting similarities. Recently, a theory of learning with (ε, γ, τ)-good similarity functions was proposed, bridging the gap between the properties of a similarity function and its performance in classification. In this paper, we propose a novel edit similarity learning approach (GESL) driven by the notion of (ε, γ, τ)-goodness, which allows us to derive generalization guarantees using uniform stability. We experimentally show that edit similarities learned with our method induce classification models that are both more accurate and sparser than those induced by the standard edit distance or by edit similarities learned with a state-of-the-art method.
Keywords: Edit Similarity Learning, Good Similarity Functions
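For context, the (ε, γ, τ)-goodness criterion referenced in the abstract can be sketched as follows. This is the margin formulation of Balcan et al., restated here from the framework rather than from this paper; the notation is ours and may differ slightly:

```latex
% Sketch of (eps, gamma, tau)-goodness (Balcan et al.). P is the distribution
% over labeled examples (x, y) with y in {-1, +1}; R(x') indicates membership
% in a set of "reasonable points".
\textbf{Definition.} A similarity function $K : X \times X \to [-1, 1]$ is an
$(\epsilon, \gamma, \tau)$-good similarity function for a problem $P$ if there
exists an indicator $R(x')$ of a set of ``reasonable points'' such that
\[
  \Pr_{(x, y) \sim P}\Big[
    \mathbb{E}_{(x', y') \sim P}\big[\, y\, y'\, K(x, x') \,\big|\, R(x') \big]
    \ge \gamma \Big] \ge 1 - \epsilon,
  \qquad
  \Pr_{x'}\big[ R(x') \big] \ge \tau .
\]
```

Informally: at least a 1 − ε probability mass of examples is, on average, γ more similar to reasonable points of its own class than to reasonable points of the other class, and reasonable points have probability mass at least τ. A key consequence of the theory is that such a similarity can be used as features: each example is mapped to its similarities to a set of landmark points, and an L1-regularized linear separator in that space is both accurate and sparse, which is the sparsity effect mentioned in the abstract. The following is a minimal illustrative sketch of that landmark construction, not the authors' GESL algorithm: it assumes the plain Levenshtein distance, an exponential transform exp(−d) chosen for illustration, a toy dataset, and scikit-learn's L1-penalized linear SVM.

```python
# Minimal sketch (not the paper's GESL implementation) of how a string
# similarity induces a sparse linear classifier via landmark features.
# The dataset, the exp(-d) transform, and the choice of landmarks are
# illustrative assumptions.
import numpy as np
from sklearn.svm import LinearSVC

def levenshtein(s, t):
    """Standard dynamic-programming edit distance between two strings."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        curr = [i]
        for j, ct in enumerate(t, 1):
            curr.append(min(prev[j] + 1,                  # deletion
                            curr[j - 1] + 1,              # insertion
                            prev[j - 1] + (cs != ct)))    # substitution
        prev = curr
    return prev[-1]

def similarity(s, t):
    """Turn the edit distance into a similarity in (0, 1]."""
    return np.exp(-levenshtein(s, t))

# Toy data: strings labeled +1 iff they contain a doubled letter.
strings = ["abba", "abc", "aabb", "bcd", "aab", "bcbc", "bbaa", "abcd"]
labels  = [1, -1, 1, -1, 1, -1, 1, -1]

# Landmark construction: represent each string by its similarities to a
# sample of landmark strings (here, simply the training set itself).
landmarks = strings
X = np.array([[similarity(s, l) for l in landmarks] for s in strings])

# An L1-regularized linear separator in similarity space; the L1 penalty
# drives most landmark weights to zero, yielding a sparse model.
clf = LinearSVC(penalty="l1", loss="squared_hinge", dual=False, C=10.0)
clf.fit(X, labels)
print("nonzero landmark weights:", np.count_nonzero(clf.coef_))
```

Note that this sketch uses fixed unit edit costs; GESL itself learns the edit costs so that the resulting similarity satisfies the goodness criterion, which is what yields the accuracy and sparsity gains reported in the paper.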