Soft Computing, Volume 23, Issue 21, pp 10939–10952

An empirical approach for probing the definiteness of kernels

  • Martin Zaefferer
  • Thomas Bartz-Beielstein
  • Günter Rudolph
Methodologies and Application

Abstract

Models like support vector machines or Gaussian process regression often require positive semi-definite kernels. These kernels may be based on distance functions. While definiteness is proven for common distances and kernels, a proof for a new kernel may require too much time and effort for users who simply aim at practical usage. Furthermore, designing definite distances or kernels may be equally intricate. Finally, models can be enabled to use indefinite kernels, but this may degrade the accuracy or increase the computational cost of the model. Hence, an efficient method to determine definiteness is required. We propose an empirical approach. We show that sampling as well as optimization with an evolutionary algorithm may be employed to determine definiteness. We provide a proof of concept with 16 different distance measures for permutations. Our approach allows definiteness to be disproved if a counterexample is found. It can also provide an estimate of how likely it is to obtain indefinite kernel matrices. This provides a simple, efficient tool to decide whether additional effort should be spent on designing or selecting a more suitable kernel or algorithm.
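The sampling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an exponential kernel exp(-theta * d), uses the Hamming distance on permutations as an illustrative choice, and treats a kernel matrix as indefinite when its smallest eigenvalue falls below a small negative tolerance. All function names, parameter values, and the tolerance are hypothetical.

```python
import numpy as np


def hamming_distance(p, q):
    """Number of positions at which two permutations differ."""
    return int(np.sum(np.asarray(p) != np.asarray(q)))


def exp_kernel_matrix(perms, dist, theta=1.0):
    """Kernel matrix K[i, j] = exp(-theta * dist(perms[i], perms[j])).

    The exponential form is an assumption made for this sketch.
    """
    n = len(perms)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = np.exp(-theta * dist(perms[i], perms[j]))
    return K


def min_eigenvalue(K):
    """Smallest eigenvalue of a symmetric matrix (eigvalsh sorts ascending)."""
    return float(np.linalg.eigvalsh(K)[0])


def estimate_indefinite_fraction(dist, perm_len=6, set_size=8,
                                 n_samples=200, theta=1.0,
                                 tol=1e-10, seed=0):
    """Sample random sets of permutations and count indefinite kernel matrices.

    Returns the fraction of sampled matrices whose smallest eigenvalue is
    below -tol, and the smallest eigenvalue seen over all samples.
    A single matrix with a clearly negative eigenvalue is a counterexample
    that disproves definiteness; finding none proves nothing.
    """
    rng = np.random.default_rng(seed)
    n_indefinite = 0
    worst = np.inf
    for _ in range(n_samples):
        perms = [rng.permutation(perm_len) for _ in range(set_size)]
        lam = min_eigenvalue(exp_kernel_matrix(perms, dist, theta))
        worst = min(worst, lam)
        if lam < -tol:
            n_indefinite += 1
    return n_indefinite / n_samples, worst


if __name__ == "__main__":
    fraction, worst = estimate_indefinite_fraction(hamming_distance)
    print("fraction of indefinite matrices:", fraction)
    print("smallest eigenvalue observed:", worst)
```

Any other distance function with the same two-argument signature can be passed in place of hamming_distance. The optimization variant mentioned in the abstract would replace the random sampling loop with a search that minimizes the smallest eigenvalue over sets of permutations.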

Keywords

Definiteness, Kernel, Distance, Sampling, Optimization, Evolutionary algorithm

Notes

Compliance with ethical standards

Conflict of interest

All authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. Faculty of Computer Science and Engineering Science, TH Köln - University of Applied Sciences, Gummersbach, Germany
  2. Department of Computer Science, TU Dortmund University, Dortmund, Germany
