Abstract
Raw experimental data in machine learning increasingly appear as direct pairwise comparisons between objects (featureless data). In practice, many different measures are used to evaluate the difference or similarity of a pair of objects in image and data mining, image analysis, bioinformatics, and other fields. Nevertheless, such comparisons are often not true distances or correlations (scalar products), that is, they are not correct functions defined on a finite set of elements. This problem is referred to as metric violations in ill-posed matrices. It is therefore necessary to recover the violated metric and to provide optimal conditionality of the corresponding matrices of pairwise comparisons, for both distances and similarities. This gives a correct basis for applying modern machine learning algorithms.
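As an illustration of what a metric violation and poor conditionality look like in practice, the following minimal sketch checks a matrix of pairwise distances. It assumes the classical Torgerson double-centering test (a distance matrix is Euclidean only if the centered matrix of scalar products is positive semidefinite); it is not the authors' correction procedure, which the abstract does not spell out. The function names, the tolerance, and the toy matrices are hypothetical.

```python
import numpy as np


def gram_from_distances(d):
    """Torgerson double centering: B = -1/2 * J D^2 J, with J = I - 11^T / n."""
    d = np.asarray(d, dtype=float)
    n = d.shape[0]
    j = np.eye(n) - np.ones((n, n)) / n
    return -0.5 * j @ (d ** 2) @ j


def metric_violation_report(d, tol=1e-9):
    """Report negative eigenvalues and the conditioning of the scalar-product matrix.

    Assumption: eigenvalues within tol * scale of zero are treated as zero.
    """
    b = gram_from_distances(d)
    eigvals = np.linalg.eigvalsh(b)  # symmetric input, eigenvalues in ascending order
    scale = max(np.abs(eigvals).max(), 1.0)
    negative = eigvals[eigvals < -tol * scale]
    positive = eigvals[eigvals > tol * scale]
    # Ratio of the largest to the smallest strictly positive eigenvalue.
    cond = positive.max() / positive.min() if positive.size else float("inf")
    return {
        "is_euclidean": negative.size == 0,
        "min_eigenvalue": float(eigvals[0]),
        "condition_number": float(cond),
    }


# Distances between the points (0,0), (3,0), (0,4): a valid Euclidean configuration.
valid = [[0, 3, 4], [3, 0, 5], [4, 5, 0]]
# Toy comparisons that break the triangle inequality (1 + 1 < 5).
violated = [[0, 1, 5], [1, 0, 1], [5, 1, 0]]

print(metric_violation_report(valid))
print(metric_violation_report(violated))
```

A negative eigenvalue signals that the comparisons cannot be realized as distances between points in a Euclidean space, and a large condition number signals that the matrix of scalar products is close to singular on its positive part.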
Additional information
The article is published in the original.
Sergei Danilovich Dvoenko. Completed postgraduate studies at the Institute of Control Science, Russian Academy of Sciences. Received his candidate's degree in 1992 and the title of associate professor in 1998. Completed doctoral studies at the Tula State University and received his doctor's degree in 2002 at the Dorodnitsyn Computing Centre, Russian Academy of Sciences. Professor at the Tula State University. Member of the Russian organization “Association for Pattern Recognition and Image Analysis.” His scientific interests include machine learning and pattern recognition, cluster analysis and data mining, image processing, and hidden Markov models and fields in applied problems.
Denis Olegovich Pshenichny. Born in 1991. PhD student at the Tula State University. His scientific interests include machine learning and pattern recognition, and intelligent data analysis based on matrices of pairwise comparisons.
About this article
Cite this article
Dvoenko, S.D., Pshenichny, D.O. On Metric Correction and Conditionality of Raw Featureless Data in Machine Learning. Pattern Recognit. Image Anal. 28, 595–604 (2018). https://doi.org/10.1134/S1054661818040089
DOI: https://doi.org/10.1134/S1054661818040089