Abstract
This paper proposes a novel method for dimensionality reduction based on a function that approximates the Euclidean distance using the norm and angle components of a vector. First, we identify the causes of errors in angle estimation when approximating the Euclidean distance, and discuss basic solutions to reduce those errors. Then, we propose a new method for dimensionality reduction that composes a set of subvectors from a feature vector and maintains only the norm and the estimated angle for every subvector. The selection of a good reference vector is important for accurate estimation of the angle component. We present criteria for a good reference vector, and propose a method that chooses one by using the Levenberg-Marquardt algorithm. We also define a novel distance function and formally prove that it consistently lower-bounds the Euclidean distance, which implies that our approach does not incur any false dismissals when reducing the dimensionality. Finally, we verify the superiority of the proposed approach through extensive experiments.
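The lower-bounding idea can be illustrated with a minimal Python sketch. It is not the distance function defined in the paper, and several details are assumptions made for illustration: the feature vector is split into contiguous, equal-sized subvectors; the angle of each subvector is measured exactly against the matching slice of a fixed reference vector (the paper works with an estimated angle and selects the reference with the Levenberg-Marquardt algorithm, which is not shown here); and the names `reduce_vector` and `lower_bound_distance` are hypothetical. The bound relies on a standard fact: if two vectors make angles a and b with the same reference, the angle between them is at least |a - b|, so the law of cosines evaluated at |a - b| cannot exceed the true squared distance.

```python
import math


def reduce_vector(x, ref, num_parts):
    """Keep only (norm, angle to reference) for each contiguous subvector.

    Assumption: len(x) == len(ref) and len(x) is divisible by num_parts.
    """
    size = len(x) // num_parts
    reduced = []
    for p in range(num_parts):
        sub = x[p * size:(p + 1) * size]
        r = ref[p * size:(p + 1) * size]
        norm = math.sqrt(sum(v * v for v in sub))
        r_norm = math.sqrt(sum(v * v for v in r))
        if norm == 0.0 or r_norm == 0.0:
            # Angle is undefined for a zero vector; 0.0 keeps the bound valid.
            angle = 0.0
        else:
            cos = sum(a * b for a, b in zip(sub, r)) / (norm * r_norm)
            # Clamp to [-1, 1] to guard against floating-point drift.
            angle = math.acos(max(-1.0, min(1.0, cos)))
        reduced.append((norm, angle))
    return reduced


def lower_bound_distance(rx, ry):
    """Lower bound on the Euclidean distance from two reduced representations."""
    total = 0.0
    for (nx, ax), (ny, ay) in zip(rx, ry):
        # Law of cosines with the smallest angle the two subvectors could form.
        total += nx * nx + ny * ny - 2 * nx * ny * math.cos(ax - ay)
    return math.sqrt(max(total, 0.0))
```

Under these assumptions, a filter-and-refine search could discard any candidate whose `lower_bound_distance` to the query already exceeds the search radius, and compute the exact distance only for the remaining candidates; because the bound never exceeds the true distance, no qualifying object is dismissed.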
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jeong, S., Kim, SW., Choi, BU. (2007). Dimensionality Reduction in High-Dimensional Space for Multimedia Information Retrieval. In: Wagner, R., Revell, N., Pernul, G. (eds) Database and Expert Systems Applications. DEXA 2007. Lecture Notes in Computer Science, vol 4653. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74469-6_40
DOI: https://doi.org/10.1007/978-3-540-74469-6_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74467-2
Online ISBN: 978-3-540-74469-6
eBook Packages: Computer Science, Computer Science (R0)