Abstract
Nearest neighbor search is one of the most fundamental problems in machine learning, machine vision, clustering, and information retrieval. Handling a dataset of a million or more records requires efficient storage and retrieval techniques. Binary codes address both needs: they are compact to store and fast to compare. Recently, the problem of finding good binary codes was formulated and solved, resulting in a technique called spectral hashing [21]. In this work we analyze spectral hashing, identify its possible shortcomings, and propose solutions. Experimental results are promising.
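As a rough illustration of why binary codes help with both storage and retrieval, the sketch below packs each record into an integer code and compares codes by Hamming distance. It is not spectral hashing and not the method of this paper: the per-coordinate thresholding in `encode`, the toy records, and all function names are assumptions made only for this example.

```python
from typing import List, Sequence


def encode(point: Sequence[float], thresholds: Sequence[float]) -> int:
    """Pack one bit per coordinate into an int (hypothetical toy encoder).

    Bit i is set when point[i] exceeds thresholds[i].
    """
    code = 0
    for i, (x, t) in enumerate(zip(point, thresholds)):
        if x > t:
            code |= 1 << i
    return code


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two packed codes differ."""
    return bin(a ^ b).count("1")


def nearest_by_code(query_code: int, codes: List[int]) -> int:
    """Index of the stored code with the smallest Hamming distance to the query."""
    return min(range(len(codes)), key=lambda i: hamming(query_code, codes[i]))


# Toy data: four 3-dimensional records, thresholded at zero on each axis.
records = [(0.9, -0.2, 0.4), (-0.5, 0.7, -0.1), (0.8, 0.3, 0.5), (-0.6, -0.9, -0.3)]
thresholds = (0.0, 0.0, 0.0)
codes = [encode(r, thresholds) for r in records]

# The query is encoded the same way, then matched against the stored codes.
query = (1.0, -0.1, 0.2)
best = nearest_by_code(encode(query, thresholds), codes)
```

Each record here occupies three bits instead of three floats, and a comparison is a single XOR followed by a bit count. The linear scan in `nearest_by_code` is kept for clarity only; how the bits are chosen is the part that spectral hashing [21] addresses.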
References
Andoni, A., Datar, M., Immorlica, N., Indyk, P., Mirrokni, V.: Locality-Sensitive Hashing Scheme Based on p-Stable Distributions. In: Nearest Neighbor Methods in Learning and Vision: Theory and Practice. MIT Press, Cambridge (2006)
Beis, J.S., Lowe, D.G.: Shape indexing using approximate nearest-neighbour search in high-dimensional spaces. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (1997)
Beygelzimer, A., Kakade, S., Langford, J.: Cover trees for nearest neighbor. In: Proceedings of the International Conference on Machine Learning (ICML) (2006)
Chen, Y.-S., Hung, Y.-P., Fuh, C.-S.: Fast algorithm for nearest neighbor search based on a lower bound tree. In: ICCV, pp. 446–453 (2001)
Dasgupta, S., Freund, Y.: Random projection trees and low dimensional manifolds. In: Annual ACM Symposium on Theory of Computing (STOC), pp. 537–546 (2008)
Devroye, L.: Non-Uniform Random Variate Generation. Springer-Verlag New York Inc., New York (1986)
Friedman, J.H., Bentley, J.L., Finkel, R.A.: An algorithm for finding best matches in logarithmic expected time. ACM Transactions on Mathematical Software 3(3), 209–226 (1977)
Gionis, A., Indyk, P., Motwani, R.: Similarity search in high dimensions via hashing. In: VLDB 1999, Proceedings of 25th International Conference on Very Large Data Bases (1999)
Goldstein, J., Platt, J.C., Burges, C.J.C.: Redundant bit vectors for quickly searching high-dimensional regions. In: Winkler, J.R., Niranjan, M., Lawrence, N.D. (eds.) Deterministic and Statistical Methods in Machine Learning. LNCS (LNAI), vol. 3635, pp. 137–158. Springer, Heidelberg (2005)
Gonzalez, R.C., Woods, R.E.: Digital Image Processing. Prentice-Hall, N.J. (2002)
Indyk, P., Motwani, R.: Approximate nearest neighbors: Towards removing the curse of dimensionality. In: STOC, pp. 604–613 (1998)
Jégou, H., Douze, M., Schmid, C.: Hamming embedding and weak geometric consistency for large scale image search. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008, Part I. LNCS, vol. 5302, pp. 304–317. Springer, Heidelberg (2008)
Lowe, D.G.: Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision (IJCV) 60(2), 91–110 (2004)
McNames, J.: A fast nearest-neighbor algorithm based on a principal axis search tree. IEEE Transactions on Pattern Analysis and Machine Intelligence 23(9), 964–976 (2001)
Salakhutdinov, R.R., Hinton, G.E.: Learning a nonlinear embedding by preserving class neighborhood structure. In: AI and Statistics (2007)
Salakhutdinov, R.R., Hinton, G.E.: Semantic hashing. In: SIGIR Workshop on Information Retrieval and Applications of Graphical Models (2007)
Schölkopf, B., Smola, A., Müller, K.-R.: Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation 10, 1299–1319 (1998)
Shakhnarovich, G., Viola, P., Darrell, T.: Fast pose estimation with parameter-sensitive hashing. In: International Conference on Computer Vision (ICCV) (2003)
Silpa-Anan, C., Hartley, R.: Optimised kd-trees for fast image descriptor matching. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2008)
Torralba, A., Fergus, R., Weiss, Y.: Small codes and large image databases for recognition. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2008)
Weiss, Y., Torralba, A., Fergus, R.: Spectral hashing. In: Advances in Neural Information Processing Systems (NIPS) (2008)
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Marukatat, S., Sinthupinyo, W. (2011). Improved Spectral Hashing. In: Huang, J.Z., Cao, L., Srivastava, J. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2011. Lecture Notes in Computer Science, vol 6634. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20841-6_14
DOI: https://doi.org/10.1007/978-3-642-20841-6_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20840-9
Online ISBN: 978-3-642-20841-6
eBook Packages: Computer Science, Computer Science (R0)