
1 Introduction

The increasing availability of data in diverse domains has created the need for techniques and methods to discover knowledge in massive volumes of complex data, motivating much research in the database, machine learning, and information retrieval communities. This has driven the development of scalable and efficient techniques to organize and retrieve such data. Similarity search has been the traditional approach to information retrieval. Although several similarity search algorithms have been proposed to speed up similarity queries, most of them are affected by the well-known “curse of dimensionality”: retrieving complex data becomes unstable when the data dimensionality is too high [3].

One of the few approaches that ensure an approximate solution with sublinear search cost for high-dimensional data is Locality Sensitive Hashing (LSH) [1]. LSH is based on the idea that closeness between two objects is usually preserved by a random projection operation; in other words, if two objects are close together in their original space, then they will remain close after a scalar projection operation [12]. However, it presents some difficulties for approximate kNN queries, in particular a dependence on data-domain parameters and on the quality of the results. In complex domains, and in particular in high-dimensional data problems, an approximate solution with a solid theoretical analysis may therefore be the best option in many application areas because of its efficiency in time and space.

On the other hand, in machine learning images have traditionally been described by hand-crafted visual features. However, these hand-crafted features cannot fully capture the high-level semantic meaning (labels or tags) of images, and they often limit the performance of image retrieval [9]. Inspired by recent advances in Convolutional Neural Networks (CNN) [8], many methods address the precision of similarity retrieval by using a CNN as a feature extractor and then building a compact similarity-preserving hash code for fast image retrieval. Hashing is widely used for large-scale image retrieval, as well as for video and document searches, because the compact representation of a hash code is essential for data storage and practical for query processing [14]. However, these supervised hashing methods still have drawbacks that have not been solved entirely:

  • There is a trade-off between classification error and quantization error: activations of lower layers are more general-purpose [16], so training on them is more effective. However, lower layers have larger activation maps (many nodes), which are harder to encode, leading to a compromise.

  • Approximate similarity search schemes based on LSH depend on parameter values that determine the number of hash functions and the number of hash tables.

This paper proposes a novel supervised hashing technique, named Deep frActal based Hashing (DAsH), designed to perform scalable approximate similarity search. The contributions of our work are as follows. First, we introduce and define a scheme based on a CNN and optimized using fractal theory. To overcome the limitation of the large activations on lower layers of the CNN (the output of the last convolutional layer), we reduce their dimensionality with autoencoders to the optimal sub-space, and then index this new representation with an LSH scheme. Second, we present a novel method, based on fractal theory, which allows us to find the optimal number of hash functions for an approximate similarity search scheme based on LSH.

The paper is organized as follows. Section 2 summarizes the background for this work. Section 3 describes the proposed technique and Sect. 4 reports experimental results on real and synthetic datasets. Finally, we conclude in Sect. 5.

2 Locality Sensitive Hashing

Previous work [1] has explored the idea of hashing objects and grouping them into buckets with the goal of performing approximate similarity search within buckets associated with the query element. The idea behind LSH is that if two objects are close together in their original space, then these two objects will remain close after a scalar projection operation [12]. Hence, let h(x) be a hash function that maps a d-dimensional point x to a one-dimensional value. The function h(x) is said to be locality sensitive if the probability of mapping two d-dimensional points \(x_1\), \(x_2\) to the same value grows as their distance \(d(x_1, x_2)\) decreases.
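The following is a minimal sketch of such a locality sensitive hash family for the Euclidean case, in the style of E2LSH: each hash projects x onto a random direction and quantizes the projection into buckets of width w. The class name, the bucket width, and the dimensions are illustrative choices, not part of the original formulation.

```python
import numpy as np

class L2HashFamily:
    """Sketch of an E2LSH-style hash family for Euclidean distance.

    Each hash is h(x) = floor((a . x + b) / w): x is projected onto a random
    direction a and the projection is quantized into buckets of width w, so
    close points are likely to fall into the same bucket.
    """

    def __init__(self, dim, n_hashes, w=4.0, seed=0):
        rng = np.random.default_rng(seed)
        self.a = rng.standard_normal((n_hashes, dim))  # random projection directions
        self.b = rng.uniform(0.0, w, size=n_hashes)    # random offsets in [0, w)
        self.w = w

    def hash(self, x):
        # one integer bucket id per hash function
        return np.floor((self.a @ x + self.b) / self.w).astype(int)

# Two nearby points collide on more hash functions than two distant ones.
h = L2HashFamily(dim=128, n_hashes=8)
x1 = np.random.rand(128)
x2 = x1 + 0.01 * np.random.rand(128)
print(int((h.hash(x1) == h.hash(x2)).sum()), "of 8 hashes collide")
```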

LSH-based methods report efficient results when adequate values for m (the number of hash functions) and L (the number of indexes) are chosen. The \(E^2\)-LSH algorithm finds the best values for m and L by experimentally evaluating the query cost on samples of the given dataset. Basically, the tuning parameters of LSH are chosen as a function of the dataset, so as to minimize the running time of a query while keeping the space requirement within the memory bounds [13].

2.1 Fractal Theory

A fractal is characterized by the self-similarity property, i.e., it is an object that presents roughly the same characteristics when analyzed over a broad range of scales [6]. Within fractal theory, the Correlation Fractal Dimension \(\mathfrak {D}\) is particularly useful for data analysis, since it can be applied to estimate the intrinsic dimension of real datasets that exhibit fractal behavior, i.e., exactly or statistically self-similar datasets [4]. It has been shown that, given a set of N objects in a dataset with a distance function d(x, y), the average number of neighbors within a given distance r is proportional to r raised to \(\mathfrak {D}\). Thus, the pair-count PC(r) of pairs of elements within distance r follows the power law:

$$\begin{aligned} PC(r) = K_p \times r^{\mathfrak {D}} \end{aligned}$$
(1)

where \(K_p\) is a proportionality constant and \(\mathfrak {D}\) is the correlation fractal dimension of the dataset. This power law is a direct consequence of the self-similarity property, i.e., the exact or statistical similarity between the parts and the whole fractal.
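As a concrete illustration, the correlation fractal dimension can be estimated from Eq. 1 by counting pairs at several radii and fitting the slope of log PC(r) versus log r. The sketch below is a plain O(N²) illustration of this idea; the linear-time box-counting estimator referenced later in the paper [15] would be used in practice, and the function names and example data are placeholders.

```python
import numpy as np
from scipy.spatial.distance import pdist

def correlation_fractal_dimension(X, radii):
    """Estimate D from the power law PC(r) ~ K_p * r^D (Eq. 1).

    Counts the pairs of points within each radius and fits the slope of
    log PC(r) against log r; the slope is the estimate of D.
    """
    dists = pdist(X)                                   # all pairwise distances, O(N^2)
    pc = np.array([(dists <= r).sum() for r in radii], dtype=float)
    mask = pc > 0                                      # keep radii that contain pairs
    slope, _ = np.polyfit(np.log(np.asarray(radii)[mask]), np.log(pc[mask]), 1)
    return slope

# Points on a one-dimensional curve embedded in 3-D should give D close to 1.
t = np.random.rand(2000, 1)
X = np.hstack([t, np.sin(4 * t), np.cos(4 * t)])
print(correlation_fractal_dimension(X, radii=np.logspace(-2, -0.5, 10)))
```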

3 Deep Fractal Based Hashing - DAsH

In this section we propose Deep Fractal based Hashing (DAsH), designed to perform scalable approximate search via a supervised hashing scheme. As introduced in Sect. 1, our strategy is to use fractal theory to find both the optimal sub-space for the output of the last convolutional layer of the CNN and the optimal number of hash functions for the LSH index.

Figure 1 illustrates the training process. The network consists of three types of layers: (1) the convolutional layers, whose weights are usually pre-trained on ImageNet and then fine-tuned on the target dataset via transfer learning [16]; (2) the fully connected layers, with a final softmax layer returning the categorical probability distribution; and (3) the autoencoder layers, which are used for dimensionality reduction. The Convolutional Neural Network (CNN) is trained end-to-end with the ground-truth labels. We use the output of the last convolutional layer because it is the most general-purpose representation for learning; however, it is high-dimensional. To overcome this high dimensionality, we reduce it to the optimal sub-space using an autoencoder. We then index the representation produced by the autoencoder with an LSH scheme which, as mentioned above, is also tuned using fractal theory. At the same time, we train another n autoencoders to learn the representation of each class; these autoencoders are later used to improve the retrieval process. A schematic code sketch of this training pipeline follows the figure.

Fig. 1. DAsH: training and indexing process.
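The following Keras sketch illustrates the three kinds of layers just described, assuming a VGG-style backbone pre-trained on ImageNet; the number of classes, the layer sizes, and the bottleneck dimension m are placeholders rather than the exact architecture used in the paper.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

# (1) Convolutional layers: backbone pre-trained on ImageNet (placeholder choice).
backbone = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                       input_shape=(224, 224, 3))
features = layers.Flatten()(backbone.output)       # large last-conv-layer activation map
feat_dim = int(features.shape[-1])

# (2) Fully connected layers ending in a softmax, fine-tuned with the labels.
fc = layers.Dense(256, activation="relu")(features)
probs = layers.Dense(10, activation="softmax")(fc)  # 10 classes is a placeholder
classifier = Model(backbone.input, probs)
classifier.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# (3) Autoencoder compressing the conv features into the sub-space of dimension m,
# where m would be chosen with the fractal-dimension heuristic described next.
m = 32                                              # placeholder target dimensionality
inp = layers.Input(shape=(feat_dim,))
code = layers.Dense(m, activation="relu")(inp)      # encoder output, later indexed by LSH
rec = layers.Dense(feat_dim)(code)                  # decoder reconstruction
autoencoder = Model(inp, rec)
autoencoder.compile(optimizer="adam", loss="mse")
```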

As shown in [11], a successful dimensionality reduction algorithm projects the data into a feature space with dimensionality close to the fractal dimensionality (FD) of the data in the original space while preserving topological properties. Thus, to find the target dimensionality m needed by the autoencoder networks we use the following heuristic, sketched in the code below. We start with \(m_1 = 2^2\) and compute the FD of the resulting reduced space; we then move to \(m_2 = 2^3\), recompute the FD, and continue until some \(t\) \((m_t = 2^t)\) at which the fractal dimension flattens, meaning that additional features no longer change the fractal dimensionality of the dataset.
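A minimal sketch of this heuristic follows; train_autoencoder and fractal_dimension are assumed helper functions standing in for the components described above, and the flattening tolerance is an illustrative choice.

```python
def find_target_dimensionality(X, train_autoencoder, fractal_dimension,
                               tol=0.05, max_exp=10):
    """Grow m = 2^t until the fractal dimension of the reduced space flattens.

    train_autoencoder(X, m) is assumed to return the m-dimensional codes and
    fractal_dimension(Z) their correlation fractal dimension.
    """
    prev_fd = None
    for t in range(2, max_exp + 1):
        m = 2 ** t
        Z = train_autoencoder(X, m)        # encode the data into an m-dimensional sub-space
        fd = fractal_dimension(Z)
        if prev_fd is not None and abs(fd - prev_fd) < tol:
            return m                       # FD flattened: more features add nothing
        prev_fd = fd
    return 2 ** max_exp
```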

The second step of our procedure is image retrieval via DAsH. We process the query image by forwarding it through the CNN to obtain the n strongest classes. In contrast to existing similarity algorithms that learn similarity from low-level features, our similarity is a combination of semantic-level and hashing-level similarity. The semantic-level similarity is computed first (the n strongest classes). After this semantic relevance check, we obtain n new queries using the n strongest class autoencoders. The query is transformed into new query objects (\(q_1, q_2, \ldots , q_n\)) which are hashed to locate the appropriate buckets. Once the buckets are located, the candidate set is formed, and its elements are exhaustively analyzed to keep only the objects that satisfy the query condition (e.g. \(d(x, q) \le r\)). This process is performed for each of the L hash tables and is illustrated in Fig. 2; a code sketch follows the figure.

Fig. 2. DAsH: retrieval process.
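The sketch below outlines this retrieval flow under assumed helper interfaces (a cnn callable, per-class autoencoders with an encode method, and an LSH index exposing the union of the buckets of a query over the L tables); none of these names come from the paper.

```python
import numpy as np

def dash_query(image, cnn, class_autoencoders, lsh_index, dataset, radius, top_n=3):
    """Sketch of the retrieval flow of Fig. 2 with assumed helper interfaces."""
    features, probs = cnn(image)                     # conv features + class probabilities
    strongest = np.argsort(probs)[::-1][:top_n]      # semantic level: n strongest classes
    results = set()
    for c in strongest:
        q = class_autoencoders[c].encode(features)   # new query object q_c
        for idx in lsh_index.candidates(q):          # union of buckets over the L tables
            if np.linalg.norm(dataset[idx] - q) <= radius:
                results.add(idx)                     # keep only objects with d(x, q) <= r
    return results
```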

3.1 Using Fractals to Estimate LSH Parameters

To tune the LSH parameters we use a property of the correlation fractal dimension \(\mathfrak {D}\), which describes a dataset statistically. Moreover, the correlation fractal dimension \(\mathfrak {D}\) can be estimated in linear time, as shown in [15].

We are interested in finding the resolution scale log(r) at which there are approximately k objects. Considering the line with slope \(\mathfrak {D}\) passing through the point \({<}log (r), log (Pairs(k)){>}\), the constant \(K_p\) obtained from Eq. 1 is:

$$\begin{aligned} log(PC(r)) = \mathfrak {D} \times log (r) + K_p \nonumber \\ K_p = log (Pairs (k)) - \mathfrak {D} \times log (r) \end{aligned}$$
(2)

Considering another point \( {<}log (R), log (Pairs(N)){>}\), the constant \(K_p\) is defined as:

$$\begin{aligned} K_p = log (Pairs(N)) - \mathfrak {D} \cdot log (R) \end{aligned}$$
(3)

Now, combining Eqs. 2 and 3, we can define the radius r as:

$$\begin{aligned} r = R \cdot exp (\frac{log (Pairs (k)) - log (Pairs(N))}{ \mathfrak {D}}) \end{aligned}$$
(4)

Using Eq. 4 we find that the optimal number of hash functions m for a Locality Sensitive Hashing (LSH) based index configured to retrieve the k nearest neighbors is proportional to the number of pairs at distance r. This makes sense, because on average k neighbors lie within distance r. We therefore define:

$$\begin{aligned} m \approx log (PC(r)) \end{aligned}$$
(5)

Combining Eqs. 5 and 1 we obtain \(m \approx \mathfrak {D} \cdot log (r) \). Experimentally, we confirmed that the optimal m is:

$$\begin{aligned} m = (\left\lceil \mathfrak {D} + 1 \right\rceil ) \cdot log (r) \end{aligned}$$
(6)
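Putting Eqs. 4 and 6 together, the tuning procedure can be sketched as follows, assuming Pairs(n) = n(n-1)/2, R taken as the dataset diameter, natural logarithms, and m rounded to the nearest integer; L = m(m-1)/2 follows the convention used in Sect. 4.1. The example values are illustrative only.

```python
import math

def estimate_lsh_parameters(fractal_dim, n_points, k, diameter):
    """Fractal-based tuning sketch: r from Eq. 4, m from Eq. 6, L = m(m-1)/2."""
    pairs = lambda n: n * (n - 1) / 2.0
    r = diameter * math.exp((math.log(pairs(k)) - math.log(pairs(n_points))) / fractal_dim)
    m = max(1, round(math.ceil(fractal_dim + 1) * math.log(r)))
    L = m * (m - 1) // 2
    return r, m, L

# Example: intrinsic dimension 4.2, one million points, k = 25 neighbors.
print(estimate_lsh_parameters(4.2, 1_000_000, 25, diameter=1000.0))
```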

4 Experiments

In this section we are interested in answering the following questions: (a) How accurate is our model in estimating the LSH parameters using the fractal dimension? (b) How does our DAsH method compare with other LSH implementations in terms of query performance and precision? The performance of the DAsH method was compared with four well-known approximate search methods, namely Multi-probe LSH [10], LSH-Forest [2], ITQ [5], and LOPQ [7]. All experiments were performed on a workstation with an Intel Core i7 3.0 GHz CPU and 64 GB of RAM, equipped with four GeForce GTX 1080 GPUs.

We first conduct experiments on nine widely used datasets with hand-crafted features (audio, cities, eigenfaces, histograms, mgcounty, randomwalk, synth16d, synth6d, video)Footnote 1 to evaluate our proposed method for estimating the LSH parameters. Besides hand-crafted features, we also show the effectiveness of our method when deep features are extracted by a Convolutional Neural Network (CNN); we conduct this experiment on three datasets (MNISTFootnote 2, CIFAR-10Footnote 3, SVHNFootnote 4) to evaluate our approach in terms of query performance, mean average precision (mAP), and precision. The following describes the details of the experiments and results.

4.1 Experiment 1: Tuning LSH Parameters

LSH-based methods report efficient results when adequate values for m (the number of hash functions) are chosen; L (the number of hash tables) is given by \(L = m(m-1)/2\), as in the E2LSH implementationFootnote 5. To evaluate the effectiveness of the presented approach to tuning the LSH parameters with the fractal dimension, we worked on a variety of synthetic and real datasets. Table 1 summarizes the main features and parameters of the datasets, including the number of elements N, the number of attributes d, their intrinsic (fractal) dimension \(\mathfrak {D}\), the LSH parameters computed with two approaches, namely the Andoni algorithm (see footnote 5) and our proposal based on the fractal dimension, and the total computation time needed to tune the LSH index (in seconds). The results for the number of hash functions m show that the estimates given by Eq. 6 are comparable with those obtained with the E2LSH algorithm proposed by Andoni, using up to 10X less time.

Table 1. Optimal LSH parameters obtained with the exhaustive E2LSH algorithm and with the fractal-based method.

4.2 Retrieval Performance

The aim of this experiment is to measure the total time spent retrieving the k nearest neighbors. The data structures being compared were tested with specific query settings: we use \(k=1000\) when computing the mean average precision (mAP) metric and \(k=25\) when computing the precision metric \((P (\%))\).
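For reference, the sketch below shows how these metrics are typically computed in supervised-hashing evaluations, under the common assumption that a retrieved item is relevant when it shares the query's class label; this is our reading of the protocol, not code from the paper.

```python
import numpy as np

def average_precision(retrieved_labels, query_label):
    """AP of one ranked list: mean of the precision values at each relevant hit."""
    relevant = np.asarray(retrieved_labels) == query_label
    if not relevant.any():
        return 0.0
    precisions = np.cumsum(relevant) / (np.arange(len(relevant)) + 1)
    return float(precisions[relevant].mean())

def precision_at_k(retrieved_labels, query_label, k=25):
    """Fraction of the first k retrieved items sharing the query's label."""
    top = np.asarray(retrieved_labels)[:k]
    return float((top == query_label).mean())

# mAP is the mean of average_precision over all queries with k = 1000 retrieved
# items; P(%) is the mean of precision_at_k with k = 25, expressed as a percentage.
```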

Table 2. Mean average precision (mAP), precision, and cumulative time spent to compute mAP for different methods on the MNIST, SVHN and CIFAR-10 datasets.

Table 2 shows the comparison in terms of mean average precision (mAP), precision, and total time (in seconds).

5 Conclusions

In this paper we presented a new scheme to solve approximate similarity search by supervised hashing, called Deep frActal based Hashing (DAsH). Our approach shows the potential of boosting query operations when a specialized index structure is designed end-to-end. Thanks to the ability of fractal theory to find the optimal sub-space of the dataset and the optimal number of hash functions for the LSH index, we are able to find an optimal configuration for the learning and indexing processes. Moreover, we defined a novel method, based on fractal theory, which allows us to find the optimal number of hash functions for LSH indexes. These parameters can be estimated in linear time, since the procedure depends only on computing the fractal dimension.

We conducted performance studies on several real and synthetic datasets. The empirical results for the LSH parameters show that our fractal-based method is comparable with the brute-force algorithm while using up to 10X less time. Moreover, in retrieval performance the DAsH method was significantly better than the other approximate methods, providing up to 8% better precision while maintaining excellent retrieval times.