Large-Scale Graph Indexing Using Binary Embeddings of Node Contexts
Graph-based representations are experiencing a growing usage in visual recognition and retrieval due to their representational power in front of classical appearance-based representations in terms of feature vectors. Retrieving a query graph from a large dataset of graphs has the drawback of the high computational complexity required to compare the query and the target graphs. The most important property for a large-scale retrieval is the search time complexity to be sub-linear in the number of database examples. In this paper we propose a fast indexation formalism for graph retrieval. A binary embedding is defined as hashing keys for graph nodes. Given a database of labeled graphs, graph nodes are complemented with vectors of attributes representing their local context. Hence, each attribute counts the length of a walk of order k originated in a vertex with label l. Each attribute vector is converted to a binary code applying a binary-valued hash function. Therefore, graph retrieval is formulated in terms of finding target graphs in the database whose nodes have a small Hamming distance from the query nodes, easily computed with bitwise logical operators. As an application example, we validate the performance of the proposed methods in a handwritten word spotting scenario in images of historical documents.
KeywordsGraph matching Graph indexing Application in document analysis Word spotting Binary embedding
Unable to display preview. Download preview PDF.
- 5.Dahm, N., Bunke, H., Caelli, T., Gao, Y.: A unified framework for strengthening topological node features and its application to subgraph isomorphism detection. In: Kropatsch, W.G., Artner, N.M., Haxhimusa, Y., Jiang, X. (eds.) GbRPR 2013. LNCS, vol. 7877, pp. 11–20. Springer, Heidelberg (2013)CrossRefGoogle Scholar
- 8.Fernández-Mota, D., Almazán, J., Cirera, N., Fornés, A., Lladós, J.: Bh2m: The barcelona historical, handwritten marriages database. In: 22nd International Conference on Pattern Recognition, pp. 256–261. IEEE (2014)Google Scholar
- 11.Indyk, P., Motwani, R.: Approximate nearest neighbors: Towards removing the curse of dimensionality. In: Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing, STOC 1998, pp. 604–613. ACM (1998)Google Scholar
- 12.Llados, J., Rusinol, M., Fornes, A., Fernandez, D., Dutta, A.: On the influence of word representations for handwritten word spotting in historical documents. International Journal of Pattern Recognition and Artificial Intelligence 26(05) (2012)Google Scholar
- 16.Riesen, K., Bunke, H.: Approximate graph edit distance computation by means of bipartite graph matching. Image and Vision Computing (2009)Google Scholar
- 21.Wang, P., Eglin, V., Garcia, C., Largeron, C., Lladós, J., Fornés, A.: A coarse-to-fine word spotting approach for historical handwritten documents based on graph embedding and graph edit distance. In: 22nd International Conference on Pattern Recognition, pp. 3074–3079. IEEE (2014)Google Scholar
- 22.Yan, X., Yu, P.S., Han, J.: Graph indexing: a frequent structure-based approach. In: Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data, pp. 335–346. ACM (2004)Google Scholar
- 23.Zhou, F., De la Torre, F.: Factorized graph matching. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 127–134 (June 2012)Google Scholar