Large-Scale Graph Indexing Using Binary Embeddings of Node Contexts

  • Pau RibaEmail author
  • Josep Lladós
  • Alicia Fornés
  • Anjan Dutta
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9069)


Graph-based representations are experiencing a growing usage in visual recognition and retrieval due to their representational power in front of classical appearance-based representations in terms of feature vectors. Retrieving a query graph from a large dataset of graphs has the drawback of the high computational complexity required to compare the query and the target graphs. The most important property for a large-scale retrieval is the search time complexity to be sub-linear in the number of database examples. In this paper we propose a fast indexation formalism for graph retrieval. A binary embedding is defined as hashing keys for graph nodes. Given a database of labeled graphs, graph nodes are complemented with vectors of attributes representing their local context. Hence, each attribute counts the length of a walk of order k originated in a vertex with label l. Each attribute vector is converted to a binary code applying a binary-valued hash function. Therefore, graph retrieval is formulated in terms of finding target graphs in the database whose nodes have a small Hamming distance from the query nodes, easily computed with bitwise logical operators. As an application example, we validate the performance of the proposed methods in a handwritten word spotting scenario in images of historical documents.


Graph matching Graph indexing Application in document analysis Word spotting Binary embedding 


Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.


  1. 1.
    Almazán, J., Gordo, A., Fornés, A., Valveny, E.: Word spotting and recognition with embedded attributes. IEEE Transactions on Pattern Analysis and Machine Intelligence 36(12), 2552–2566 (2014)CrossRefGoogle Scholar
  2. 2.
    Calonder, M., Lepetit, V., Strecha, C., Fua, P.: BRIEF: Binary robust independent elementary features. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part IV. LNCS, vol. 6314, pp. 778–792. Springer, Heidelberg (2010)CrossRefGoogle Scholar
  3. 3.
    Cheng, J., Ke, Y., Ng, W., Lu, A.: Fg-index: Towards verification-free query processing on graph databases. In: International Conference on Management of Data, SIGMOD 2007, pp. 857–872. ACM, New York (2007)CrossRefGoogle Scholar
  4. 4.
    Conte, D., Foggia, P., Sansone, C., Vento, M.: Thirty years of graph matching in pattern recognition. International Journal of Pattern Recognition and Artificial Intelligence 18(03), 265–298 (2004)CrossRefGoogle Scholar
  5. 5.
    Dahm, N., Bunke, H., Caelli, T., Gao, Y.: A unified framework for strengthening topological node features and its application to subgraph isomorphism detection. In: Kropatsch, W.G., Artner, N.M., Haxhimusa, Y., Jiang, X. (eds.) GbRPR 2013. LNCS, vol. 7877, pp. 11–20. Springer, Heidelberg (2013)CrossRefGoogle Scholar
  6. 6.
    Dupé, F.X., Brun, L.: Edition within a graph kernel framework for shape recognition. In: Torsello, A., Escolano, F., Brun, L. (eds.) GbRPR 2009. LNCS, vol. 5534, pp. 11–20. Springer, Heidelberg (2009)CrossRefGoogle Scholar
  7. 7.
    Dutta, A., Llados, J., Pal, U.: A symbol spotting approach in graphical documents by hashing serialized graphs. Pattern Recognition 46(3), 752–768 (2013)CrossRefGoogle Scholar
  8. 8.
    Fernández-Mota, D., Almazán, J., Cirera, N., Fornés, A., Lladós, J.: Bh2m: The barcelona historical, handwritten marriages database. In: 22nd International Conference on Pattern Recognition, pp. 256–261. IEEE (2014)Google Scholar
  9. 9.
    Fischer, A., Suen, C.Y., Frinken, V., Riesen, K., Bunke, H.: A fast matching algorithm for graph-based handwriting recognition. In: Kropatsch, W.G., Artner, N.M., Haxhimusa, Y., Jiang, X. (eds.) GbRPR 2013. LNCS, vol. 7877, pp. 194–203. Springer, Heidelberg (2013)CrossRefGoogle Scholar
  10. 10.
    Fischler, M.A., Elschlager, R.A.: The representation and matching of pictorial structures. IEEE Transactions on Computers C-22(1), 67–92 (1973)CrossRefGoogle Scholar
  11. 11.
    Indyk, P., Motwani, R.: Approximate nearest neighbors: Towards removing the curse of dimensionality. In: Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing, STOC 1998, pp. 604–613. ACM (1998)Google Scholar
  12. 12.
    Llados, J., Rusinol, M., Fornes, A., Fernandez, D., Dutta, A.: On the influence of word representations for handwritten word spotting in historical documents. International Journal of Pattern Recognition and Artificial Intelligence 26(05) (2012)Google Scholar
  13. 13.
    Messmer, B.T., Bunke, H.: A decision tree approach to graph and subgraph isomorphism detection. Pattern Recognition 32(12), 1979–1998 (1999)CrossRefGoogle Scholar
  14. 14.
    Morgan, H.L.: The generation of a unique machine description for chemical structures-a technique developed at chemical abstracts service. Journal of Chemical Documentation 5(2), 107–113 (1965)CrossRefGoogle Scholar
  15. 15.
    Neuhaus, M., Bunke, H.: Bridging the Gap Between Graph Edit Distance and Kernel Machines. World Scientific Publishing Co., Inc., River Edge (2007)zbMATHGoogle Scholar
  16. 16.
    Riesen, K., Bunke, H.: Approximate graph edit distance computation by means of bipartite graph matching. Image and Vision Computing (2009)Google Scholar
  17. 17.
    Robles-Kelly, A., Hancock, E.R.: Graph edit distance from spectral seriation. IEEE Transactions on Pattern Analysis and Machine Intelligence 27(3), 365–378 (2005)CrossRefGoogle Scholar
  18. 18.
    Rusiñol, M., Aldavert, D., Toledo, R., Lladós, J.: Efficient segmentation-free keyword spotting in historical document collections. Pattern Recognition 48(2), 545–555 (2015)CrossRefGoogle Scholar
  19. 19.
    Sanfeliu, A., Fu, K.S.: A distance measure between attributed relational graphs for pattern recognition. IEEE Transactions on Systems, Man and Cybernetics SMC-13(3), 353–362 (1983)CrossRefGoogle Scholar
  20. 20.
    Saund, E.: A graph lattice approach to maintaining and learning dense collections of subgraphs as image features. IEEE Transactions on Pattern Analysis and Machine Intelligence 35(10), 2323–2339 (2013)CrossRefGoogle Scholar
  21. 21.
    Wang, P., Eglin, V., Garcia, C., Largeron, C., Lladós, J., Fornés, A.: A coarse-to-fine word spotting approach for historical handwritten documents based on graph embedding and graph edit distance. In: 22nd International Conference on Pattern Recognition, pp. 3074–3079. IEEE (2014)Google Scholar
  22. 22.
    Yan, X., Yu, P.S., Han, J.: Graph indexing: a frequent structure-based approach. In: Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data, pp. 335–346. ACM (2004)Google Scholar
  23. 23.
    Zhou, F., De la Torre, F.: Factorized graph matching. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 127–134 (June 2012)Google Scholar

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Pau Riba
    • 1
    Email author
  • Josep Lladós
    • 1
  • Alicia Fornés
    • 1
  • Anjan Dutta
    • 1
  1. 1.Computer Vision Center - Computer Science DepartmentUniversitat Autònoma de BarcelonaBarcelonaSpain

Personalised recommendations