Construction of Model of Structured Documents Based on Machine Learning

  • Sergey Golubev
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6744)


In this paper we consider the problem of structured document recognition and propose a document recognition system. The system incorporates a recognition module based on methods of structured image recognition, a graph document model, and a method of document model generalization. The machine learning component makes document model construction easier and less time-consuming.


document recognition, machine learning, graph document model



Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Sergey Golubev (1, 2)
  1. Moscow Institute of Physics and Technology, Dolgoprudny, Russia
  2. ABBYY Software, Moscow, Russia
