A survey of non-thinning based vectorization methods

  • Liu Wenyin
  • Dov Dori
Document Image Analysis and Recognition
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1451)


We survey the methods developed to date for crude vectorization of document images. We classify them into six categories: thinning-based, Hough transform-based, contour-based, run-graph-based, mesh-pattern-based, and sparse-pixel-based. Crude vectorization is a relatively mature subject in the Document Analysis and Recognition field, though there is still room for improvement. The purpose of the survey is to give researchers a comprehensive overview of these techniques so that they can choose a suitable method when developing their vectorization algorithms and systems.


Vectorization; Document Analysis and Recognition; Polygonalization



Copyright information

© Springer-Verlag Berlin Heidelberg 1998

Authors and Affiliations

  • Liu Wenyin (1)
  • Dov Dori (1)
  1. Faculty of Industrial Engineering and Management, Technion-Israel Institute of Technology, Haifa, Israel
