Structural Rectification of Non-planar Document Images: Application to Graphics Recognition

  • Gady Agam
  • Changhua Wu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2390)


Document analysis and graphics recognition algorithms are normally applied to images of 2D documents that were scanned while flattened against a planar surface. With high-resolution digital cameras now widely available, traditional graphics recognition tasks may be extended to document images captured by a camera in an uncontrolled environment. This paper discusses the problem of document image rectification. Rectification corrects the perspective and geometric distortions of document images taken with uncalibrated cameras by synthesizing new views that are better suited to existing graphics recognition and document analysis techniques. The proposed approach handles cases in which the document is not necessarily flat, does not rely on specific modeling assumptions, and uses one or more overlapping views of the document. Document image rectification results are provided for several cases.
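
As a point of reference for the perspective correction mentioned above, the simplest special case is a flat document, where the distortion is a planar homography fixed by four point correspondences. The sketch below illustrates only that flat baseline and is not the method of the paper, which addresses non-planar documents. The function names, the corner coordinates, and the choice to fix the last homography entry to 1 are all assumptions made for illustration.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_points(src, dst):
    """Estimate a 3x3 homography (last entry fixed to 1, an assumption)
    mapping four source points (x, y) to four target points (u, v)."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1), same for v.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, point):
    """Map one image point through the homography h."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Hypothetical page corners as seen by the camera, and the upright
# rectangle they should map to in the rectified view.
observed_corners = [(120, 80), (520, 60), (560, 400), (90, 430)]
rectified_corners = [(0, 0), (400, 0), (400, 300), (0, 300)]
H = homography_from_points(observed_corners, rectified_corners)
```

With `H` in hand, `apply_homography(H, p)` maps any point `p` of the photographed page into the rectified frame. A single homography of this kind cannot describe a curved or folded page, which is the case the paper targets.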


perspective correction, joint triangulation, view synthesis, document pre-processing, graphics recognition, document analysis, image processing





Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Gady Agam (1)
  • Changhua Wu (1)
  1. Department of Computer Science, Illinois Institute of Technology, Chicago
