Form item extraction based on line searching
This paper presents an item-searching method that has been applied to various kinds of forms. The approach is based on line detection with the Hough transform. Once the straight lines are obtained, the Hough directions are used to detect the actual segments in the image. A segment can correspond either to a continuous line or to a black part of a dashed or dotted line. The segments lying between two adjacent line-crossing points are therefore grouped together and classified. Items are located by searching for the minimum cycles of the graph constructed from the line intersection points. The last step verifies the line classes under the hypothesis that the sides of an item are homogeneous.
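The line-detection step can be illustrated with a minimal sketch of Hough voting. This is a generic textbook formulation, not the authors' implementation: the function name `hough_lines`, the parameters `n_theta` and `min_votes`, and the one-pixel rounding of rho are all assumptions made for the example. Each black pixel votes for every (theta, rho) pair satisfying rho = x cos(theta) + y sin(theta), and heavily voted cells indicate candidate straight lines.

```python
import math
from collections import Counter


def hough_lines(points, n_theta=180, min_votes=5):
    """Return {(theta_index, rho): votes} for accumulator cells with enough votes.

    points    -- iterable of (x, y) coordinates of black pixels
    n_theta   -- number of angle steps over [0, pi) (hypothetical default)
    min_votes -- minimum number of pixels a cell needs to be kept
    """
    votes = Counter()
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            # Normal form of a line: rho = x*cos(theta) + y*sin(theta).
            # rho is rounded to the nearest integer (one-pixel bins).
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(t, rho)] += 1
    return {cell: n for cell, n in votes.items() if n >= min_votes}


if __name__ == "__main__":
    # A small cross: one horizontal line (y = 3) and one vertical line (x = 5).
    pixels = {(x, 3) for x in range(10)} | {(5, y) for y in range(10)}
    peaks = hough_lines(pixels, min_votes=10)
    print(peaks[(90, 3)], peaks[(0, 5)])
```

With `n_theta=180`, index 90 corresponds to a horizontal line and index 0 to a vertical one, so the two lines of the cross appear as the cells `(90, 3)` and `(0, 5)`. Neighbouring angle cells may collect the same votes because of the coarse rho rounding, so a practical detector would add peak clustering. Walking along each detected direction to recover the actual segments, including the black parts of dashed or dotted lines, would be a separate step.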
The method was applied to French tax forms and to tables taken from scientific publications. The experimental results demonstrate the robustness and reliability of the approach on various forms with different types of item delimiters.
Keywords: Form Analysis, Item Extraction, Line Searching, Hough Transform