Straight Line Reconstruction for Fully Materialized Table Extraction in Degraded Document Images

  • Héloïse Alhéritière
  • Walid Amaïeur
  • Florence Cloppet
  • Camille Kurtz
  • Jean-Marc Ogier
  • Nicole Vincent
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11414)

Abstract

Tables are one of the best ways to synthesize information in documents, such as statistical results or key figures. In this article we focus on the extraction of materialized tables from document images, in the particular case where acquisition noise can disrupt the recovery of the table structures. Successive printing and scanning of a document, together with its deterioration, can lead to “broken” lines among the materialized segments of the tables. We propose a method based on the search for straight line segments in documents. It relies on a new image transform that locally defines primitives well suited for pattern recognition, and on a theoretical model of lines that is used to confirm their presence among a set of confident potential line parts. The extracted straight line segments are then used to reconstruct the table structures. Our approach has been evaluated in terms of both quality and stability.

Keywords

Table extraction, Straight line features, Local diameter transforms, Degraded lines, Document images

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. LIPADE, Université Paris Descartes, Paris, France
  2. Université de La Rochelle, L3i, La Rochelle, France
