Layout Analysis and Content Classification in Digitized Books

  • Andrea Corbelli
  • Lorenzo Baraldi
  • Fabrizio Balducci
  • Costantino Grana
  • Rita Cucchiara
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 701)

Abstract

Automatic layout analysis has proven to be extremely important in the digitization of large document collections. In this paper we present a mixed approach to layout analysis, introducing an SVM-aided layout segmentation process and a classification process based on local and geometrical features. The final output of the automatic analysis algorithm is a complete, structured annotation in JSON format that contains the digitized text together with references to all the illustrations on the input page, and that can be used by both visualization and annotation interfaces. We evaluate our algorithm on a large dataset built upon the first volume of the “Enciclopedia Treccani”.
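To make the idea of a structured JSON annotation more concrete, the following is a minimal sketch of how one page might be serialized. The abstract does not specify the schema, so every name here (`Region`, `PageAnnotation`, `kind`, `bbox`, `image_ref`) and every value is a hypothetical placeholder, not the format used by the authors.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import List, Optional


@dataclass
class Region:
    """One segmented region of a page (hypothetical schema)."""
    kind: str                        # assumed labels, e.g. "text" or "illustration"
    bbox: List[int]                  # assumed layout: [x, y, width, height] in pixels
    text: Optional[str] = None       # OCR text, only for text regions
    image_ref: Optional[str] = None  # file reference, only for illustrations


@dataclass
class PageAnnotation:
    """All regions found on a single input page (hypothetical schema)."""
    page: int
    regions: List[Region] = field(default_factory=list)


def to_json(annotation: PageAnnotation) -> str:
    # asdict() recursively converts nested dataclasses to plain dicts.
    return json.dumps(asdict(annotation), ensure_ascii=False, indent=2)


# Placeholder content for illustration only.
page = PageAnnotation(
    page=1,
    regions=[
        Region(kind="text", bbox=[40, 60, 500, 300], text="placeholder text"),
        Region(kind="illustration", bbox=[40, 380, 500, 250],
               image_ref="page1_fig1.png"),
    ],
)

print(to_json(page))
```

A flat list of typed regions, each with a bounding box, is one simple layout that both a viewer and an annotation tool could consume; the actual structure may differ.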

Keywords

  • Layout analysis
  • Content classification
  • SVM
  • Annotation interfaces

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Andrea Corbelli (1)
  • Lorenzo Baraldi (1)
  • Fabrizio Balducci (1)
  • Costantino Grana (1)
  • Rita Cucchiara (1)

  1. Dipartimento di Ingegneria “Enzo Ferrari”, Università degli Studi di Modena e Reggio Emilia, Modena, Italy
