Abstract
To date, there has been no comparative evaluation of the different approaches for text extraction from scholarly figures. To fill this gap, we have defined a generic pipeline for text extraction that abstracts from the existing approaches documented in the literature. In this paper, we use this generic pipeline to systematically evaluate and compare 32 configurations for text extraction over four datasets of scholarly figures of different origin and characteristics. In total, our experiments were run over more than 400 manually labeled figures. The experimental results show that, using the Ocropy OCR engine, the approach BS-4OS achieves the best F-measure of 0.67 for Text Location Detection and the best average Levenshtein distance of 4.71 between the recognized text and the gold standard on all four datasets.
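For readers unfamiliar with the two reported measures, the following is a minimal sketch of how an F-measure and an average Levenshtein distance can be computed. It is not the authors' evaluation code. The function names are hypothetical, and the sketch assumes that the counts of matched, spurious, and missed text regions are already known, since the abstract does not state the matching criterion used for Text Location Detection.

```python
def f_measure(true_pos, false_pos, false_neg):
    """Harmonic mean of precision and recall from region counts.

    Assumption: the caller has already decided which detected text
    regions match gold-standard regions.
    """
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def average_levenshtein(pairs):
    """Mean edit distance over (recognized, gold) string pairs."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one (recognized, gold) pair is required")
    return sum(levenshtein(rec, gold) for rec, gold in pairs) / len(pairs)
```

A higher F-measure is better, while a lower average Levenshtein distance is better, since it means fewer character edits separate the recognized text from the gold standard.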
Acknowledgement
This research was co-financed by the EU H2020 project MOVING (http://www.moving-project.eu/) under contract no. 693092.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Böschen, F., Scherp, A. (2017). A Comparison of Approaches for Automated Text Extraction from Scholarly Figures. In: Amsaleg, L., Guðmundsson, G., Gurrin, C., Jónsson, B., Satoh, S. (eds) MultiMedia Modeling. MMM 2017. Lecture Notes in Computer Science, vol 10132. Springer, Cham. https://doi.org/10.1007/978-3-319-51811-4_2
DOI: https://doi.org/10.1007/978-3-319-51811-4_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-51810-7
Online ISBN: 978-3-319-51811-4
eBook Packages: Computer Science (R0)