Extraction of Doodles and Drawings from Manuscripts

  • Chandranath Adak
  • Bidyut B. Chaudhuri
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8251)

Abstract

In this paper we propose an approach to separate non-text elements from the text of a manuscript. The non-text elements are mainly doodles and drawings made by exceptional thinkers and writers. They have enormous historical value because they support the study of those writers’ subconscious as well as productive minds. We also propose a computational approach to recover struck-out text, reducing human effort. The proposed technique has a preprocessing stage that removes noise with a median filter and segments the object region using fuzzy c-means clustering. Connected component analysis then finds the major portions of the non-text, and window examination eliminates partially attached text. The struck-out text is extracted by eliminating straight lines, measuring the degree of continuity, and applying morphological operations.
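
The preprocessing and connected component stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the 3x3 window, the two-cluster setup with fuzzifier m = 2, the 8-connectivity, and the `min_pixels` size threshold are all assumptions made for illustration. Window examination and struck-out text recovery are not covered.

```python
from collections import deque

def median_filter(img, radius=1):
    """Replace each pixel by the median of its neighbourhood (clipped at borders)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = sorted(
                img[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            )
            out[y][x] = window[len(window) // 2]  # upper median if count is even
    return out

def memberships(v, centres, m=2.0):
    """Fuzzy membership of grey level v in each cluster (standard FCM formula)."""
    d = [abs(v - c) for c in centres]
    if 0.0 in d:  # v sits exactly on a centre: share full membership there
        hits = d.count(0.0)
        return [1.0 / hits if dk == 0.0 else 0.0 for dk in d]
    exp = 2.0 / (m - 1.0)
    return [1.0 / sum((dk / dj) ** exp for dj in d) for dk in d]

def fuzzy_c_means(values, m=2.0, iters=30):
    """Two-cluster fuzzy c-means on grey levels; returns the cluster centres."""
    centres = [float(min(values)), float(max(values))]  # assumed initialisation
    for _ in range(iters):
        u = [memberships(v, centres, m) for v in values]
        centres = [
            sum(ui[k] ** m * v for ui, v in zip(u, values))
            / sum(ui[k] ** m for ui in u)
            for k in range(len(centres))
        ]
    return centres

def connected_components(mask):
    """Label 8-connected foreground regions; each is a list of (row, col) pixels."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    comps = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue, comp = deque([(y, x)]), []
            while queue:
                cy, cx = queue.popleft()
                comp.append((cy, cx))
                for ny in range(max(0, cy - 1), min(h, cy + 2)):
                    for nx in range(max(0, cx - 1), min(w, cx + 2)):
                        if mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            comps.append(comp)
    return comps

def large_components(img, min_pixels):
    """Denoise, segment the darker (ink) cluster, keep components above a size.

    min_pixels is a hypothetical stand-in for whatever criterion separates
    large non-text strokes from small text components.
    """
    smooth = median_filter(img)
    centres = fuzzy_c_means([p for row in smooth for p in row])
    dark = centres.index(min(centres))
    mask = [[memberships(p, centres)[dark] > 0.5 for p in row] for row in smooth]
    return [c for c in connected_components(mask) if len(c) >= min_pixels]
```

Calling `large_components(img, min_pixels)` on a greyscale image given as a list of rows returns the pixel lists of the large dark regions, which serve here as candidate non-text portions. A plain size threshold is the simplest possible criterion and would misclassify touching or very large text, which is presumably why the paper adds a window examination step.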

Keywords

Connected Component; Document Image Analysis; Doodle Separation; Fuzzy C-Means Clustering; Manuscript Processing

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Chandranath Adak (1)
  • Bidyut B. Chaudhuri (2)
  1. Dept. of CSE, University of Kalyani, India
  2. CVPR Unit, Indian Statistical Institute, Kolkata, India
