Layout-Based Document-Retrieval System by Radon Transform Using Dynamic Time Warping

  • Giuseppe Pirlo
  • Michela Chimienti
  • Michele Dassisti
  • Donato Impedovo
  • Angelo Galiano
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8156)

Abstract

In the context of sustainability of document management technologies, this paper presents a new system for layout-based document retrieval, specifically designed for commercial form retrieval. The system first uses a technique based on mathematical morphology to extract grid-based structural components from the document image. Next, the Radon transform is used to describe the document layout. Finally, documents are matched using a technique based on dynamic time warping. Experimental results on real and simulated data sets demonstrate the effectiveness of the approach across different classes of commercial forms.
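The three stages named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not give the structuring elements, the projection angles, or the warping cost, so everything below is an assumption made for illustration. The function names (`open_horizontal`, `grid_lines`, `projections`, `dtw`, `layout_distance`) and the parameter `k` are hypothetical. Grid extraction is approximated by a binary opening with horizontal and vertical line segments of length `k`, the Radon transform is sampled only at the two axis-aligned angles (where it reduces to row and column sums), and matching uses the textbook dynamic time warping recurrence with an absolute-difference cost.

```python
def open_horizontal(img, k):
    """Binary opening with a 1 x k horizontal line: keep runs of 1s of length >= k."""
    out = []
    for row in img:
        kept = [0] * len(row)
        start = None
        for x, v in enumerate(list(row) + [0]):  # trailing 0 closes a final run
            if v and start is None:
                start = x
            elif not v and start is not None:
                if x - start >= k:
                    kept[start:x] = [1] * (x - start)
                start = None
        out.append(kept)
    return out


def transpose(img):
    return [list(col) for col in zip(*img)]


def grid_lines(img, k):
    """Assumed grid extractor: union of long horizontal and long vertical strokes."""
    horiz = open_horizontal(img, k)
    vert = transpose(open_horizontal(transpose(img), k))
    return [[h | v for h, v in zip(hr, vr)] for hr, vr in zip(horiz, vert)]


def projections(img):
    """Axis-aligned Radon projections of a binary image: row sums and column sums."""
    rows = [sum(r) for r in img]
    cols = [sum(c) for c in zip(*img)]
    return rows, cols


def dtw(a, b):
    """Textbook O(len(a) * len(b)) dynamic time warping, absolute-difference cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def layout_distance(img_a, img_b, k=3):
    """Sum of DTW distances between the row and column profiles of two forms."""
    rows_a, cols_a = projections(grid_lines(img_a, k))
    rows_b, cols_b = projections(grid_lines(img_b, k))
    return dtw(rows_a, rows_b) + dtw(cols_a, cols_b)
```

Under these assumptions, two scans of the same form that differ only in small isolated marks (such as filled-in text) have a layout distance of zero, because the opening removes short strokes before the profiles are compared, while a form with an additional ruling line yields a positive distance. A fuller treatment would sample more projection angles and normalize the profiles for scale.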

Keywords

Document management; Document Image Retrieval; Sustainability; Mathematic Morphology; Radon Transform; Dynamic Time Warping

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Giuseppe Pirlo (1)
  • Michela Chimienti (2)
  • Michele Dassisti (3)
  • Donato Impedovo (4)
  • Angelo Galiano (4)

  1. Dipartimento di Informatica, Università degli Studi di Bari "A. Moro", Bari, Italy
  2. Laboratorio Kad3, Monopoli, Italy
  3. Dip. Meccanica, Management e Matematica, Politecnico di Bari, Bari, Italy
  4. Dyrecta Lab, Conversano, Italy
