The Evolution of Document Image Analysis

Handbook of Document Image Processing and Recognition

Abstract

One of the first application domains for computer science was Optical Character Recognition (OCR). At that time, it was expected that a machine would quickly be able to read any document. History has shown that the task was more difficult than expected. This chapter explores the history of the document analysis and recognition domain, from OCR to page analysis and on to the open problems that remain unsolved.

Notes

  1. Many thanks to the authors of several chapters of this handbook for their contribution to the list of stubborn obstacles.

Author information

Correspondence to Henry S. Baird or Karl Tombre.

Copyright information

© 2014 Springer-Verlag London

About this entry

Cite this entry

Baird, H.S., Tombre, K. (2014). The Evolution of Document Image Analysis. In: Doermann, D., Tombre, K. (eds) Handbook of Document Image Processing and Recognition. Springer, London. https://doi.org/10.1007/978-0-85729-859-1_43
