The Evolution of Document Image Analysis

Baird, Henry S.; Tombre, Karl

doi:10.1007/978-0-85729-859-1_43

Abstract

Optical Character Recognition (OCR) was one of the first application domains for computer science. At the time, it was expected that a machine would quickly be able to read any document. History has proven the task more difficult than expected. This chapter explores the history of the document analysis and recognition domain, from OCR to page analysis and on to the open problems that remain unresolved.

Notes

1. Many thanks to the authors of several chapters of this handbook for their contribution to the list of stubborn obstacles.

Author information

Authors and Affiliations

Computer Science and Engineering Department, Lehigh University, 19 Memorial Drive West, 18015, Bethlehem, PA, USA
Henry S. Baird
Université de Lorraine, 34 cours Léopold, CS 25233, 54052, Nancy, France
Karl Tombre

Corresponding authors

Correspondence to Henry S. Baird or Karl Tombre.

Editor information

Editors and Affiliations

University of Maryland, College Park, MD, USA
David Doermann
Université de Lorraine, Nancy, France
Karl Tombre

Cite this entry

Baird, H.S., Tombre, K. (2014). The Evolution of Document Image Analysis. In: Doermann, D., Tombre, K. (eds) Handbook of Document Image Processing and Recognition. Springer, London. https://doi.org/10.1007/978-0-85729-859-1_43

DOI: https://doi.org/10.1007/978-0-85729-859-1_43
Published: 24 July 2019
Publisher Name: Springer, London
Print ISBN: 978-0-85729-858-4
Online ISBN: 978-0-85729-859-1
eBook Packages: Computer Science; Reference Module Computer Science and Engineering
