Machine Learning in Document Analysis and Recognition

  • Simone Marinai
  • Hiromichi Fujisawa

Part of the Studies in Computational Intelligence book series (SCI, volume 90)

Table of contents

  1. Front Matter
    Pages I-XI
  2. Donato Malerba, Michelangelo Ceci, Margherita Berardi
    Pages 45-69
  3. Richard Zanibbi, Dorothea Blostein, James R. Cordy
    Pages 71-103
  4. Floriana Esposito, Stefano Ferilli, Teresa M. A. Basile, Nicola Di Mauro
    Pages 105-138
  5. Stefan Jaeger, Huanfeng Ma, David Doermann
    Pages 163-191
  6. Simone Marinai, Emanuele Marino, Giovanni Soda
    Pages 193-219
  7. George Nagy, Sriharsha Veeramachaneni
    Pages 221-257
  8. Tatsuhiko Kagehiro, Hiromichi Fujisawa
    Pages 277-303
  9. Sergey Tulyakov, Venu Govindaraju
    Pages 305-332
  10. Sergey Tulyakov, Stefan Jaeger, Venu Govindaraju, David Doermann
    Pages 361-386
  11. Sargur N. Srihari, Harish Srinivasan, Siyuan Chen, Matthew J. Beal
    Pages 387-408
  12. Back Matter
    Pages 429-433

About this book


The objective of Document Analysis and Recognition (DAR) is to recognize the text and graphical components of a document and to extract information. With first papers dating back to the 1960’s, DAR is a mature but still growing research field with consolidated and known techniques. Optical Character Recognition (OCR) engines are some of the most widely recognized products of the research in this field, while broader DAR techniques are nowadays studied and applied to other industrial and office automation systems. In the machine learning community, one of the most widely known research problems addressed in DAR is recognition of unconstrained handwritten characters which has been frequently used in the past as a benchmark for evaluating machine learning algorithms, especially supervised classifiers. However, developing a DAR system is a complex engineering task that involves the integration of multiple techniques into an organic framework. A reader may feel that the use of machine learning algorithms is not appropriate for other DAR tasks than character recognition. On the contrary, such algorithms have been massively used for nearly all the tasks in DAR. With large emphasis being devoted to character recognition and word recognition, other tasks such as pre-processing, layout analysis, character segmentation, and signature verification have also benefited much from machine learning algorithms.


Document Image Analysis and Recognition (DIAR), Learning Strategies, algorithm, algorithms, calculus, classification, cognition, handwriting recognition, image analysis, layout, learning, machine learning, neural networks, self-organizing map, verification

Editors and affiliations

  • Simone Marinai
    • 1
  • Hiromichi Fujisawa
    • 2
  1. Dipartimento di Sistemi e Informatica, University of Florence, Firenze, Italy
  2. Hitachi Central Research Laboratory, Kokubunji-shi, Tokyo, Japan

Bibliographic information

  • DOI
  • Copyright Information Springer-Verlag Berlin Heidelberg 2008
  • Publisher Name Springer, Berlin, Heidelberg
  • eBook Packages Engineering Engineering (R0)
  • Print ISBN 978-3-540-76279-9
  • Online ISBN 978-3-540-76280-5
  • Series Print ISSN 1860-949X
Industry Sectors
IT & Software
Oil, Gas & Geosciences