Abstract
In this paper, we present DocMining, a general framework for constructing scenarios dedicated to document image processing. The framework is the result of a collaboration between four academic partners and one industrial partner. DocMining is mainly concerned with the description and execution of document analysis scenarios. The explicit declaration of scenarios and the plug-in-oriented approach of the framework make it easy to integrate new Document Processing Units and to create new application prototypes. Moreover, this paper highlights the usefulness of the platform for addressing the problem of performance evaluation.
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Adam, S. et al. (2004). DocMining: A Document Analysis System Builder. In: Marinai, S., Dengel, A.R. (eds) Document Analysis Systems VI. DAS 2004. Lecture Notes in Computer Science, vol 3163. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-28640-0_45
DOI: https://doi.org/10.1007/978-3-540-28640-0_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23060-1
Online ISBN: 978-3-540-28640-0
eBook Packages: Springer Book Archive