Abstract
The internet is now a widespread platform for information interchange, and the semantic web is gradually becoming a reality. However, day-to-day work in companies still requires the laborious manual processing of huge amounts of printed documents. This article presents smartFIX, a document analysis and understanding system developed by the DFKI spin-off insiders. During the research project “adaptive Read”, funded by the German Federal Ministry of Education and Research (BMBF), smartFIX was substantially matured, with a focus on adaptivity. The system extracts information from documents ranging from fixed-format forms to unstructured letters of many formats. In addition to the architecture, the main components, and the system characteristics, we show some results from applying smartFIX to representative samples of medical bills and prescriptions.
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Klein, B., Dengel, A.R., Fordan, A. (2004). smartFIX: An Adaptive System for Document Analysis and Understanding. In: Dengel, A., Junker, M., Weisbecker, A. (eds) Reading and Learning. Lecture Notes in Computer Science, vol 2956. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24642-8_11
DOI: https://doi.org/10.1007/978-3-540-24642-8_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21904-0
Online ISBN: 978-3-540-24642-8
eBook Packages: Springer Book Archive