Abstract
Historical maps are important sources of information for scholars in many disciplines. Many libraries are digitising their map collections as bitmap images, but for these collections to be most useful, they need searchable metadata. Because the images are heterogeneous, metadata are mostly extracted by hand, if at all: many collections are so large that anything beyond the most rudimentary metadata would require an infeasible amount of manual effort. We propose an active-learning approach to one of the practical problems in automatic metadata extraction from historical maps: locating occurrences of image elements such as text or place markers. To this end, we combine template matching (to locate possible occurrences) with active learning (to efficiently determine a classification). Using this approach, we design a human-computer interaction in which large numbers of elements on a map can be located reliably with little user effort. We experimentally demonstrate the effectiveness of this approach on real-world data.
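The abstract does not say which classifier or query strategy is used, so the following is only a minimal sketch of how active learning might classify template-match candidates. It assumes each candidate is described by a single match score, that the classifier is a plain score threshold, and that the next query is the unlabelled candidate closest to the current threshold (uncertainty sampling). The names `fit_threshold`, `active_learn` and `ask_user` are hypothetical and do not come from the paper.

```python
def fit_threshold(labeled):
    """Return the score threshold that misclassifies the fewest labelled candidates.

    `labeled` is a non-empty list of (score, is_match) pairs. A candidate is
    predicted to be a true match when its score is >= the threshold.
    """
    pts = sorted(labeled)
    scores = [s for s, _ in pts]
    # Candidate thresholds: below all scores, between neighbours, above all scores.
    cuts = [scores[0] - 1.0]
    cuts += [(a + b) / 2 for a, b in zip(scores, scores[1:])]
    cuts.append(scores[-1] + 1.0)

    def errors(t):
        return sum((s >= t) != bool(y) for s, y in pts)

    return min(cuts, key=errors)


def active_learn(scores, ask_user, budget):
    """Pool-based uncertainty sampling over template-match candidates.

    scores   -- match score of every candidate (assumed feature, one per candidate)
    ask_user -- callable taking a candidate index and returning True/False;
                stands in for the human in the loop
    budget   -- maximum number of candidates the user is asked to label
    """
    order = sorted(range(len(scores)), key=scores.__getitem__)
    # Seed with the weakest and the strongest candidate.
    labels = {i: ask_user(i) for i in (order[0], order[-1])}
    while len(labels) < min(budget, len(scores)):
        t = fit_threshold([(scores[i], labels[i]) for i in labels])
        unlabeled = [i for i in range(len(scores)) if i not in labels]
        # Query the candidate the current model is least certain about.
        query = min(unlabeled, key=lambda i: abs(scores[i] - t))
        labels[query] = ask_user(query)
    threshold = fit_threshold([(scores[i], labels[i]) for i in labels])
    return threshold, labels
```

When the true matches are exactly the candidates above some score, the queries in this sketch behave like a binary search, so the number of labels needed grows roughly logarithmically with the size of the candidate pool. Real map data is noisier, which is one reason a probabilistic classifier may be preferable to a hard threshold.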
Notes
- 1. Personal communication with Dr. H.-G. Schmidt, head of the Manuscripts and Early Prints department, Würzburg University Library.
- 2. Würzburg University Library, http://www.franconica-online.de/.
- 3. Note that this basic approach is not invariant to scale and rotation. It is naturally robust against small variations, but some historical maps would require a more advanced template matching algorithm.
- 4. See [17] and http://scikit-learn.org/.
- 5.
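To illustrate the limitation described in note 3, here is a minimal sketch of basic template matching on small grey-value grids. It assumes normalised cross-correlation as the similarity score and an arbitrary acceptance threshold of 0.8; neither is stated in the text above, and the function names are hypothetical.

```python
import math


def ncc(patch, template):
    """Normalised cross-correlation of two equally sized grey-value grids."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0:
        return 0.0  # flat patch or template: treat as "no match"
    return sum(x * y for x, y in zip(da, db)) / denom


def match_template(image, template, min_score=0.8):
    """Slide the template over the image.

    Returns (row, col, score) for every window scoring at least min_score.
    The threshold 0.8 is an assumption chosen for this example only.
    """
    th, tw = len(template), len(template[0])
    hits = []
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score >= min_score:
                hits.append((r, c, score))
    return hits


def rotate90(grid):
    """Rotate a grid by 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]
```

Matching an upright template against `rotate90(image)` is a quick way to see the lack of rotation invariance: a symbol that matches perfectly in the original image can fall below the acceptance threshold once the image is rotated, unless the template is rotated as well.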
References
1. Arteaga, M.G.: Historical map polygon and feature extractor. In: Proceedings of the 1st ACM SIGSPATIAL International Workshop on MapInteraction, pp. 66–71 (2013)
2. Brunelli, R.: Template Matching Techniques in Computer Vision: Theory and Practice. Wiley, New York (2009)
3. Bryan, B., Nichol, R.C., Genovese, C.R., Schneider, J., Miller, C.J., Wasserman, L.: Active learning for identifying function threshold boundaries. Adv. Neural Inf. Process. Syst. 18, 163–170 (2006)
4. Chen, Y., Krause, A.: Near-optimal batch mode active learning and adaptive submodular optimization. In: Proceedings of the 30th International Conference on Machine Learning, pp. 160–168 (2013)
5. Deseilligny, M.P., Le Men, H., Stamon, G.: Character string recognition on maps, a rotation-invariant recognition method. Pattern Recogn. Lett. 16(12), 1297–1310 (1995)
6. Donmez, P., Carbonell, J.G.: Proactive learning: cost-sensitive active learning with multiple imperfect oracles. In: Proceedings of the 17th ACM Conference on Information and Knowledge Management, pp. 619–628 (2008)
7. Fleet, C., Kowal, K.C., Pridal, P.: Georeferencer: crowdsourced georeferencing for map library collections. D-Lib Mag. 18(11/12) (2012)
8. Guo, Y., Schuurmans, D.: Discriminative batch mode active learning. In: Advances in Neural Information Processing Systems 20, Proceedings of the 21st Annual Conference on Neural Information Processing Systems, pp. 593–600 (2007)
9. Höhn, W.: Detecting arbitrarily oriented text labels in early maps. In: Sanches, J.M., Micó, L., Cardoso, J.S. (eds.) IbPRIA 2013. LNCS, vol. 7887, pp. 424–432. Springer, Heidelberg (2013)
10. Höhn, W., Schmidt, H.G., Schöneberg, H.: Semiautomatic recognition and georeferencing of places in early maps. In: Proceedings of the 13th ACM/IEEE-CS Joint Conference on Digital Libraries, pp. 335–338 (2013)
11. Hoi, S., Jin, R., Zhu, J., Lyu, M.: Batch mode active learning and its application to medical image classification. In: Proceedings of the 23rd International Conference on Machine Learning, pp. 417–424 (2006)
12. Holzinger, A.: Human-computer interaction and knowledge discovery (HCI-KDD): what is the benefit of bringing those two fields to work together? In: Cuzzocrea, A., Kittl, C., Simos, D.E., Weippl, E., Xu, L. (eds.) CD-ARES 2013. LNCS, vol. 8127, pp. 319–328. Springer, Heidelberg (2013)
13. Jenny, B., Hurni, L.: Cultural heritage: studying cartographic heritage: analysis and visualization of geometric distortions. Comput. Graph. 35(2), 402–411 (2011)
14. Leyk, S., Boesch, R., Weibel, R.: Saliency and semantic processing: extracting forest cover from historical topographic maps. Pattern Recogn. 39(5), 953–968 (2006)
15. Mello, C.A.B., Costa, D.C., dos Santos, T.J.: Automatic image segmentation of old topographic maps and floor plans. In: Proceedings of the 2012 IEEE International Conference on Systems, Man, and Cybernetics, pp. 132–137 (2012)
16. Parker, C.: An analysis of performance measures for binary classifiers. In: Proceedings of the 11th International Conference on Data Mining, pp. 517–526 (2011)
17. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
18. Schein, A.I., Ungar, L.H.: Active learning for logistic regression: an evaluation. Mach. Learn. 68(3), 235–265 (2007)
19. Schöneberg, H., Schmidt, H.G., Höhn, W.: A scalable, distributed and dynamic workflow system for digitization processes. In: Proceedings of the 13th ACM/IEEE-CS Joint Conference on Digital Libraries, pp. 359–362 (2013)
20. Settles, B.: Active learning literature survey. Computer Sciences Technical report 1648, University of Wisconsin-Madison (2010)
21. Settles, B.: Active Learning. Synthesis Lectures on Artificial Intelligence and Machine Learning. Morgan and Claypool Publishers, San Rafael (2012)
22. Shaw, T., Bajcsy, P.: Automation of digital historical map analyses. In: Proceedings of the IS&T/SPIE Electronic Imaging 2011, vol. 7869 (2011)
23. Simon, R., Haslhofer, B., Robitza, W., Momeni, E.: Semantically augmented annotations in digitized map collections. In: Proceedings of the 11th Annual International ACM/IEEE Joint Conference on Digital Libraries, pp. 199–202 (2011)
Acknowledgments
We thank Wouter Duivesteijn for fruitful discussion and helpful comments. We thank Hans-Günter Schmidt of the Würzburg University Library for providing real data and practical use cases.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Budig, B., van Dijk, T.C. (2015). Active Learning for Classifying Template Matches in Historical Maps. In: Japkowicz, N., Matwin, S. (eds) Discovery Science. DS 2015. Lecture Notes in Computer Science, vol 9356. Springer, Cham. https://doi.org/10.1007/978-3-319-24282-8_5
DOI: https://doi.org/10.1007/978-3-319-24282-8_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24281-1
Online ISBN: 978-3-319-24282-8
eBook Packages: Computer Science, Computer Science (R0)