Abstract
Web documents are a promising source of spatial information. Once recognized and extracted, this information can be used in applications such as building semantic maps and indoor robotic navigation. In this paper, we present a novel methodology for identifying spatial information in web documents using machine learning classifiers trained in a semi-supervised manner. Semi-supervised models trained with half of the available data achieve F-scores only 4% and 9% lower than those of supervised models trained with the complete data when classifying spatial entities and relationships, respectively.
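The abstract does not say which semi-supervised scheme, classifier, or features are used, so the following is only a minimal sketch of one common scheme, self-training, and not the authors' method. The count-based classifier, the two features (the word and the word before it), the `SPATIAL`/`O` labels, the confidence threshold, and the toy data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

def features(prev_word, word):
    # Hypothetical feature set: the token itself and the token before it.
    return [f"w={word.lower()}", f"prev={prev_word.lower()}"]

def train(examples):
    # Count how often each feature co-occurs with each label.
    counts = defaultdict(Counter)
    for (prev_word, word), label in examples:
        for feat in features(prev_word, word):
            counts[feat][label] += 1
    return counts

def predict(counts, prev_word, word):
    # Sum label votes over the features; confidence is the winning vote share.
    votes = Counter()
    for feat in features(prev_word, word):
        votes.update(counts.get(feat, {}))
    total = sum(votes.values())
    if total == 0:
        return None, 0.0  # no known feature, so no prediction
    label, n = votes.most_common(1)[0]
    return label, n / total

def self_train(labeled, unlabeled, threshold=0.9, max_rounds=5):
    # Repeatedly retrain, then move confidently predicted unlabeled
    # examples into the training set with their predicted labels.
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        counts = train(labeled)
        confident, remaining = [], []
        for prev_word, word in pool:
            label, conf = predict(counts, prev_word, word)
            if label is not None and conf >= threshold:
                confident.append(((prev_word, word), label))
            else:
                remaining.append((prev_word, word))
        if not confident:
            break  # nothing new was learned in this round
        labeled.extend(confident)
        pool = remaining
    return train(labeled)

# Toy data (assumed): a few hand-labeled tokens plus unlabeled ones.
labeled = [
    (("in", "library"), "SPATIAL"),
    (("near", "cafeteria"), "SPATIAL"),
    (("the", "report"), "O"),
    (("a", "meeting"), "O"),
]
unlabeled = [("in", "auditorium"), ("at", "auditorium"), ("a", "report")]

baseline = train(labeled)
model = self_train(labeled, unlabeled)
print(predict(baseline, "at", "auditorium"))
print(predict(model, "at", "auditorium"))
```

In this toy run the labeled-only model has no evidence for "at auditorium", while the self-trained model first pseudo-labels "in auditorium" from the preceding word and then reuses that label for the token itself. A real system would replace the count-based classifier with a proper sequence model and evaluate with the F-score on held-out annotated data.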
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Lie, H., Nayak, R., Wyeth, G. (2017). Spatial Information Recognition in Web Documents Using a Semi-supervised Machine Learning Method. In: Bouguettaya, A., et al. Web Information Systems Engineering – WISE 2017. WISE 2017. Lecture Notes in Computer Science, vol 10569. Springer, Cham. https://doi.org/10.1007/978-3-319-68783-4_11
DOI: https://doi.org/10.1007/978-3-319-68783-4_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-68782-7
Online ISBN: 978-3-319-68783-4
eBook Packages: Computer Science (R0)