Abstract
Geoparsing assigns geographic identifiers to words and phrases in text documents. A specific problem is how to apply geoparsing to inflected languages, where word endings change. An appropriate method requires a flexible solution that reflects different strategies and priorities. Sixteen Czech RSS news channels were evaluated according to ten criteria, and three selected channels were monitored for more than two years. The applied geoparsing consisted of successive filtering steps and generated the different grammatical cases of recognized entities. Various problems with geographical names are classified and documented. The quality assessment shows satisfactory results, notably for the identification of names in domiciles (94%). The pessimistic strategy is applied to analyze the geographical balance of news distribution. The results show significant differences in the distribution of news among the monitored channels and document a high concentration of cultural and national news in several locations.
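As an illustration of why generating grammatical cases matters, the following minimal Python sketch matches inflected Czech place names against a toy gazetteer. It is not the chapter's implementation: the gazetteer contents, the single capitalisation filter, and the names `GAZETTEER`, `geoparse`, and `place_counts` are assumptions made for this example, and the inflected forms are listed by hand rather than generated by rules.

```python
import re
from collections import Counter

# Hypothetical toy gazetteer: canonical name -> hand-listed case forms.
# A real system would generate these forms from declension rules or a
# morphological lexicon; they are written out here only for illustration.
GAZETTEER = {
    "Praha": ["Praha", "Prahy", "Praze", "Prahu", "Prahou"],
    "Ostrava": ["Ostrava", "Ostravy", "Ostravě", "Ostravu", "Ostravou"],
    "Brno": ["Brno", "Brna", "Brnu", "Brně", "Brnem"],
}

# Reverse index: inflected surface form -> canonical name.
FORM_TO_PLACE = {
    form: place for place, forms in GAZETTEER.items() for form in forms
}

# \w matches Unicode letters in Python 3, so "Ostravě" stays one token.
TOKEN_RE = re.compile(r"\w+")


def geoparse(text):
    """Return the canonical place name for each matching token, in order."""
    hits = []
    for token in TOKEN_RE.findall(text):
        # Assumed filter step: only capitalised tokens are name candidates.
        if not token[0].isupper():
            continue
        place = FORM_TO_PLACE.get(token)
        if place is not None:
            hits.append(place)
    return hits


def place_counts(items):
    """Count place mentions over an iterable of news item texts."""
    counts = Counter()
    for item in items:
        counts.update(geoparse(item))
    return counts
```

For example, `geoparse("V Praze a v Ostravě se konal festival.")` maps the locative forms "Praze" and "Ostravě" back to "Praha" and "Ostrava", which a lookup on nominative forms alone would miss. Counting the results per channel with `place_counts` gives the kind of per-location tally that a spatial distribution analysis could start from.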
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Horák, J., Belaj, P., Ivan, I., Nemec, P., Ardielli, J., Růžička, J. (2011). Geoparsing of Czech RSS News and Evaluation of Its Spatial Distribution. In: Katarzyniak, R., Chiu, TF., Hong, CF., Nguyen, N.T. (eds) Semantic Methods for Knowledge Management and Communication. Studies in Computational Intelligence, vol 381. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23418-7_31
DOI: https://doi.org/10.1007/978-3-642-23418-7_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23417-0
Online ISBN: 978-3-642-23418-7
eBook Packages: Engineering, Engineering (R0)