Geoparsing of Czech RSS News and Evaluation of Its Spatial Distribution

Semantic Methods for Knowledge Management and Communication

Part of the book series: Studies in Computational Intelligence (SCI, volume 381)

Abstract

Geoparsing assigns geographic identifiers to words and phrases in documents. A specific problem is how to apply geoparsing to languages in which word endings change. An appropriate method requires a flexible solution that reflects different strategies and priorities. Sixteen Czech RSS news channels were evaluated against ten criteria, and three selected channels were monitored for more than two years. The applied geoparsing ran a sequence of filters in successive steps and generated the different grammatical cases of recognized entities. Various problems with geographical names are classified and documented. The quality assessment shows satisfactory results, particularly for the identification of names in domiciles (94%). A pessimistic strategy is applied to analyze the geographical balance of news distribution. The results show significant differences in the distribution of news among the monitored channels and document a high concentration of cultural and national news in several locations.
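
The idea of generating grammatical cases for gazetteer entries, mentioned in the abstract, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy gazetteer, the approximate coordinates, and the single declension pattern (feminine names ending in "-a") are all assumptions made for illustration, and stem alternations such as Praha/Praze are deliberately ignored.

```python
import re

# Assumed endings for Czech feminine names ending in "-a" (pattern "žena").
# Real Czech declension has many more patterns and stem alternations
# (e.g. Praha -> Praze), which this sketch does not cover.
FEMININE_A_ENDINGS = ["a", "y", "ě", "u", "o", "ou"]


def generate_case_forms(name):
    """Return candidate inflected forms of a place name (simplified)."""
    if name.endswith("a"):
        stem = name[:-1]
        return {stem + ending for ending in FEMININE_A_ENDINGS}
    return {name}  # unknown pattern: fall back to the base form only


def build_lookup(gazetteer):
    """Map every generated surface form to its canonical gazetteer name."""
    lookup = {}
    for name in gazetteer:
        for form in generate_case_forms(name):
            lookup[form] = name
    return lookup


def geoparse(text, gazetteer):
    """Return (surface form, canonical name, coordinates) for each match."""
    lookup = build_lookup(gazetteer)
    matches = []
    for token in re.findall(r"\w+", text):
        canonical = lookup.get(token)
        if canonical is not None:
            matches.append((token, canonical, gazetteer[canonical]))
    return matches


# Hypothetical toy gazetteer with approximate (latitude, longitude) values.
GAZETTEER = {"Ostrava": (49.82, 18.26), "Opava": (49.94, 17.90)}

for match in geoparse("Koncert v Ostravě přilákal návštěvníky z Opavy.", GAZETTEER):
    print(match)
```

Expanding the gazetteer into inflected forms up front keeps the text scan to a simple dictionary lookup per token; a fuller system would also need multi-word names, disambiguation of names shared by several places, and the filtering steps the abstract refers to.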

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Horák, J., Belaj, P., Ivan, I., Nemec, P., Ardielli, J., Růžička, J. (2011). Geoparsing of Czech RSS News and Evaluation of Its Spatial Distribution. In: Katarzyniak, R., Chiu, TF., Hong, CF., Nguyen, N.T. (eds) Semantic Methods for Knowledge Management and Communication. Studies in Computational Intelligence, vol 381. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23418-7_31

  • DOI: https://doi.org/10.1007/978-3-642-23418-7_31

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23417-0

  • Online ISBN: 978-3-642-23418-7

  • eBook Packages: Engineering, Engineering (R0)
