Abstract
TV newscasts report the latest event-related facts occurring in the world. Relying exclusively on them is, however, insufficient to fully grasp the context of the story being reported. In this paper, we propose an approach that retrieves and analyzes related documents from the Web to automatically generate semantic annotations that provide viewers and experts with comprehensive information about the news. We detect named entities in the retrieved documents, which disclose further relevant concepts that were not explicitly mentioned in the original newscast. A ranking algorithm based on entity frequency, popularity peak analysis, and domain experts' rules sorts those annotations to generate what we call the Semantic Snapshot of a Newscast (NSS). We benchmark this method against a gold standard generated by domain experts and assessed via a user survey over five BBC newscasts. Results of the experiments show the robustness of our approach, which achieves an Average Normalized Discounted Cumulative Gain of 66.6%.
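To make the reported metric concrete, the following is a minimal sketch of how an average Normalized Discounted Cumulative Gain could be computed over several ranked entity lists. It is not the authors' evaluation code: the abstract does not state the gain function, the cut-off, or how the ideal ranking is built, so the linear gain, the log2 discount, the optional cut-off `k`, and the function names `dcg`, `ndcg`, and `average_ndcg` are all assumptions made for illustration.

```python
import math
from typing import Optional, Sequence


def dcg(relevances: Sequence[float]) -> float:
    """Discounted cumulative gain with linear gain and log2 discount (assumed variant)."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances, start=1)
    )


def ndcg(ranked_relevances: Sequence[float], k: Optional[int] = None) -> float:
    """NDCG of one ranked list of relevance grades.

    Assumption: the ideal ranking is the same set of grades sorted in
    descending order; `k` is an optional cut-off applied to both rankings.
    """
    ideal = sorted(ranked_relevances, reverse=True)
    if k is not None:
        ranked_relevances = ranked_relevances[:k]
        ideal = ideal[:k]
    ideal_dcg = dcg(ideal)
    if ideal_dcg == 0:
        # No relevant item at all: define the score as 0 to avoid dividing by zero.
        return 0.0
    return dcg(ranked_relevances) / ideal_dcg


def average_ndcg(rankings: Sequence[Sequence[float]], k: Optional[int] = None) -> float:
    """Mean NDCG over several rankings, e.g. one ranking per newscast."""
    if not rankings:
        raise ValueError("at least one ranking is required")
    return sum(ndcg(r, k) for r in rankings) / len(rankings)


if __name__ == "__main__":
    # Hypothetical relevance grades for the entities ranked in two newscasts.
    example = [[3, 2, 3, 0, 1], [1, 3, 2]]
    print(f"Average NDCG: {average_ndcg(example):.3f}")
```

Each inner list holds the gold-standard relevance grade of the entity placed at that rank, so a ranking that already lists entities in descending relevance scores 1.0.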
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Redondo García, J.L., Rizzo, G., Romero, L.P., Hildebrand, M., Troncy, R. (2015). Generating Semantic Snapshots of Newscasts Using Entity Expansion. In: Cimiano, P., Frasincar, F., Houben, G.-J., Schwabe, D. (eds) Engineering the Web in the Big Data Era. ICWE 2015. Lecture Notes in Computer Science, vol 9114. Springer, Cham. https://doi.org/10.1007/978-3-319-19890-3_26
DOI: https://doi.org/10.1007/978-3-319-19890-3_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19889-7
Online ISBN: 978-3-319-19890-3
eBook Packages: Computer Science, Computer Science (R0)