Extracting Key Entities and Significant Events from Online Daily News

  • Conference paper
Intelligent Data Engineering and Automated Learning – IDEAL 2008 (IDEAL 2008)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5326)

Abstract

To help people obtain the most important daily information in the shortest time, a novel framework is presented that simultaneously extracts key entities and mines significant events from daily web news. The technique models entities and news documents as a weighted undirected bipartite graph and consists of three steps. First, key entities are extracted by scoring all candidate entities on a specific day and tracking their trends within a specific time window. Second, a weighted undirected bipartite graph is built from the entities and their related news documents, and mutual reinforcement is applied to the graph to rank both entities and documents. Third, clustering the news articles generates the daily significant events. An experimental study shows the effectiveness of this approach.
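
The second step can be illustrated with a minimal sketch. The abstract does not give the edge weights or the exact update rule, so the code below assumes a generic HITS-style iteration: an entity's score is the weighted sum of the scores of the documents it is linked to, and a document's score is the weighted sum of the scores of its entities, with both vectors normalised each round. The function name `mutual_reinforcement`, the toy weight matrix, and the use of mention counts as weights are all hypothetical and serve only as illustration.

```python
import math


def mutual_reinforcement(weights, iterations=50, tol=1e-9):
    """Rank both sides of a weighted bipartite graph (illustrative sketch).

    weights[i][j] is the (assumed) weight of the edge between entity i and
    news document j, e.g. how often the entity is mentioned in the document.
    Returns (entity_scores, doc_scores), each normalised to unit length.
    """
    n_entities = len(weights)
    n_docs = len(weights[0])
    entity_scores = [1.0] * n_entities
    doc_scores = [1.0] * n_docs

    def normalize(vec):
        norm = math.sqrt(sum(x * x for x in vec))
        return [x / norm for x in vec] if norm > 0 else vec

    for _ in range(iterations):
        # Entities are reinforced by the documents they appear in.
        new_entity = normalize([
            sum(weights[i][j] * doc_scores[j] for j in range(n_docs))
            for i in range(n_entities)
        ])
        # Documents are reinforced by the entities they contain.
        new_doc = normalize([
            sum(weights[i][j] * new_entity[i] for i in range(n_entities))
            for j in range(n_docs)
        ])
        # Largest change in any score since the previous round.
        delta = max(
            max(abs(a - b) for a, b in zip(new_entity, entity_scores)),
            max(abs(a - b) for a, b in zip(new_doc, doc_scores)),
        )
        entity_scores, doc_scores = new_entity, new_doc
        if delta < tol:
            break
    return entity_scores, doc_scores


# Toy data (made up): rows are entities A, B, C; columns are documents d0, d1, d2.
weights = [
    [3, 1, 0],  # A is mentioned often in d0 and once in d1
    [1, 0, 0],  # B is mentioned once in d0
    [0, 0, 1],  # C appears only in the isolated document d2
]
e_scores, d_scores = mutual_reinforcement(weights)
entity_ranking = sorted(range(len(e_scores)), key=lambda i: -e_scores[i])
doc_ranking = sorted(range(len(d_scores)), key=lambda j: -d_scores[j])
```

In this toy graph the heavily connected entity A and document d0 end up ranked first, while the isolated pair (C, d2) sinks to the bottom. This is the behaviour one would expect from a mutual reinforcement scheme, although the paper's actual weighting may differ.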

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Liu, M., Liu, Y., Xiang, L., Chen, X., Yang, Q. (2008). Extracting Key Entities and Significant Events from Online Daily News. In: Fyfe, C., Kim, D., Lee, SY., Yin, H. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2008. IDEAL 2008. Lecture Notes in Computer Science, vol 5326. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88906-9_26

  • DOI: https://doi.org/10.1007/978-3-540-88906-9_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-88905-2

  • Online ISBN: 978-3-540-88906-9

  • eBook Packages: Computer Science, Computer Science (R0)
