Abstract
This chapter presents a framework for detecting bursty topics in correlated news and Twitter streams, and discusses how to integrate the framework into government services. As a specific application, the chapter collects news articles and tweets related to “the 2012 London Olympic Games” and applies the proposed framework to them.
Notes
- 1.
- 2.
- 3.
- 4.
In Kleinberg [1], \(\tau (i,j)\) is defined not as \((j - i)\gamma\), but as \((j - i)\gamma \ln m\), where \(m\) is the number of batches in the sequence \({\mathbf{B}} = (B_{1} , \ldots ,B_{m} )\). In this chapter, we omit the term \(\ln m\) in this definition for simplicity.
- 5.
- 6.
- 7.
These evaluation results are still based on an inside (closed) evaluation: the two parameters \(s\) and \(\gamma\) are optimized on the same news articles and tweets that are used for the evaluation shown in this chapter. However, we tune the two parameters jointly across the 34 evaluation topics, and we observed that their optimal values are mostly consistent across those 34 topics. Parameter optimization with held-out training data is left for future work.
- 8.
Although Table 2 shows the evaluation results only for the 34 topics that are relevant to “the London Olympic games”, even over all 50 topics the precision of the detected bursty topics is about 90 % per day/topic for both news articles and tweet texts.
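The transition cost in note 4 and the two parameters \(s\) and \(\gamma\) in note 7 can be illustrated with a minimal sketch of a two-state burst automaton. This is not the chapter's implementation. It assumes the standard batched formulation from Kleinberg [1]: a base rate \(p_0\) estimated from the whole sequence, a burst rate \(p_1 = s\,p_0\), a binomial emission cost, and a Viterbi search for the cheapest state sequence. The function names (`transition_cost`, `emission_cost`, `detect_bursts`) and the `use_ln_m` switch are hypothetical and introduced only for illustration.

```python
import math

def transition_cost(i, j, gamma, m=None):
    """tau(i, j): only moves to a higher state cost anything.

    With m=None this is the simplified (j - i) * gamma from note 4;
    passing m multiplies by ln m, as in Kleinberg's definition.
    """
    if j <= i:
        return 0.0
    cost = (j - i) * gamma
    if m is not None:
        cost *= math.log(m)
    return cost

def emission_cost(r, d, p):
    """Negative log binomial likelihood of r relevant items out of d (assumed cost)."""
    log_binom = math.lgamma(d + 1) - math.lgamma(r + 1) - math.lgamma(d - r + 1)
    return -(log_binom + r * math.log(p) + (d - r) * math.log(1.0 - p))

def detect_bursts(relevant, total, s=2.0, gamma=1.0, use_ln_m=False):
    """Return one state per batch: 0 = non-burst, 1 = burst."""
    m = len(relevant)
    p0 = sum(relevant) / sum(total)
    if not 0.0 < p0 < 1.0:
        raise ValueError("base rate must lie strictly between 0 and 1")
    p1 = min(s * p0, 0.9999)            # keep the burst rate a valid probability
    probs = (p0, p1)
    m_arg = m if use_ln_m else None

    cost = [0.0, math.inf]              # the automaton starts in state 0
    back = []                           # back[t][j] = best predecessor of state j
    for r, d in zip(relevant, total):
        new_cost, pointers = [], []
        for j in (0, 1):
            candidates = [cost[i] + transition_cost(i, j, gamma, m_arg)
                          for i in (0, 1)]
            best = 0 if candidates[0] <= candidates[1] else 1
            new_cost.append(candidates[best] + emission_cost(r, d, probs[j]))
            pointers.append(best)
        cost = new_cost
        back.append(pointers)

    # Trace the cheapest path backwards from the cheaper final state.
    state = 0 if cost[0] <= cost[1] else 1
    states = []
    for pointers in reversed(back):
        states.append(state)
        state = pointers[state]
    states.reverse()
    return states

if __name__ == "__main__":
    daily_relevant = [1, 1, 1, 10, 12, 11, 1, 1]   # made-up counts for one topic
    daily_total = [100] * 8
    print(detect_bursts(daily_relevant, daily_total, s=2.0, gamma=1.0))
```

In this sketch, a larger \(\gamma\) makes entering the burst state more expensive, so fewer batches are labeled as bursts, while a larger \(s\) requires a stronger rise above the base rate before the burst state becomes the cheaper explanation. The counts in the usage example are invented for illustration and are not taken from the chapter's data.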
References
Kleinberg, J. (2002). Bursty and hierarchical structure in streams. In Proceedings of 8th SIGKDD (pp. 91–101).
Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993–1022.
Blei, D. M., & Lafferty, J. D. (2006). Dynamic topic models. In Proceedings of 23rd ICML (pp. 113–120).
Takahashi, Y., Utsuro, T., Yoshioka, M., Kando, N., Fukuhara, T., Nakagawa, H., & Kiyota, Y. (2012). Applying a burst model to detect bursty topics in a topic model. In JapTAL 2012 (Vol. 7614 of LNCS, pp. 239–249). Berlin: Springer.
Mane, K., & Börner, K. (2004). Mapping topics and topic bursts in PNAS. Proceedings of the National Academy of Sciences, 101(Suppl. 1), 5287–5290.
AlSumait, L., Barbará, D., Gentle, J., & Domeniconi, C. (2009). Topic significance ranking of LDA generative models. In Proceedings of ECML/PKDD (pp. 67–82).
Wang, X., Zhai, C. X., & Hu, R. S. (2007). Mining correlated bursty topic patterns from coordinated text streams. In Proceedings of 13th SIGKDD (pp. 784–793).
Zhang, J., Song, Y., Zhang, C., & Liu, S. (2010). Evolutionary hierarchical Dirichlet processes for multiple correlated time-varying corpora. In Proceedings of 16th SIGKDD (pp. 1079–1088).
Petrović, S., Osborne, M., & Lavrenko, V. (2010). Streaming first story detection with application to twitter. In HLT-NAACL (pp. 181–189).
Weng, J., & Lee, B. S. (2011). Event detection in Twitter. In Proceedings of Fifth ICWSM (pp. 401–408).
Li, C., Sun, A., & Datta, A. (2012). Twevent: Segment-based event detection from tweets. In Proceedings of 21st CIKM (pp. 155–164).
Diao, Q., Jiang, J., Zhu, F., & Lim, E. P. (2012). Finding bursty topics from microblogs. In Proceedings of 50th ACL (pp. 536–544).
AlSumait, L., Barbará, D., & Domeniconi, C. (2008). On-Line LDA: Adaptive topic models for mining text streams with applications to topic detection and tracking. In Proceedings of 8th ICDM (pp. 3–12).
© 2015 Springer International Publishing Switzerland
Cite this chapter
Utsuro, T., Inoue, Y., Imada, T., Yoshioka, M., Kando, N. (2015). Detecting Bursty Topics of Correlated News and Twitter for Government Services. In: Nepal, S., Paris, C., Georgakopoulos, D. (eds) Social Media for Government Services. Springer, Cham. https://doi.org/10.1007/978-3-319-27237-5_7
DOI: https://doi.org/10.1007/978-3-319-27237-5_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27235-1
Online ISBN: 978-3-319-27237-5
eBook Packages: Computer Science, Computer Science (R0)