Abstract
When a hot social issue or event occurs, the number of comments and retweets on Twitter rises significantly on that day. An event can generally be extracted by its term frequency, but an event with a low term frequency is hard to find, so important information may be missed. However, some reliable users are directly related to the event, regardless of how few tweets mention it. In this paper, we propose a user reliability based event extraction method. The latent Dirichlet allocation (LDA) model is adapted with timeline analysis to extract high-frequency events. User behaviors are analyzed to identify reliable users who are directly related to the issue. Reliable low-frequency events can then be detected based on these reliable users. To verify the effectiveness of the proposed method, experiments were conducted on four selected social issues using a Korean Twitter test set. The experimental results showed a precision of 97.2% for the top 10 extracted events (P@10) on each day. This result shows that the proposed method is effective for extracting events from a Twitter corpus.
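The P@10 figure reported above is a precision-at-k measure. As a minimal sketch of how such a score could be computed, the Python below takes a ranked list of extracted events for each day and a set of events judged relevant for that day. The function names, the data layout, and the choice to average the per-day scores are assumptions made for illustration; the abstract does not specify how the per-day values are aggregated.

```python
from typing import Dict, Sequence, Set


def precision_at_k(ranked_events: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Return the fraction of the top-k ranked events that are judged relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_events[:k]
    hits = sum(1 for event in top_k if event in relevant)
    # Standard P@k divides by k, even when fewer than k events were returned.
    return hits / k


def mean_precision_at_k(
    daily_rankings: Dict[str, Sequence[str]],
    daily_relevant: Dict[str, Set[str]],
    k: int = 10,
) -> float:
    """Average P@k over all days (assumed aggregation, not taken from the paper)."""
    if not daily_rankings:
        raise ValueError("no days to evaluate")
    scores = [
        precision_at_k(ranking, daily_relevant.get(day, set()), k)
        for day, ranking in daily_rankings.items()
    ]
    return sum(scores) / len(scores)
```

With hypothetical data, calling `mean_precision_at_k(rankings, judgments, k=10)` on a dictionary of per-day rankings and a matching dictionary of relevance judgments yields a single score between 0 and 1.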
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tsolmon, B., Lee, KS. (2014). Extracting Social Events Based on Timeline and User Reliability Analysis on Twitter. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2014. Lecture Notes in Computer Science, vol 8404. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-54903-8_18
DOI: https://doi.org/10.1007/978-3-642-54903-8_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-54902-1
Online ISBN: 978-3-642-54903-8
eBook Packages: Computer Science, Computer Science (R0)