Abstract
Information about social emergencies on social media is usually disseminated and driven by a hot topic that is described succinctly with a hashtag. In China, hashtag prediction for social emergencies is becoming increasingly practical for e-governance, and predicting hashtag popularity for social emergencies has become an important task. However, previous research has focused mainly on commercial hashtag prediction, for example in marketing and promotion. For hashtag popularity prediction, the core issue is identifying the key features that improve prediction accuracy. To the best of our knowledge, little research has focused on hashtag feature extraction for social emergencies. We extract features for hashtag popularity prediction from “seed information”, which avoids excessive crawling. The “seed information” consists of the microblogs posted under a hashtag during the 24-h period after the hashtag was published. From the “seed information”, we derive user-based and content-based features that facilitate the spread of social emergency information. Furthermore, recursive feature elimination (RFE) analysis is integrated with nine machine learning classification models to determine the optimal features among all possible feature combinations. The effectiveness and robustness of the proposed features are verified.
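The definition of “seed information” can be illustrated with a minimal sketch. It assumes a hypothetical `Microblog` record and a handful of illustrative user-based and content-based features; the field names, the feature names, and the sample posts are assumptions made for this example and are not taken from the paper. Only the 24-h window comes from the abstract.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean

SEED_WINDOW = timedelta(hours=24)  # window length stated in the abstract


@dataclass
class Microblog:
    # Hypothetical record layout; all field names are assumptions.
    user_id: str
    followers: int
    text: str
    posted_at: datetime


def seed_information(posts, hashtag_published_at):
    """Keep only posts from the first 24 hours after the hashtag was published."""
    cutoff = hashtag_published_at + SEED_WINDOW
    return [p for p in posts if hashtag_published_at <= p.posted_at < cutoff]


def seed_features(seed):
    """Derive a few illustrative features from the seed posts (assumed, not the paper's list)."""
    if not seed:
        return {"n_posts": 0, "n_users": 0, "mean_followers": 0.0, "mean_text_len": 0.0}
    return {
        "n_posts": len(seed),
        "n_users": len({p.user_id for p in seed}),            # user-based
        "mean_followers": mean(p.followers for p in seed),    # user-based
        "mean_text_len": mean(len(p.text) for p in seed),     # content-based
    }


# Made-up sample data: three posts inside the window, one outside it.
published = datetime(2017, 1, 1, 8, 0)
posts = [
    Microblog("u1", 120, "#flood road closed", published + timedelta(hours=1)),
    Microblog("u2", 4000, "#flood shelter open", published + timedelta(hours=5)),
    Microblog("u1", 120, "#flood update", published + timedelta(hours=23)),
    Microblog("u3", 50, "#flood aftermath", published + timedelta(hours=30)),
]
features = seed_features(seed_information(posts, published))
print(features)
```

Because only posts from the first 24 hours are read, the feature vector can be computed from a single bounded crawl per hashtag. A feature-selection step such as RFE would then operate on vectors of this kind.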
Acknowledgements
This work is supported by the National Natural Science Foundation of China (Grant Nos. 71403262, 71774154, 71573247, 71503246).
Copyright information
© 2017 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Li, Q., Li, Y. (2017). Predicting Hashtag Popularity of Social Emergency by a Robust Feature Extraction Method. In: Chen, J., Theeramunkong, T., Supnithi, T., Tang, X. (eds) Knowledge and Systems Sciences. KSS 2017. Communications in Computer and Information Science, vol 780. Springer, Singapore. https://doi.org/10.1007/978-981-10-6989-5_12
DOI: https://doi.org/10.1007/978-981-10-6989-5_12
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6988-8
Online ISBN: 978-981-10-6989-5
eBook Packages: Computer Science, Computer Science (R0)