Abstract
Information is one of the most important resources in modern life and society. Users of social network platforms such as Twitter produce thousands of tweets every second in a continuous stream. However, not everything that is written is relevant to a follower. Trawling through countless tweets is therefore a time-consuming and depressing task, all the more so because most messages are useless and contain no news. This paper describes an approach for aggregating and summarizing short messages such as tweets. Useless messages are filtered out, whereas the most important information is aggregated into a summarized output. Our experiments show the advantages of this approach, which can also be applied to similar problems.
Notes
1. Twitter: https://twitter.com.
2. Tweets per second, last visited 2021/03/01: www.internetlivestats.com/twitter-statistics/.
3. HTML entities: https://en.wikipedia.org/?title=HTML_entity.
4. Unicode normalization: https://en.wikipedia.org/wiki/Unicode_equivalence.
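The notes on HTML entities and Unicode normalization suggest that tweet text is cleaned before it is compared or aggregated. The following is a minimal sketch of such a cleanup step using only the Python standard library. The function name `normalize_tweet`, the choice of the NFKC form, and the whitespace collapsing are assumptions made for illustration, not the authors' documented procedure.

```python
import html
import unicodedata


def normalize_tweet(text: str, form: str = "NFKC") -> str:
    """Hypothetical cleanup step: decode entities, normalize Unicode, tidy spaces."""
    # Decode HTML entities such as &amp; or &lt; into the characters they stand for.
    decoded = html.unescape(text)
    # Map equivalent Unicode sequences to one form (NFKC is an assumed choice).
    normalized = unicodedata.normalize(form, decoded)
    # Collapse runs of whitespace, including newlines, into single spaces.
    return " ".join(normalized.split())


if __name__ == "__main__":
    print(normalize_tweet("Cafe\u0301 &amp; tea"))
```

After this step, two tweets that differ only in entity encoding or in composed versus decomposed characters yield the same string, which makes later similarity comparisons more reliable.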
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Endres, M., Rudenko, L., Gröninger, D. (2021). Aggregation and Summarization of Thematically Similar Twitter Microblog Messages. In: Bellatreche, L., Dumas, M., Karras, P., Matulevičius, R. (eds) Advances in Databases and Information Systems. ADBIS 2021. Lecture Notes in Computer Science, vol 12843. Springer, Cham. https://doi.org/10.1007/978-3-030-82472-3_9
DOI: https://doi.org/10.1007/978-3-030-82472-3_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-82471-6
Online ISBN: 978-3-030-82472-3
eBook Packages: Computer Science, Computer Science (R0)