A Research Paper on: Online Summarization and Real-Time Timeline Generation Using Stream of Tweets

  • Geeta G. Dayalani
  • Balkrishna K. Patil
  • Rajesh A. Auti
Conference paper
Part of the Lecture Notes in Networks and Systems book series (LNNS, volume 34)


Twitter is the most popular microblogging service, with millions of tweets posted daily on a wide range of topics. Tweets are short text messages limited to 140 characters, and they are generated at a very high rate. The resulting stream is large, noisy, and redundant. To deal with this issue, this paper introduces Summblr, a framework for summarizing continuous tweet data. Traditional summarization procedures deal with small, static data sets, whereas Summblr is designed for large tweet streams that arrive dynamically at a fast rate. The framework comprises three components. First, an online algorithm clusters the stream of tweets and maintains the clusters in a novel data structure called the tweet cluster vector (TCV). Second, a new technique called TCV Rank summarization is proposed for producing both online and historical summaries over arbitrary time durations. Third, an approach for effectively detecting topic evolution is developed. This method progressively monitors summary-based or volume-based variations to generate timelines automatically from large tweet streams.
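
The first component can be illustrated with a minimal sketch of incremental stream clustering. The abstract does not specify the contents of the TCV or the clustering rule, so everything here is an assumption made for illustration: the `ClusterVector` fields, the whitespace `tokenize` helper, the cosine similarity measure, and the `threshold` value are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real tweet preprocessing)."""
    return text.lower().split()


@dataclass
class ClusterVector:
    """Hypothetical per-cluster statistics; the field choice is an assumption."""
    term_sum: Counter = field(default_factory=Counter)  # summed term counts
    count: int = 0         # number of tweets absorbed so far
    first_ts: float = 0.0  # timestamp of the first tweet in the cluster
    last_ts: float = 0.0   # timestamp of the most recent tweet

    def add(self, terms, ts):
        """Fold one tweet into the cluster without storing the tweet itself."""
        if self.count == 0:
            self.first_ts = ts
        self.term_sum.update(terms)
        self.count += 1
        self.last_ts = ts

    def similarity(self, terms):
        """Cosine similarity between the cluster's summed terms and one tweet."""
        dot = sum(self.term_sum[t] * c for t, c in terms.items())
        norm_c = math.sqrt(sum(v * v for v in self.term_sum.values()))
        norm_t = math.sqrt(sum(v * v for v in terms.values()))
        return dot / (norm_c * norm_t) if norm_c and norm_t else 0.0


def cluster_stream(tweets, threshold=0.3):
    """Assign each (timestamp, text) pair to its closest cluster in one pass.

    A new cluster is opened when no existing cluster reaches `threshold`.
    """
    clusters = []
    for ts, text in tweets:
        terms = Counter(tokenize(text))
        best = max(clusters, key=lambda c: c.similarity(terms), default=None)
        if best is None or best.similarity(terms) < threshold:
            best = ClusterVector()
            clusters.append(best)
        best.add(terms, ts)
    return clusters


if __name__ == "__main__":
    stream = [
        (1, "earthquake hits city"),
        (2, "earthquake hits coast"),
        (3, "new phone launch today"),
        (4, "phone launch event today"),
    ]
    for c in cluster_stream(stream):
        print(c.count, c.first_ts, c.last_ts, c.term_sum.most_common(2))
```

Each tweet is processed once and only aggregate statistics are kept, which is the property that makes this style of clustering suitable for a fast stream. A ranking step over these per-cluster statistics would be the natural place to select summary tweets, but that step is not sketched here.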


Summarization; Timeline; Tweet stream; Cluster; Specification; Extractive summary; Tweet rank summarization; Pyramidal time frame



Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  • Geeta G. Dayalani (1), email author
  • Balkrishna K. Patil (1)
  • Rajesh A. Auti (1)
  1. Computer Science Engineering, Everest College of Engineering, Aurangabad, India
