TKG: A Graph-Based Approach to Extract Keywords from Tweets

  • Willyan Daniel Abilhoa
  • Leandro Nunes de Castro
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 290)


Twitter is a microblog service that generates a huge amount of textual content daily. Exploring this content requires text mining, natural language processing, information retrieval, and other techniques. In this context, automatic keyword extraction is a highly useful task. A fundamental step in text mining is building a model for text representation. This paper proposes a keyword extraction method for tweet collections that represents texts as graphs and applies centrality measures to find the relevant vertices (keywords). The proposal is applied to two tweet collections about Brazilian TV shows, and its results are compared with those of TFIDF and KEA.
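The general idea of representing texts as a graph and ranking vertices by centrality can be illustrated with a minimal sketch. It is not the TKG method itself: the abstract does not specify how edges are built or which centrality measures are used, so the sketch assumes that an edge links tokens adjacent within the same tweet and that vertices are ranked by degree centrality. The stop-word list and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical stop-word list; the abstract does not describe preprocessing.
STOP_WORDS = {"the", "a", "is", "of", "and", "to", "in", "on"}


def tokenize(tweet):
    """Lowercase, split on whitespace, strip punctuation, drop stop words."""
    tokens = (word.strip(".,!?:;\"'()").lower() for word in tweet.split())
    return [t for t in tokens if t and t not in STOP_WORDS]


def build_graph(tweets):
    """Undirected graph: an edge links tokens adjacent in the same tweet.

    Assumption: adjacency-based co-occurrence is one plausible edge rule,
    not necessarily the one used in the paper.
    """
    graph = defaultdict(set)
    for tweet in tweets:
        tokens = tokenize(tweet)
        for token in tokens:
            graph[token]  # register every token as a vertex, even if isolated
        for left, right in zip(tokens, tokens[1:]):
            if left != right:
                graph[left].add(right)
                graph[right].add(left)
    return graph


def degree_centrality(graph):
    """Degree of each vertex divided by (n - 1), where n is the vertex count."""
    n = len(graph)
    if n < 2:
        return {vertex: 0.0 for vertex in graph}
    return {vertex: len(neighbors) / (n - 1) for vertex, neighbors in graph.items()}


def top_keywords(tweets, k=3):
    """Return the k vertices with highest degree centrality (ties: alphabetical)."""
    scores = degree_centrality(build_graph(tweets))
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:k]]


if __name__ == "__main__":
    sample = ["the finale of the show", "show finale tonight", "great show"]
    print(top_keywords(sample, k=2))
```

Degree centrality is used here only because it is the simplest centrality measure to compute; other measures such as closeness or eccentricity could be substituted in `degree_centrality` without changing the rest of the pipeline.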


Keywords: Knowledge Discovery, Text Mining, Keyword Extraction, Graph Theory, Centrality Measures





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Willyan Daniel Abilhoa (1)
  • Leandro Nunes de Castro (1)
  1. Natural Computing Laboratory, Mackenzie Presbyterian University, São Paulo, Brazil
