Sentiment-Based Ranking of Blog Posts Using Rhetorical Structure Theory

  • Jose M. Chenlo
  • Alexander Hogenboom
  • David E. Losada
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7934)


Polarity estimation in large-scale and multi-topic domains is a difficult problem. Most state-of-the-art solutions rely essentially on the frequencies of sentiment-carrying words (e.g., taken from a lexicon) when analyzing the sentiment conveyed by natural language text. These approaches ignore the structural aspects of a document, which contain valuable information. Rhetorical Structure Theory (RST) provides important information about the relative importance of the different text spans in a document. This knowledge could be useful for sentiment analysis and polarity classification. However, RST has only been studied for polarity classification problems in constrained, small-scale scenarios. The main objective of this paper is to explore the usefulness of RST in large-scale polarity ranking of blog posts. We apply sentence-level methods to select the key sentences that convey the overall on-topic sentiment of a blog post. Then, we apply RST analysis to these core sentences in order to guide the classification of their polarity and thus to generate an overall estimation of the document’s polarity with respect to a specific topic. Our results show that RST provides valuable information about the discourse structure of the texts, which can be used to rank documents more accurately in terms of their estimated sentiment in multi-topic blogs.
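As a rough illustration of the idea described in the abstract, the sketch below scores the key sentences of a post with a lexicon and scales each text span by its rhetorical role before ranking posts. It is not the authors' implementation: the toy lexicon, the `Span` type, the role weights in `ROLE_WEIGHTS`, and all function names are hypothetical, and the spans are assumed to have already been produced by some sentence selection step and RST parser.

```python
from dataclasses import dataclass

# Toy sentiment lexicon (assumption); a real system would load a full lexicon.
LEXICON = {"great": 1.0, "love": 1.0, "good": 0.5,
           "bad": -0.5, "awful": -1.0, "hate": -1.0}

# Hypothetical weights: nuclei count fully, satellites count half.
ROLE_WEIGHTS = {"nucleus": 1.0, "satellite": 0.5}


@dataclass
class Span:
    text: str
    role: str  # "nucleus" or "satellite", as assigned by an RST parser


def span_score(span):
    """Sum lexicon scores of the span's words, scaled by its role weight."""
    tokens = (tok.strip(".,!?;:") for tok in span.text.lower().split())
    raw = sum(LEXICON.get(tok, 0.0) for tok in tokens)
    return ROLE_WEIGHTS[span.role] * raw


def document_score(key_sentences):
    """Aggregate span scores over the key sentences of one post.

    key_sentences is a list of sentences, each a list of Span objects.
    """
    return sum(span_score(s) for sentence in key_sentences for s in sentence)


def rank_by_polarity(docs):
    """Return document ids ordered from most positive to most negative."""
    return sorted(docs, key=lambda d: document_score(docs[d]), reverse=True)
```

In this sketch, a sentence whose nucleus is negative and whose satellite is mildly positive ends up with a negative score, which is the kind of effect a purely frequency-based count would miss. The weights here are placeholders and would have to be tuned on training data.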


Keywords: Blog, Opinion Mining, Sentiment Analysis, Polarity Estimation, Discourse Structure, Rhetorical Structure Theory





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Jose M. Chenlo (1)
  • Alexander Hogenboom (2)
  • David E. Losada (1)
  1. Centro de Investigación en Tecnoloxías da Información (CITIUS), Universidad de Santiago de Compostela, Spain
  2. Econometric Institute, Erasmus University Rotterdam, The Netherlands
