Certainty Identification in Texts: Categorization Model and Manual Tagging Results

  • Victoria L. Rubin
  • Elizabeth D. Liddy
  • Noriko Kando
Part of The Information Retrieval Series book series (INRE, volume 20)


This chapter presents a theoretical framework and preliminary results for the manual categorization of explicit certainty information in 32 English newspaper articles. Our contribution is a proposed categorization model and analytical framework for certainty identification. Certainty is presented as a type of subjective information available in texts. Statements with explicit certainty markers were identified and categorized according to four hypothesized dimensions: level, perspective, focus, and time of certainty. The preliminary results give a promising overall picture of the presence of certainty information in texts and show that it can be identified manually within the proposed four-dimensional categorization framework. The editorial sample had a significantly higher frequency of markers per sentence than the news story sample. For editorials, high level of certainty, writer’s point of view, and future and present time were the most populated categories. For news stories, the most common categories were high and moderate levels, directly involved third party’s point of view, and past time. These patterns have positive practical implications for automation.
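As an illustration only, the four-dimensional tagging described in the abstract could be represented as a small record per marker, from which the markers-per-sentence frequency and the per-category counts follow directly. This is a minimal sketch, not the authors' annotation tooling: the class name, field names, helper functions, and example values are all assumptions, and the abstract does not list the values of the focus dimension, so that field is left free-form.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CertaintyAnnotation:
    """One explicit certainty marker tagged on the four dimensions.

    All names here are hypothetical; only the four dimension names
    (level, perspective, focus, time) come from the abstract.
    """
    marker: str        # marker text found in the sentence (illustrative)
    level: str         # e.g. "high" or "moderate"
    perspective: str   # e.g. "writer" or "involved third party"
    focus: str         # free-form: focus values are not given in the abstract
    time: str          # e.g. "past", "present", or "future"


def markers_per_sentence(annotations, sentence_count):
    """Average number of tagged certainty markers per sentence."""
    if sentence_count <= 0:
        raise ValueError("sentence_count must be positive")
    return len(annotations) / sentence_count


def category_counts(annotations, dimension):
    """Count how many annotations fall into each category of one dimension."""
    return Counter(getattr(a, dimension) for a in annotations)


# Invented example data, not taken from the study.
sample = [
    CertaintyAnnotation("certainly", "high", "writer", "unspecified", "future"),
    CertaintyAnnotation("probably", "moderate", "writer", "unspecified", "present"),
]
frequency = markers_per_sentence(sample, sentence_count=4)
level_distribution = category_counts(sample, "level")
```

Comparing `frequency` and the `category_counts` output between an editorial sample and a news story sample would mirror the kind of comparison the abstract reports, under the assumptions stated above.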


Keywords: subjectivity, manual tagging, natural language processing, uncertainty, epistemic comments, evidentials, hedges, certainty expressions, point of view, annotating opinions




Copyright information

© Springer 2006

Authors and Affiliations

  • Victoria L. Rubin (1)
  • Elizabeth D. Liddy (1)
  • Noriko Kando (2)
  1. School of Information Studies, Center for Natural Language Processing, Syracuse University, Syracuse, USA
  2. National Institute of Informatics, Tokyo, Japan
