Distributed and Parallel Databases, Volume 29, Issue 3, pp 239–260

Quantifying the trustworthiness of social media content

  • Sai T. Moturu
  • Huan Liu


The growing popularity of social media in recent years has resulted in the creation of an enormous amount of user-generated content. A significant portion of this information is useful and has proven to be a great source of knowledge. However, since much of this information has been contributed by strangers with little or no apparent reputation to speak of, there is no easy way to detect whether the content is trustworthy. Search engines are the gateways to knowledge but search relevance cannot guarantee that the content in the search results is trustworthy. A casual observer might not be able to differentiate between trustworthy and untrustworthy content. This work is focused on the problem of quantifying the value of such shared content with respect to its trustworthiness. In particular, the focus is on shared health content as the negative impact of acting on untrustworthy content is high in this domain. Health content from two social media applications, Wikipedia and Daily Strength, is used for this study. Sociological notions of trust are used to motivate the search for a solution. A two-step unsupervised, feature-driven approach is proposed for this purpose: a feature identification step in which relevant information categories are specified and suitable features are identified, and a quantification step for which various unsupervised scoring models are proposed. Results indicate that this approach is effective and can be adapted to disparate social media applications with ease.
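
To make the two-step idea concrete, the following is a minimal sketch of what an unsupervised, feature-driven scoring step could look like. It is not the authors' scoring model: the feature names (`refs`, `editors`), the min-max normalization, and the equal-weight average are all assumptions chosen for illustration, since the abstract does not specify the features or the models used.

```python
from typing import Dict, List


def min_max_normalize(values: List[float]) -> List[float]:
    """Rescale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def trust_scores(items: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Score each item as the unweighted mean of its normalized features.

    `items` maps an item id to a dict of hypothetical feature values.
    Missing features are treated as 0.0 (an assumption of this sketch).
    """
    names = list(items)
    if not names:
        return {}
    # Step 1 (feature identification): collect every feature that appears.
    features = sorted({f for feats in items.values() for f in feats})
    if not features:
        return {n: 0.0 for n in names}
    # Step 2 (quantification): normalize each feature, then average.
    normalized = {
        f: min_max_normalize([items[n].get(f, 0.0) for n in names])
        for f in features
    }
    return {
        n: sum(normalized[f][i] for f in features) / len(features)
        for i, n in enumerate(names)
    }


# Hypothetical feature values for three articles.
articles = {
    "a": {"refs": 10, "editors": 5},
    "b": {"refs": 0, "editors": 1},
    "c": {"refs": 5, "editors": 3},
}
scores = trust_scores(articles)
```

Because no labels are involved, the scores are only meaningful relative to one another within the scored collection; a different normalization or weighting would produce a different ranking.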


Trust evaluation Trustworthiness Social media Content Quality 





Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  1. School of Computing, Informatics and Decision Systems Engineering, Arizona State University, Tempe, USA
  2. Media Lab, Massachusetts Institute of Technology, Cambridge, USA
