Inferring Trust from Message Features Using Linear Regression and Support Vector Machines

  • Conference paper
Smart and Innovative Trends in Next Generation Computing Technologies (NGCT 2017)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 828)

Abstract

With the proliferation of social media such as Facebook and Twitter, more and more people depend on it for all sorts of news. Social media, topical forums, wikis, and similar platforms enable a community of end users to interact or cooperate towards a common goal. However, information disseminated through social media is often of low quality because of limited fact checking, low barriers to entry, bias, and rumors, which makes social media a source of fake news as well as genuine news. In this paper we address the problem of classifying information by its reliability, applying machine learning algorithms to freely available user-generated data such as tweets on Twitter. A variety of machine learning algorithms, including linear regression, logistic regression, and naïve Bayes, are used to determine the trust carried by the tweets. In addition to these basic algorithms, we present learning models that combine naïve Bayes with logistic regression and naïve Bayes with linear regression. We also discuss a technique for determining a threshold parameter that is used to classify data with the SMO and linear regression algorithms in order to increase their accuracy; this increase is achieved for the modified linear regression algorithm in most cases. The modified linear regression algorithm, when used in combination with naïve Bayes, achieves better accuracy than all the other algorithms used in this paper, and hybrid classification works better than the individual algorithms.
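The abstract does not say how the threshold parameter is determined, so the following is only a minimal sketch of one plausible approach: fit a linear regression to 0/1 trust labels, then sweep candidate cut-offs over the regression scores and keep the most accurate one. The single feature, the toy data, and the helper names (`fit_line`, `accuracy`, `best_threshold`) are all assumptions made for illustration and are not taken from the paper.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for one feature: y ~ slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def accuracy(scores, labels, threshold):
    """Fraction of items whose thresholded score matches the 0/1 label."""
    hits = sum((s >= threshold) == bool(y) for s, y in zip(scores, labels))
    return hits / len(labels)

def best_threshold(scores, labels):
    """Try each midpoint between neighbouring distinct scores; keep the most accurate."""
    distinct = sorted(set(scores))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])] or distinct
    return max(candidates, key=lambda t: accuracy(scores, labels, t))

# Toy data (invented for illustration): one numeric feature per message,
# label 1 = trustworthy, 0 = not. The classes are deliberately imbalanced.
feature = [1, 2, 3, 4, 5, 6, 7, 8]
labels = [0, 0, 0, 0, 0, 0, 0, 1]

slope, intercept = fit_line(feature, labels)
scores = [slope * x + intercept for x in feature]

default_acc = accuracy(scores, labels, 0.5)   # naive cut-off at 0.5
tuned = best_threshold(scores, labels)        # cut-off chosen by the sweep
tuned_acc = accuracy(scores, labels, tuned)
print(f"threshold 0.5: {default_acc:.3f}, tuned {tuned:.3f}: {tuned_acc:.3f}")
```

On this imbalanced toy set the regression scores all fall below 0.5, so the naive cut-off misses the single positive example while the swept cut-off separates it. The sweep here is scored on the same data it was fitted on; in practice the threshold would presumably be chosen on held-out data.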



Author information

Correspondence to Manzoor Ahmad.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Basharat, S., Ahmad, M. (2018). Inferring Trust from Message Features Using Linear Regression and Support Vector Machines. In: Bhattacharyya, P., Sastry, H., Marriboyina, V., Sharma, R. (eds) Smart and Innovative Trends in Next Generation Computing Technologies. NGCT 2017. Communications in Computer and Information Science, vol 828. Springer, Singapore. https://doi.org/10.1007/978-981-10-8660-1_44

  • DOI: https://doi.org/10.1007/978-981-10-8660-1_44

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-8659-5

  • Online ISBN: 978-981-10-8660-1

  • eBook Packages: Computer Science, Computer Science (R0)
