Abstract
With the proliferation of social media such as Facebook and Twitter, more and more people have come to depend on these platforms for all sorts of news. Social media, topical forums, wikis, and similar platforms enable a community of end users to interact or cooperate towards a common goal. However, information disseminated through social media is often of low quality because of limited fact checking, low barriers to entry, bias, and rumors, which makes social media a source not only of genuine news but also of fake news. In this paper we address the problem of classifying information by its reliability, applying machine learning algorithms to freely available user-generated data such as tweets on Twitter. A variety of machine learning algorithms, including linear regression, logistic regression, and naïve Bayes, are used to determine the trust carried by the tweets. In addition to these basic algorithms, we present learning models that combine naïve Bayes with logistic regression and naïve Bayes with linear regression. We also discuss a technique for determining a threshold parameter used to classify data with the SMO and linear regression algorithms in order to increase their accuracy; this improvement is achieved for the modified linear regression algorithm in most cases. The modified linear regression algorithm, when used in combination with naïve Bayes, achieves better accuracy than all the other algorithms used in this paper. Hybrid classification also works better than the individual algorithms.
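The thresholding idea mentioned above can be illustrated with a minimal sketch. The abstract does not specify how the threshold is chosen, so the following is an assumption rather than the authors' procedure: a one-feature least-squares line is fitted to 0/1 trust labels, and the cut-off on the regression output is picked to maximize training accuracy. The toy feature values, the labels, and the helper names (`fit_line`, `accuracy`, `best_threshold`) are all hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for a single feature; returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def accuracy(scores, labels, threshold):
    """Fraction of items whose thresholded score matches the 0/1 label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def best_threshold(scores, labels):
    """Search midpoints between consecutive distinct scores.

    Returns (threshold, accuracy) for the best candidate. This search
    strategy is an assumption made for illustration only.
    """
    ordered = sorted(set(scores))
    candidates = [(a + b) / 2 for a, b in zip(ordered, ordered[1:])] or [0.5]
    return max(((t, accuracy(scores, labels, t)) for t in candidates),
               key=lambda pair: pair[1])


# Hypothetical toy data: one numeric message feature per tweet and a
# 0/1 label (0 = not trusted, 1 = trusted). Not taken from the paper.
features = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

slope, intercept = fit_line(features, labels)
scores = [slope * x + intercept for x in features]
threshold, acc = best_threshold(scores, labels)
print(f"threshold={threshold:.3f}, training accuracy={acc:.2f}")
```

In practice the threshold would be tuned on held-out data rather than the training set, and the regression would use many message features instead of one; the sketch only shows how a continuous regression output can be turned into a class decision.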
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Basharat, S., Ahmad, M. (2018). Inferring Trust from Message Features Using Linear Regression and Support Vector Machines. In: Bhattacharyya, P., Sastry, H., Marriboyina, V., Sharma, R. (eds) Smart and Innovative Trends in Next Generation Computing Technologies. NGCT 2017. Communications in Computer and Information Science, vol 828. Springer, Singapore. https://doi.org/10.1007/978-981-10-8660-1_44
DOI: https://doi.org/10.1007/978-981-10-8660-1_44
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-8659-5
Online ISBN: 978-981-10-8660-1
eBook Packages: Computer Science, Computer Science (R0)