Abstract
A huge number of opinions about companies' brands are now shared on the Web. Such opinions are an important source of information for both customers and companies. Unfortunately, a growing number of these opinions are deceptive, written to mislead consumers either by promoting a low-quality product (positive deceptive) or by criticizing a potentially better-quality product (negative deceptive). This paper focuses on detecting negative deceptive opinions in tweets about specific brands of a company. We developed a classifier that detects negative deceptive opinions by combining the lexical features of a tweet with the personal profile and behavioural features of its writer. One challenge in developing this system is the lack of a labeled dataset for training and testing. To resolve this issue, we collected our own dataset and had each tweet labeled by multiple experts. Our experimental results show that the proposed system is a promising approach for detecting negative deceptive opinions. Our approach can help identify defamers by analyzing the personal profile and writing style of each writer.
Acknowledgments
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education [NRF-2016R1D1A1B03933875].
Copyright information
© 2018 Springer Science+Business Media Singapore
About this paper
Cite this paper
Molla, A., Biadgie, Y., Sohn, KA. (2018). Detecting Negative Deceptive Opinion from Tweets. In: Kim, K., Joukov, N. (eds) Mobile and Wireless Technologies 2017. ICMWT 2017. Lecture Notes in Electrical Engineering, vol 425. Springer, Singapore. https://doi.org/10.1007/978-981-10-5281-1_36
DOI: https://doi.org/10.1007/978-981-10-5281-1_36
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-5280-4
Online ISBN: 978-981-10-5281-1
eBook Packages: Engineering, Engineering (R0)