Soft Computing, Volume 23, Issue 14, pp 5431–5442

Enhanced cross-domain sentiment classification utilizing a multi-source transfer learning approach

  • Farhan Hassan Khan
  • Usman Qamar
  • Saba Bashir
Methodologies and Application

Abstract

Online social networks have become extremely popular as internet access has increasingly reached the general public. Millions of tweets, Facebook messages, and product reviews are posted every day. This huge volume of data presents an opportunity to analyze the sentiment of the masses and thereby support decision making for the betterment of society. Sentiment analysis is the research area that quantifies the opinions expressed in natural language. It combines several research fields, such as text mining, natural language processing, artificial intelligence, and statistics. The application of supervised machine learning algorithms is limited by the unavailability of labeled data, whereas unsupervised or lexicon-based methodologies show weak performance. This scenario sets the stage for transfer learning, or cross-domain learning, in which knowledge learned from a source domain is applied to a target domain. The proposed approach computes feature weights by applying the cosine similarity measure to SentiWordNet and generates revised sentiment scores. Model learning is performed by a support vector machine under two experimental settings: single source and multiple target domains, and multiple source and single target domains (MSST). Nine benchmark datasets were employed for performance evaluation. The best performance was obtained in the MSST setting, with 85.05% accuracy, 85.01% precision, 85.10% recall, and 85.05% F-measure. A comparison with state-of-the-art methods shows that the cosine similarity-based transfer learning approach outperforms other approaches.
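The building blocks named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which vectors are compared or how the metrics are averaged, so the helper names (`cosine_similarity`, `msst_splits`, `binary_metrics`) and the domain names are hypothetical placeholders. The sketch shows only the generic cosine similarity measure, how an MSST-style evaluation might pair each target domain with all remaining source domains, and how the four reported metrics are conventionally computed for a binary task.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def msst_splits(domains: Sequence[str]) -> List[Tuple[List[str], str]]:
    """Assumed MSST protocol: each domain is the target once, all others are sources."""
    names = sorted(domains)
    return [([s for s in names if s != target], target) for target in names]


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F-measure, treating label 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Hypothetical domain names, used only to show the shape of the splits.
    for sources, target in msst_splits(["books", "dvd", "electronics"]):
        print(f"train on {sources}, test on {target}")
    print(cosine_similarity([1.0, 2.0, 0.0], [2.0, 4.0, 0.0]))
    print(binary_metrics([1, 1, 0, 0], [1, 0, 0, 0]))
```

In a fuller pipeline, the classifier trained on the pooled source domains would produce the `y_pred` labels passed to `binary_metrics`; that training step is left out here because the abstract gives no detail beyond naming a support vector machine.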

Keywords

Cross-domain, Sentiment analysis, Classification, Transfer learning, Support vector machine, SentiWordNet

Notes

Compliance with ethical standards

Conflict of interest

All the authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  • Farhan Hassan Khan (1), corresponding author
  • Usman Qamar (1)
  • Saba Bashir (1, 2)

  1. Knowledge and Data Science Research Center (KDRC), Department of Computer and Software Engineering, College of Electrical and Mechanical Engineering, National University of Sciences and Technology (NUST), Islamabad, Pakistan
  2. Department of Computer Science, Federal Urdu University of Arts, Science and Technology (FUUAST), Islamabad, Pakistan