
Sentiment analysis using convolutional neural network via word embeddings

  • Nadia Nedjah
  • Igor Santos
  • Luiza de Macedo Mourelle
Special Issue

Abstract

Convolutional neural networks are known for their excellent performance in computer vision, where they achieve state-of-the-art results. Moreover, recent research has shown that these networks can also provide promising results for natural language processing. In this case, the basic idea is to concatenate the vector representations of the words of a sentence into a single block and treat it as an image. However, despite the good results, a difficulty in using convolutional networks is the large number of design decisions that must be made a priori. These models require the definition of many hyper-parameters, including the type of word embeddings, which provide the vectorized representation of the data; the activation function, which imparts non-linearity to the model; the size of the filter that performs the convolution; the number of feature maps, which are responsible for identifying the attributes; and the pooling method used for data reduction. In addition, one must also predefine the regularization constant and the dropout rate, which are responsible for avoiding over-fitting of the network. Existing research works present convolutional neural network architectures capable of outperforming traditional machine learning models. Even though these can compete with more complex models, the question of how different settings of the hyper-parameters affect the performance of this type of network has not yet been explored. In this paper, we propose an efficient sentiment analysis classifier using convolutional neural networks by analyzing the impact of the hyper-parameters on the model performance. The main interest in analyzing sentiment comes from the advent of social media and the technological advances that flood the Internet with opinions. Nonetheless, mining the Internet for opinion and sentiment analysis is not an easy task and thus requires outstanding models with the best hyper-parameter settings to obtain pertinent answers. The results, obtained with the use of a GPU, show that the different configurations exceed the reference models in most cases, with gains of up to 18%, and perform similarly to state-of-the-art models, with gains of up to 2% in some cases.
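To make the enumerated hyper-parameters concrete, the sketch below shows where each one appears in a Kim-style text CNN built with TensorFlow/Keras. This is an illustrative assumption, not the authors' implementation; all numeric values are placeholders rather than the tuned settings reported in the paper.

    # Minimal sketch (assumed, not the paper's code) of a CNN for sentence-level
    # sentiment classification, exposing the hyper-parameters listed in the abstract.
    import tensorflow as tf
    from tensorflow.keras import layers, regularizers

    vocab_size   = 20000   # vocabulary size (placeholder)
    embed_dim    = 300     # word-embedding dimension (e.g. word2vec, GloVe, fastText)
    max_length   = 60      # sentence length after padding (placeholder)
    filter_size  = 3       # size of the convolution filter (hyper-parameter)
    num_filters  = 100     # number of feature maps (hyper-parameter)
    dropout_rate = 0.5     # dropout rate (hyper-parameter)
    l2_constant  = 3e-4    # regularization constant (hyper-parameter)

    inputs = tf.keras.Input(shape=(max_length,), dtype="int32")
    # Word embeddings: the vectorized representation of the data; the embedding
    # type itself is one of the hyper-parameters studied in the paper.
    x = layers.Embedding(vocab_size, embed_dim)(inputs)
    # Convolution over the "image" formed by stacking the word vectors of a sentence.
    x = layers.Conv1D(num_filters, filter_size,
                      activation="relu",                                 # activation function
                      kernel_regularizer=regularizers.l2(l2_constant))(x)
    x = layers.GlobalMaxPooling1D()(x)                                   # pooling method
    x = layers.Dropout(dropout_rate)(x)                                  # dropout against over-fitting
    outputs = layers.Dense(1, activation="sigmoid")(x)                   # binary sentiment output

    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

Varying any of the named constants (or swapping the embedding, activation, or pooling choice) yields one of the configurations whose impact on performance the paper analyzes.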

Keywords

Sentiment analysis · Crowd intelligence · Convolutional neural networks · Text embeddings

Notes

Acknowledgements

This work is supported by CAPES, the Coordination of Improvement of Higher Education Personnel of the Brazilian Federal Government. This study is also funded by FAPERJ (Fundação de Amparo à Pesquisa do Estado do Rio de Janeiro) via the grant number 203.111/2018.


Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  • Nadia Nedjah (1)
  • Igor Santos (1)
  • Luiza de Macedo Mourelle (2)
  1. Department of Electronics Engineering and Telecommunications, Faculty of Engineering, State University of Rio de Janeiro, Rio de Janeiro, Brazil
  2. Department of Systems Engineering and Computation, Faculty of Engineering, State University of Rio de Janeiro, Rio de Janeiro, Brazil
