Sentiment Analysis of Code-Mixed Bambara-French Social Media Text Using Deep Learning Techniques

Abstract

The global growth of the Internet and the rapid expansion of social networks such as Facebook make multilingual sentiment analysis of social media content increasingly necessary. This paper performs the first sentiment analysis of code-mixed Bambara-French Facebook comments. We develop four Long Short-Term Memory (LSTM)-based models and two Convolutional Neural Network (CNN)-based models, and compare these six models with Naïve Bayes and Support Vector Machines (SVM) in experiments on a dataset constituted for this study. Because social media text written in Bambara is scarce, this paper uses dictionaries of character and word indexes to produce character and word embeddings in place of pre-trained word vectors. We also investigate the effect of comment length on the models and compare their performance. The best-performing model is a one-layer CNN deep learning model with an accuracy of 83.23%.
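
To illustrate the embedding strategy described in the abstract, the following minimal sketch (not the authors' code) builds a word-index dictionary from the corpus itself and feeds the resulting trainable embeddings into a one-layer CNN classifier. It assumes a Keras-style API; the vocabulary, layer sizes, and maximum comment length are illustrative, not taken from the paper.

    from tensorflow.keras import layers, models

    # Hypothetical word-index dictionary built from the Bambara-French corpus;
    # integer indexes replace pre-trained word vectors, which are unavailable for Bambara.
    word_index = {"<pad>": 0, "<unk>": 1, "i": 2, "ni": 3, "ce": 4, "merci": 5}
    vocab_size = len(word_index)
    max_len = 30  # assumed maximum comment length after padding

    model = models.Sequential([
        layers.Input(shape=(max_len,)),        # integer-encoded, padded comments
        # Embedding weights are learned during training rather than pre-trained.
        layers.Embedding(input_dim=vocab_size, output_dim=50),
        layers.Conv1D(filters=100, kernel_size=3, activation="relu"),  # one-layer CNN
        layers.GlobalMaxPooling1D(),
        layers.Dense(1, activation="sigmoid"),  # positive vs. negative polarity
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

The same idea applies at the character level, with a separate character-index dictionary feeding a character embedding layer.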

Key words

sentiment analysis; code-mixed Bambara-French Facebook comments; deep learning; Long Short-Term Memory (LSTM); Convolutional Neural Network (CNN)

CLC number

TP 391.1 

Copyright information

© Wuhan University and Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. State Key Laboratory of Software Engineering / School of Computer, Wuhan University, Wuhan, Hubei, China
  2. Collaborative Innovation Center of Geospatial Technology, Wuhan, Hubei, China
