TwitterBERT: Framework for Twitter Sentiment Analysis Based on Pre-trained Language Model Representations

Azzouza, Noureddine; Akli-Astouati, Karima; Ibrahim, Roliana

doi:10.1007/978-3-030-33582-3_41

Conference paper
First Online: 02 November 2019

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1073)

Abstract

Sentiment analysis is a long-standing research topic in language understanding, yet the neural networks applied to it remain deficient in some respects. Most current studies identify sentiment by focusing on vocabulary and syntax. The task is well established in Natural Language Processing (NLP), where Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) have been employed to obtain notable results. In this study, we propose a four-phase framework for Twitter Sentiment Analysis. The framework uses the pre-trained Bidirectional Encoder Representations from Transformers (BERT) model as an encoder to generate sentence representations. To make more effective use of this model, we deploy various classification models. Additionally, we concatenate pre-trained word embeddings with the BERT representation to enhance sentiment classification. Experimental results show better performance than the baseline framework on all datasets. For example, our best model attains an F1-score of 71.82% on the SemEval 2017 dataset. A comparative analysis of the experimental results offers recommendations on choosing pre-training steps to obtain improved results. The outcomes of the experiments confirm the effectiveness of our system.
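
The concatenation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the two representations are combined, so the averaging of word vectors, the function names, and the toy values below are all assumptions made for illustration, not the authors' implementation. The sentence vector is a hand-written stand-in for a BERT encoder output (BERT-base produces 768-dimensional vectors; four dimensions are used here to keep the example readable).

```python
from typing import Dict, List, Sequence


def average_word_vectors(
    tokens: Sequence[str], table: Dict[str, List[float]], dim: int
) -> List[float]:
    """Mean of the static vectors of the tokens present in `table`.

    Out-of-vocabulary tokens are skipped; a zero vector is returned
    when no token is found. (Averaging is an assumption of this sketch.)
    """
    found = [table[t] for t in tokens if t in table]
    if not found:
        return [0.0] * dim
    return [sum(column) / len(found) for column in zip(*found)]


def concatenate_features(
    sentence_vec: Sequence[float], word_vec: Sequence[float]
) -> List[float]:
    """Join the sentence-level and word-level vectors into one feature vector."""
    return list(sentence_vec) + list(word_vec)


# Hypothetical stand-in for a BERT sentence representation.
toy_sentence_vec = [0.2, -0.1, 0.4, 0.0]

# Hypothetical stand-in for a pre-trained word-embedding table.
toy_table = {"great": [1.0, 0.0, 0.5], "phone": [0.0, 1.0, 0.5]}

tokens = ["great", "phone", "lol"]  # "lol" is out of vocabulary here
word_vec = average_word_vectors(tokens, toy_table, dim=3)
features = concatenate_features(toy_sentence_vec, word_vec)

print(len(features))  # 4 sentence dimensions + 3 word dimensions
print(features)
```

The resulting feature vector would then be passed to whichever classifier is being evaluated; the sketch stops before that step because the abstract does not name the classification models.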

Author information

Authors and Affiliations

FEI - Department of Computer Science, RIIMA Laboratory, University of Science and Technology Houari Boumediene, Bab ezzouar, Algiers, Algeria
Noureddine Azzouza & Karima Akli-Astouati
School of Computing, Faculty of Engineering, Universiti Teknologi Malaysia (UTM), 81310, Johor Bahru, Johor, Malaysia
Roliana Ibrahim

Corresponding author

Correspondence to Noureddine Azzouza.

Editor information

Editors and Affiliations

College of Computer Science and Engineering, Taibah University, Medina, Saudi Arabia
Faisal Saeed
School of Computing, Universiti Utara Malaysia (UUM), Sintok, Kedah Darul Aman, Malaysia
Fathey Mohammed
Management of Information Systems Department College of Business Administration, Taibah University, Yanbu, Saudi Arabia
Nadhmi Gazem

Cite this paper

Azzouza, N., Akli-Astouati, K., Ibrahim, R. (2020). TwitterBERT: Framework for Twitter Sentiment Analysis Based on Pre-trained Language Model Representations. In: Saeed, F., Mohammed, F., Gazem, N. (eds) Emerging Trends in Intelligent Computing and Informatics. IRICT 2019. Advances in Intelligent Systems and Computing, vol 1073. Springer, Cham. https://doi.org/10.1007/978-3-030-33582-3_41

DOI: https://doi.org/10.1007/978-3-030-33582-3_41
Published: 02 November 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33581-6
Online ISBN: 978-3-030-33582-3
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)
