Abstract
Most existing work on semantic orientation analysis focuses on distinguishing two polarities (positive and negative). In this paper, we propose a lexicon-based multi-class semantic orientation analysis for microblogs. To better capture social attention to public events, we add Concern to the conventional psychological classes of sentiment and build a sentiment lexicon with five categories (Concern, Joy, Blue, Anger, Fear). The seed words of the lexicon are extracted from HowNet, NTUSD, and catchwords from Sina Weibo posts. HowNet semantic similarity is then used to detect additional sentiment words and enrich the lexicon. Each Weibo post is accordingly represented as a multi-dimensional numerical vector in feature space. To classify the posts, we adopt a Semi-Supervised Gaussian Mixture Model (Semi-GMM) and an adaptive K-nearest neighbour (KNN) classifier that uses symmetric Kullback-Leibler divergence (KL-divergence) as its similarity measure. We compare the proposed methods with several competitive baselines, e.g., majority vote, KNN with cosine similarity, and SVM. The experimental evaluation shows that the proposed methods outperform the other approaches by a large margin in terms of accuracy and F1 score.
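As a rough illustration of the KNN step, the sketch below classifies a post vector by symmetric KL divergence to labelled neighbours. It is not the authors' implementation. It assumes one vector dimension per lexicon category, additive smoothing with a small `eps` so that zero counts do not break the logarithm, the plain sum KL(p||q) + KL(q||p) as the symmetrization, and a fixed `k` rather than the adaptive scheme the abstract mentions. The function names and sample data are hypothetical.

```python
import math
from collections import Counter

# Assumption: one feature dimension per lexicon category, in this order.
CATEGORIES = ("Concern", "Joy", "Blue", "Anger", "Fear")


def normalize(vec, eps=1e-9):
    """Turn non-negative category scores into a smoothed probability distribution."""
    smoothed = [v + eps for v in vec]  # eps avoids log(0); an assumed smoothing choice
    total = sum(smoothed)
    return [v / total for v in smoothed]


def symmetric_kl(p, q):
    """Return KL(p||q) + KL(q||p), which simplifies to sum((p - q) * log(p / q))."""
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))


def knn_predict(query, labeled, k=3):
    """Majority vote over the k labelled vectors closest to `query` in symmetric KL."""
    q = normalize(query)
    ranked = sorted(labeled, key=lambda item: symmetric_kl(q, normalize(item[0])))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical labelled posts: (category score vector, label).
    labeled_posts = [
        ([5, 1, 0, 0, 0], "Concern"),
        ([4, 2, 0, 0, 0], "Concern"),
        ([0, 6, 1, 0, 0], "Joy"),
        ([0, 5, 0, 0, 0], "Joy"),
        ([0, 0, 0, 4, 1], "Anger"),
    ]
    print(knn_predict([3, 1, 0, 0, 0], labeled_posts, k=3))
```

Because KL divergence is defined on probability distributions, the vectors are normalized before comparison; swapping `symmetric_kl` for a cosine distance would give the cosine-similarity KNN baseline mentioned above.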
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Li, Y., Li, X., Li, F., Zhang, X. (2014). A Lexicon-Based Multi-class Semantic Orientation Analysis for Microblogs. In: Chen, L., Jia, Y., Sellis, T., Liu, G. (eds) Web Technologies and Applications. APWeb 2014. Lecture Notes in Computer Science, vol 8709. Springer, Cham. https://doi.org/10.1007/978-3-319-11116-2_8
DOI: https://doi.org/10.1007/978-3-319-11116-2_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11115-5
Online ISBN: 978-3-319-11116-2
eBook Packages: Computer Science (R0)