Abstract
This paper adopts a bootstrapping-based semi-supervised method to analyze a Sina microblog data set of about 269 M in size. A Support Vector Machine (SVM) is used for both subjective/objective classification and polarity classification. Starting from a small labeled corpus, the method automatically enlarges the set of seed samples, and the iterative retraining improves the sentiment classification ability of the SVM. A weighting factor that controls the weight of newly added seed samples in subsequent training further improves classification performance. The experimental results show that bootstrapping-based sentiment analysis of Chinese microblogs not only saves much of the time needed for manual annotation but also achieves better performance. The best accuracy is 62.9% for subjective/objective classification and 57% for sentiment polarity classification.
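The bootstrapping loop summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no feature set, confidence rule, or weighting scheme, so everything here is an assumption made for illustration. The sketch uses toy 2-D points in place of microblog features, a small linear SVM trained by subgradient descent on the hinge loss, a hypothetical confidence `threshold` on the decision score, and a hypothetical `new_seed_weight` that scales the loss of pseudo-labeled samples.

```python
def train_linear_svm(X, y, sample_weights, epochs=200, lr=0.1, reg=0.01):
    """Train a linear SVM by subgradient descent on a weighted hinge loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi, ci in zip(X, y, sample_weights):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # L2 regularisation shrinks w a little on every step.
            w = [wj * (1.0 - lr * reg) for wj in w]
            if margin < 1.0:
                # Hinge-loss step, scaled by this sample's weight.
                w = [wj + lr * ci * yi * xj for wj, xj in zip(w, xi)]
                b += lr * ci * yi
    return w, b


def decision(model, x):
    """Signed score; the sign is the predicted class."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def bootstrap(seed_X, seed_y, unlabeled, new_seed_weight=0.5,
              threshold=0.8, max_rounds=5):
    """Self-training: repeatedly add confident predictions as new seeds.

    new_seed_weight and threshold are hypothetical parameters, not values
    taken from the paper.
    """
    X, y = list(seed_X), list(seed_y)
    weights = [1.0] * len(X)          # hand-labeled seeds keep full weight
    pool = list(unlabeled)
    model = train_linear_svm(X, y, weights)
    for _ in range(max_rounds):
        scored = [(x, decision(model, x)) for x in pool]
        confident = [(x, s) for x, s in scored if abs(s) >= threshold]
        if not confident:
            break                     # nothing confident enough to add
        for x, s in confident:
            X.append(x)
            y.append(1 if s > 0 else -1)
            weights.append(new_seed_weight)   # down-weighted pseudo-label
        pool = [x for x, s in scored if abs(s) < threshold]
        model = train_linear_svm(X, y, weights)
    return model, len(X) - len(seed_X)


# Toy data: label +1 clusters in the upper right, -1 in the lower left.
seed_X = [(2.0, 2.0), (1.5, 2.5), (-2.0, -2.0), (-2.5, -1.5)]
seed_y = [1, 1, -1, -1]
unlabeled = [(3.0, 3.0), (-3.0, -3.0), (1.0, 1.2), (0.2, -0.1)]

model, added = bootstrap(seed_X, seed_y, unlabeled)
print("pseudo-labeled samples added:", added)
print("score for (4, 4):", decision(model, (4.0, 4.0)))
```

Giving pseudo-labeled samples a weight below 1 limits the damage a wrong automatic label can do in later rounds, which is one plausible reading of the weighting factor mentioned in the abstract.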
Acknowledgments
This work is supported by the National Natural Science Foundation of China (Nos. 61073130 and 61173073) and the project of National High Technology Research and Development Program of China (863 Program) (No. 2011AA01A207).
Copyright information
© 2013 Springer Science+Business Media New York
About this paper
Cite this paper
Zhu, S., Xu, B., Zheng, D., Zhao, T. (2013). Chinese Microblog Sentiment Analysis Based on Semi-supervised Learning. In: Li, J., Qi, G., Zhao, D., Nejdl, W., Zheng, HT. (eds) Semantic Web and Web Science. Springer Proceedings in Complexity. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-6880-6_28
DOI: https://doi.org/10.1007/978-1-4614-6880-6_28
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-6879-0
Online ISBN: 978-1-4614-6880-6
eBook Packages: Computer Science, Computer Science (R0)