Abstract
Sentiment analysis has attracted considerable interest from companies and organizations in Thailand. However, few research studies have focused on developing a Thai sentiment lexicon, which is an important resource for sentiment analysis. In this work, we developed a web-based tool for constructing a Thai sentiment lexicon. The tool employs a semi-supervised approach to extract sentiment lexicon entries semi-automatically. To reduce the negative impact of an unreliable parser, we use simple heuristic rules and mutual information to recognize sentiment words and their features. The polarity of the recognized sentiment words is identified automatically through a bootstrapping process that uses a small set of sentiment seeds, context coherency characteristics, and statistical co-occurrence. In the evaluation, the tool achieved fair results on the lexicon construction task: F-scores of 76.06 and 75.28 for hotel reviews and laptop reviews, respectively.
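The two statistical ingredients named in the abstract, mutual information and seed-based bootstrapping, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the sentence-level co-occurrence window, the rule that a contrastive conjunction flips polarity, and the English placeholder tokens (standing in for segmented Thai text) are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(sentences):
    """Pointwise mutual information for word pairs that share a sentence.

    Assumption: co-occurrence is counted once per sentence.
    """
    n = len(sentences)
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in sentences:
        unique = sorted(set(tokens))
        word_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return {
        (a, b): math.log2((c / n) / ((word_counts[a] / n) * (word_counts[b] / n)))
        for (a, b), c in pair_counts.items()
    }


def bootstrap_polarity(sentences, candidates, seeds, contrastive=frozenset({"but"})):
    """Spread seed polarity (+1 / -1) to neighbouring candidate words.

    Assumption: adjacent sentiment words in a sentence share polarity
    unless a contrastive conjunction stands between them.
    """
    polarity = dict(seeds)
    changed = True
    while changed:  # repeat until no new word receives a label
        changed = False
        for tokens in sentences:
            prev, flip = None, 1
            for tok in tokens:
                if tok in contrastive:
                    flip = -flip
                elif tok in candidates:
                    if prev is not None:
                        if prev in polarity and tok not in polarity:
                            polarity[tok] = polarity[prev] * flip
                            changed = True
                        elif tok in polarity and prev not in polarity:
                            polarity[prev] = polarity[tok] * flip
                            changed = True
                    prev, flip = tok, 1
    return polarity


# Hypothetical toy reviews; real input would be word-segmented Thai text.
sentences = [
    ["room", "clean", "but", "noisy"],
    ["staff", "friendly", "and", "helpful"],
    ["bathroom", "dirty", "and", "noisy"],
]
candidates = {"clean", "noisy", "friendly", "helpful", "dirty"}
seeds = {"clean": 1, "friendly": 1}

pmi = pmi_table(sentences)
polarity = bootstrap_polarity(sentences, candidates, seeds)
```

In this toy run, "noisy" is labelled from the seed "clean" across the contrastive conjunction, and "dirty" is then labelled from "noisy" on a later sentence, which is the kind of iterative spread that a bootstrapping process relies on.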
Notes
1. Here, the relation mod* represents all types of modifier relations.
2. Examples of these conjunctions include แต่/but, อย่างไรก็ดี/however, and นอกจากนี้/besides.
3. The hotel reviews were collected from the Agoda website: http://www.agoda.co.th.
4. The laptop reviews were collected from the Notebookspec website: http://www.notebookspec.com.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Chanlekha, H., Damdoung, W., Suktarachan, M. (2018). The Development of Semi-automatic Sentiment Lexicon Construction Tool for Thai Sentiment Analysis. In: Theeramunkong, T., Kongkachandra, R., Supnithi, T. (eds) Advances in Natural Language Processing, Intelligent Informatics and Smart Technology. SNLP 2016. Advances in Intelligent Systems and Computing, vol 684. Springer, Cham. https://doi.org/10.1007/978-3-319-70016-8_9
DOI: https://doi.org/10.1007/978-3-319-70016-8_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70015-1
Online ISBN: 978-3-319-70016-8
eBook Packages: Engineering, Engineering (R0)