Abstract
In the era of fast, big data, readers want to grasp the trend or big picture of a story instantly. This work seeks an automatic approach for extracting good-enough key terms for each event that appears in the Thai Twitter community. The core idea is to reduce the time humans spend on key term extraction while keeping the quality of the selected key terms acceptable to humans and better than that of our previous implementation. The approaches we study focus on the Thai language and cover preprocessing, feature selection, and weighting schemes, applied to three real Thai tweet events with different characteristics. Our experiments comprise four main approaches and a number of hypotheses. Our findings confirm the usefulness of hashtag terms of five or more characters, the benefit of bigrams with stop words, and the importance of event characteristics. In fact, we conclude that different approaches should be used for different types of events. Performance and rationale are evaluated by statistical analysis, evaluator voting, and F-score measurement, and the results are confirmed to improve on our previous work by a factor of two.
© 2018 Springer International Publishing AG, part of Springer Nature
Cite this paper
Piyatumrong, A., Sangkeettrakarn, C., Haruechaiyasak, C., Kongthon, A. (2018). Finding Key Terms Representing Events from Thai Twitter. In: Theeramunkong, T., Kongkachandra, R., Supnithi, T. (eds) Advances in Natural Language Processing, Intelligent Informatics and Smart Technology. SNLP 2016. Advances in Intelligent Systems and Computing, vol 684. Springer, Cham. https://doi.org/10.1007/978-3-319-70016-8_7
DOI: https://doi.org/10.1007/978-3-319-70016-8_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70015-1
Online ISBN: 978-3-319-70016-8
eBook Packages: Engineering, Engineering (R0)