Abstract
Since the early 2000s, sentiment analysis has grown into one of the most active research areas in Natural Language Processing (NLP). Researchers have shown strong interest in building automated sentiment analysis applications for English and for other languages such as Arabic, French, German, Chinese, and Italian. Yet very little research has addressed opinionated Malay social media text, despite the large number of Malay speakers, reported to be approximately 215 million native speakers worldwide. This paper proposes a framework that tackles some of the most common challenges posed by Malay social media text (informal text). The features discussed are the handling of Bahasa Rojak, also known as mixed language (Malay-English); the handling of Bahasa SMS; the proper handling of emoticons; and the handling of valence shifters. As a result, the RojakLex lexicon was constructed. It combines four lexicons: (1) MySentiDic, a Malay lexicon; (2) an English lexicon, a translated version of MySentiDic; (3) an emoticon lexicon, a combination of nine well-known lists of commonly used online emoticons; and (4) a neologism lexicon, consisting of common neologisms used in Malay social media text. The proposed system reaches an accuracy of 79.28%, compared with only 51.38% for the baseline. The findings and their implications are discussed further.
Contact author at k.chekima@gmail.com for a copy of RojakLex.
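The kind of lexicon-based pipeline the abstract describes can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: every lexicon entry, the SMS normalization table, the valence-shifter weights, and the function names (`tokenize`, `score`, `classify`) are hypothetical stand-ins, since the abstract does not give the contents of RojakLex or the scoring rules.

```python
import re

# Hypothetical toy entries; the real RojakLex contents are not given in the abstract.
MALAY = {"bagus": 1, "cantik": 1, "teruk": -1, "benci": -1}   # stand-in for MySentiDic
ENGLISH = {"good": 1, "nice": 1, "bad": -1, "hate": -1}       # stand-in for the translated lexicon
EMOTICONS = {":)": 1, ":D": 1, ":(": -1}                      # stand-in for the emoticon lexicon
NEOLOGISMS = {"best": 1, "gempak": 1}                         # stand-in for the neologism lexicon

# Combined toy lexicon, mirroring the idea of merging four lexicons into one.
TOY_ROJAKLEX = {**MALAY, **ENGLISH, **EMOTICONS, **NEOLOGISMS}

# Assumed Bahasa SMS normalization table (abbreviation -> standard form).
SMS_MAP = {"x": "tidak", "tak": "tidak", "sgt": "sangat", "bgs": "bagus"}

# Assumed valence shifters: negators flip polarity, intensifiers scale it.
NEGATORS = {"tidak", "bukan", "not"}
INTENSIFIERS = {"sangat": 1.5, "very": 1.5}


def tokenize(text):
    """Split on whitespace, keep emoticons whole, strip punctuation from words."""
    tokens = []
    for raw in text.split():
        if raw in EMOTICONS:
            tokens.append(raw)
        else:
            word = re.sub(r"[^\w]", "", raw.lower())
            if word:
                tokens.append(word)
    return tokens


def score(text):
    """Sum lexicon polarities, applying any pending valence shifter to the next sentiment term."""
    tokens = [SMS_MAP.get(t, t) for t in tokenize(text)]
    total = 0.0
    modifier = 1.0
    for tok in tokens:
        if tok in NEGATORS:
            modifier *= -1.0
        elif tok in INTENSIFIERS:
            modifier *= INTENSIFIERS[tok]
        elif tok in TOY_ROJAKLEX:
            total += modifier * TOY_ROJAKLEX[tok]
            modifier = 1.0
        else:
            modifier = 1.0  # a shifter only affects the immediately following term
    return total


def classify(text):
    """Map the score to a label; the zero threshold is an assumption."""
    s = score(text)
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "neutral"


print(classify("makanan ni x bgs :("))
print(classify("movie ni sgt best :)"))
```

In this sketch the mixed-language case is covered simply by looking every token up in one merged dictionary, so Malay and English sentiment words in the same sentence are scored the same way. A real system would need a far larger lexicon and a more careful shifter scope than the one-token window assumed here.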
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Chekima, K., Alfred, R. (2018). Sentiment Analysis of Malay Social Media Text. In: Alfred, R., Iida, H., Ag. Ibrahim, A., Lim, Y. (eds) Computational Science and Technology. ICCST 2017. Lecture Notes in Electrical Engineering, vol 488. Springer, Singapore. https://doi.org/10.1007/978-981-10-8276-4_20
DOI: https://doi.org/10.1007/978-981-10-8276-4_20
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-8275-7
Online ISBN: 978-981-10-8276-4
eBook Packages: Engineering, Engineering (R0)