Factors Affecting Sentiment Prediction of Malay News Headlines Using Machine Learning Approaches

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 652)

Abstract

Most sentiment analysis research is carried out with supervised machine learning techniques. Analyzing the sentiment of English text reviews is a non-trivial task that is undertaken to gauge public perception and acceptance of a particular issue. Nevertheless, few studies have analyzed the sentiment of Malay news headlines, owing to a lack of resources and tools. Malay news headlines normally consist of only a few words and are often written creatively to attract readers' attention. This paper proposes a standard framework for investigating the factors that affect sentiment prediction of Malay news headlines using machine learning approaches. Investigating these factors (e.g., the type of classifier, the proximity measurement, and the number of nearest neighbors, k) is important because it helps identify the parameters that can be tuned to optimize prediction performance. Based on the results obtained, the Support Vector Machine and Naïve Bayes classifiers achieved higher accuracy than the k-Nearest Neighbors (k-NN) classifier. In terms of proximity measurement and the number of nearest neighbors, k, the k-NN classifier achieved higher prediction performance when Cosine similarity was applied with a small value of k (e.g., 3 and 5) than when Euclidean distance was applied, because Euclidean distance can be affected by the high dimensionality of the data.
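
To make the two proximity measurements concrete, the following is a minimal sketch of a k-NN text classifier that can rank neighbors by either Cosine similarity or Euclidean distance over bag-of-words counts. It is not the authors' implementation: the function names (`vectorize`, `knn_predict`, and so on), the raw term-count weighting, and the toy English headlines are all assumptions made for illustration, and none of the data comes from the paper's Malay corpus.

```python
import math
from collections import Counter


def vectorize(text):
    """Turn a headline into a sparse bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors (1.0 = same direction)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two sparse count vectors (0.0 = identical)."""
    terms = set(a) | set(b)
    return math.sqrt(sum((a[t] - b[t]) ** 2 for t in terms))


def knn_predict(train, query, k=3, metric="cosine"):
    """Majority vote among the k training headlines closest to the query."""
    q = vectorize(query)
    if metric == "cosine":
        # Negate similarity so that sorting ascending puts the most similar first.
        scored = [(-cosine_similarity(vectorize(t), q), label) for t, label in train]
    else:
        scored = [(euclidean_distance(vectorize(t), q), label) for t, label in train]
    scored.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Invented toy headlines (hypothetical, not taken from the paper's dataset).
TRAIN = [
    ("profits rise sharply", "pos"),
    ("exports rise again", "pos"),
    ("team wins final", "pos"),
    ("floods destroy homes", "neg"),
    ("prices rise hurting families", "neg"),
    ("factory fire kills workers", "neg"),
]

if __name__ == "__main__":
    for metric in ("cosine", "euclidean"):
        print(metric, knn_predict(TRAIN, "profits rise", k=3, metric=metric))
```

On this tiny example both measurements agree. The sketch only shows where the choice of measurement and of k enters the classifier, so it says nothing about the accuracy differences reported in the abstract.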

Author information

Corresponding author

Correspondence to Rayner Alfred.

Copyright information

© 2016 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Alfred, R., Yee, W.W., Lim, Y., Obit, J.H. (2016). Factors Affecting Sentiment Prediction of Malay News Headlines Using Machine Learning Approaches. In: Berry, M., Hj. Mohamed, A., Yap, B. (eds) Soft Computing in Data Science. SCDS 2016. Communications in Computer and Information Science, vol 652. Springer, Singapore. https://doi.org/10.1007/978-981-10-2777-2_26

  • DOI: https://doi.org/10.1007/978-981-10-2777-2_26

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-2776-5

  • Online ISBN: 978-981-10-2777-2

  • eBook Packages: Computer Science, Computer Science (R0)
