Abstract
Sentiment mining is an important part of the broader field of text mining. It tries to deduce people's attitudes towards a specific item, such as merchandise, politics, or sports, from sources such as social media comments and review sites. Among the many issues in sentiment mining, analysis, and classification, a major one is that reviews and comments can be written in different languages, such as English, Arabic, and Urdu, and handling each language according to its own rules is difficult. A great deal of sentiment analysis and classification research has been done for English, but only limited work has been carried out on regional languages such as Arabic, Urdu, and Hindi. In this paper, the Waikato Environment for Knowledge Analysis (WEKA) is used as a platform to run different classification models on Roman Urdu text. The review dataset was scraped from different automobile sites. The extracted Roman Urdu reviews, 1000 positive and 1000 negative, are saved as labeled examples in the WEKA attribute-relation file format (ARFF). Training is done on 80% of this data and the remaining 20% is used for testing; each model is evaluated on this split and its results are analyzed. The results show that Multinomial Naïve Bayes outperformed the Bagging, Deep Neural Network, Decision Tree, Random Forest, AdaBoost, k-NN, and SVM classifiers in terms of accuracy, precision, recall, and F-measure.
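The workflow summarized in the abstract (labeled reviews, an 80/20 train/test split, a Multinomial Naïve Bayes classifier, and accuracy, precision, recall, and F-measure) can be illustrated with a minimal sketch. The paper itself runs these steps in WEKA; the Python below is a from-scratch illustration and not the authors' pipeline. The ten toy reviews are invented for this example, and the whitespace tokenization, the Laplace smoothing value `alpha=1.0`, and all function names are assumptions.

```python
import math
from collections import Counter, defaultdict

# Invented toy reviews (NOT from the paper's dataset), interleaved pos/neg.
REVIEWS = [
    ("gari bohat achi hai", "pos"), ("gari bohat kharab hai", "neg"),
    ("engine zabardast hai", "pos"), ("engine bekar hai", "neg"),
    ("mileage achi hai", "pos"), ("mileage kharab hai", "neg"),
    ("drive bohat achi hai", "pos"), ("drive bohat bekar hai", "neg"),
    ("gari zabardast hai", "pos"), ("gari bekar hai", "neg"),
]

def tokenize(text):
    # Assumption: plain lowercase whitespace tokenization.
    return text.lower().split()

def train_multinomial_nb(examples, alpha=1.0):
    """Estimate log priors and Laplace-smoothed log likelihoods per class."""
    class_counts = Counter(label for _, label in examples)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    model = {"priors": {}, "likelihoods": {}}
    for label, n_docs in class_counts.items():
        denom = sum(word_counts[label].values()) + alpha * len(vocab)
        model["priors"][label] = math.log(n_docs / len(examples))
        model["likelihoods"][label] = {
            w: math.log((word_counts[label][w] + alpha) / denom) for w in vocab
        }
    return model

def predict(model, text):
    """Return the class with the highest log posterior; unseen words are skipped."""
    scores = {}
    for label, prior in model["priors"].items():
        likelihood = model["likelihoods"][label]
        scores[label] = prior + sum(
            likelihood[w] for w in tokenize(text) if w in likelihood
        )
    return max(scores, key=scores.get)

def evaluate(y_true, y_pred, positive="pos"):
    """Compute accuracy, precision, recall, and F-measure for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f1}

# 80/20 split: the first 80% of the examples train, the rest test.
split = int(0.8 * len(REVIEWS))
train, test = REVIEWS[:split], REVIEWS[split:]

model = train_multinomial_nb(train)
predictions = [predict(model, text) for text, _ in test]
metrics = evaluate([label for _, label in test], predictions)
print(predictions, metrics)
```

On a real dataset the examples would normally be shuffled (or split with stratification) before taking the 80% training portion; the fixed order here only keeps the toy example deterministic.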
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Khan, M., Malik, K. (2019). Sentiment Classification of Customer’s Reviews About Automobiles in Roman Urdu. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Advances in Information and Communication Networks. FICC 2018. Advances in Intelligent Systems and Computing, vol 887. Springer, Cham. https://doi.org/10.1007/978-3-030-03405-4_44
DOI: https://doi.org/10.1007/978-3-030-03405-4_44
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03404-7
Online ISBN: 978-3-030-03405-4
eBook Packages: Engineering, Engineering (R0)