Abstract
Sentiment analysis is the computational task of identifying positive or negative sentiment expressed in a piece of text. In this paper, we present a sentiment analysis system for Roman Urdu. For this task, we gathered a Roman Urdu dataset of 779 reviews from five domains: Drama, Movie/Telefilm, Mobile Reviews, Politics, and Miscellaneous (Misc). We selected unigram, bigram, and uni-bigram (unigram + bigram) features and used five different classifiers to compute accuracies before and after feature reduction. In total, thirty-six (36) experiments were performed; they showed that Naïve Bayes (NB) and Logistic Regression (LR) outperformed the other classifiers on this task. The overall results also improved after feature reduction.
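The experimental grid described in the abstract (n-gram feature sets crossed with classifiers, each run before and after feature reduction) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the use of scikit-learn, chi-squared selection as the feature reduction step, the value of `k`, and the helper names `build_pipeline` and `run_experiments` are all assumptions, since the abstract does not specify them. Only the two classifiers named in the abstract (NB and LR) are included, and the toy reviews are invented for illustration rather than taken from the 779-review corpus.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Invented toy reviews (NOT from the paper's corpus); 1 = positive, 0 = negative.
reviews = [
    "yeh drama bohat acha hai",
    "mobile ki battery bohat achi hai",
    "film bohat zabardast thi",
    "kahani acha laga mujhe",
    "yeh drama bilkul bakwas hai",
    "mobile ka camera bohat kharab hai",
    "film bilkul bekar thi",
    "kahani bakwas lagi mujhe",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# The three feature sets named in the abstract, as CountVectorizer n-gram ranges.
FEATURE_SETS = {"unigram": (1, 1), "bigram": (2, 2), "uni-bigram": (1, 2)}

# Only the two classifiers the abstract names; the other three are not listed there.
CLASSIFIERS = {
    "NB": MultinomialNB,
    "LR": lambda: LogisticRegression(max_iter=1000),
}


def build_pipeline(ngram_range, make_clf, k=None):
    """Vectorize, optionally keep the k best features (chi-squared, assumed), classify."""
    steps = [CountVectorizer(ngram_range=ngram_range)]
    if k is not None:
        steps.append(SelectKBest(chi2, k=k))
    steps.append(make_clf())
    return make_pipeline(*steps)


def run_experiments(texts, y, k=5):
    """Fit every (feature set, classifier, reduced?) combination.

    Scores are training accuracy on the toy data, used only to show the
    shape of the grid; a real study would use held-out data or cross-validation.
    """
    results = {}
    for feat_name, ngram_range in FEATURE_SETS.items():
        for clf_name, make_clf in CLASSIFIERS.items():
            for reduced in (False, True):
                model = build_pipeline(ngram_range, make_clf, k if reduced else None)
                model.fit(texts, y)
                results[(feat_name, clf_name, reduced)] = model.score(texts, y)
    return results


results = run_experiments(reviews, labels)
for (feat_name, clf_name, reduced), acc in sorted(results.items()):
    stage = "after reduction" if reduced else "before reduction"
    print(f"{feat_name:10s} {clf_name} {stage}: {acc:.2f}")
```

Keeping the vectorizer, the optional selector, and the classifier in one pipeline means the selected features are always derived from the same data the classifier is fitted on, which makes the before/after comparison a single-parameter change.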
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Mehmood, K., Essam, D., Shafi, K. (2019). Sentiment Analysis System for Roman Urdu. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Intelligent Computing. SAI 2018. Advances in Intelligent Systems and Computing, vol 858. Springer, Cham. https://doi.org/10.1007/978-3-030-01174-1_3
DOI: https://doi.org/10.1007/978-3-030-01174-1_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01173-4
Online ISBN: 978-3-030-01174-1
eBook Packages: Intelligent Technologies and Robotics (R0)