Sentiment Analysis System for Roman Urdu

Mehmood, Khawar; Essam, Daryl; Shafi, Kamran

doi:10.1007/978-3-030-01174-1_3

Khawar Mehmood¹⁷,
Daryl Essam¹⁷ &
Kamran Shafi¹⁷

Part of the book series: Advances in Intelligent Systems and Computing ((AISC,volume 858))

Included in the following conference series:

Science and Information Conference

1350 Accesses
15 Citations

Abstract

Sentiment analysis is a computational process to identify positive or negative sentiments expressed in a piece of text. In this paper, we present a sentiment analysis system for Roman Urdu. For this task, we gathered Roman Urdu data of 779 reviews for five different domains, i.e., Drama, Movie/Telefilm, Mobile Reviews, Politics, and Miscellaneous (Misc). We selected unigram, bigram and uni-bigram (unigram + bigram) features for this task and used five different classifiers to compute accuracies before and after feature reduction. In total, thirty-six (36) experiments were performed, and they established that Naïve Bayes (NB) and Logistic Regression (LR) performed better than the rest of the classifiers on this task. It was also observed that the overall results were improved after feature reduction.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 169.00; Price excludes VAT (USA)

Softcover Book: USD 219.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Wan, X.: Using bilingual knowledge and ensemble techniques for unsupervised Chinese sentiment analysis. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 553–561. Association for Computational Linguistics (2008)
Google Scholar
Pak, A., Paroubek, P.: Twitter as a corpus for sentiment analysis and opinion mining. In: LREC, vol. 10, no. 2010 (2010)
Google Scholar
Simons, G.F., Fennig, C.D. (eds.) Ethnologue: Languages of the World, Twentieth edition. SIL International, Dallas (2017). http://www.ethnologue.com
Feldman, Ronen: Techniques and applications for sentiment analysis. Commun. ACM 56(4), 82–89 (2013)
Article Google Scholar
Tatemura, J.: Virtual reviewers for collaborative exploration of movie reviews. In: Proceedings of the 5th International Conference on Intelligent User Interfaces, pp. 272–275. ACM (2000)
Google Scholar
Pang, B., Lee, L., Vaithyanathan, S.: Thumbs up? Sentiment classification using machine learning techniques. In: Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing, vol. 10, pp. 79–86. Association for Computational Linguistics (2002)
Google Scholar
Turney, P.D.: Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews. In: Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, pp. 417–424. Association for Computational Linguistics (2002)
Google Scholar
Taboada, M., Brooke, J., Tofiloski, M., Voll, K., Stede, M.: Lexicon-based methods for sentiment analysis. Comput. Linguist. 37(2), 267–307 (2011)
Article Google Scholar
Alessia, D., Ferri, F., Grifoni, P., Guzzo, T.: Approaches, tools and applications for sentiment analysis implementation. Int. J. Comput. Appl. 125(3) (2015)
Google Scholar
Medhat, W., Hassan, A., Korashy, H.: Sentiment analysis algorithms and applications: a survey. Ain Shams Eng. J. 5(4), 1093–1113 (2014)
Article Google Scholar
Yessenalina, A., Yue, Y., Cardie, C.: Multi-level structured models for document-level sentiment classification. In: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pp. 1046–1056. Association for Computational Linguistics (2010)
Google Scholar
Moraes, R., Valiati, J.F., Neto, W.P.G.: Document-level sentiment classification: an empirical comparison between SVM and ANN. Expert Syst. Appl. 40(2), 621–633 (2013)
Article Google Scholar
Zhang, C., Zeng, D., Li, J., Wang, F.Y., Zuo, W.: Sentiment analysis of Chinese documents: from sentence to document level. J. Assoc. Inf. Sci. Technol. 60(12), 2474–2487 (2009)
Article Google Scholar
Abbasi, A., Chen, H., Salem, A.: Sentiment analysis in multiple languages: feature selection for opinion classification in web forums. ACM Trans. Inf. Syst. (TOIS) 26(3), 12 (2008)
Article Google Scholar
Singh, V.K., Piryani, R., Uddin, A., Waila, P.: Sentiment analysis of movie reviews: a new feature-based heuristic for aspect-level sentiment classification. In: 2013 International Multi-Conference on Automation, Computing, Communication, Control and Compressed Sensing (iMac4s), pp. 712–717. IEEE (2013)
Google Scholar
Socher, R., Perelygin, A., Wu, J.Y., Chuang, J., Manning, C.D., Ng, A.Y., Potts, C.: Recursive deep models for semantic compositionality over a sentiment treebank. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), vol. 1631, p. 1642 (2013)
Google Scholar
Wilson, T., Wiebe, J., Hoffmann, P.: Recognizing contextual polarity in phrase-level sentiment analysis. In: Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, pp. 347–354. Association for Computational Linguistics (2005)
Google Scholar
Agarwal, A., Xie, B., Vovsha, I., Rambow, O., Passonneau, R.: Sentiment analysis of twitter data. In: Proceedings of the Workshop on Languages in Social Media, pp. 30–38. Association for Computational Linguistics (2011)
Google Scholar
Xu, T., Peng, Q., Cheng, Y.: Identifying the semantic orientation of terms using S-HAL for sentiment analysis. Knowl. Based Syst. 35, 279–289 (2012)
Article Google Scholar
Yu, L.C., Wu, J.L., Chang, P.C., Chu, H.S.: Using a contextual entropy model to expand emotion words and their intensity for the sentiment classification of stock market news. Knowl. Based Syst. 41, 89–97 (2013)
Article Google Scholar
Hagenau, M., Liebmann, M., Neumann, D.: Automated news reading: stock price prediction based on financial news using context-capturing features. Decis. Support Syst. 55(3), 685–697 (2013)
Article Google Scholar
Maks, I., Vossen, P.: A lexicon model for deep sentiment analysis and opinion mining applications. Decis. Support Syst. 53(4), 680–688 (2012)
Article Google Scholar
Malik, M.K.: Urdu named entity recognition and classification system using artificial neural network. ACM Trans. Asian Low-Resour. Lang. Inf. Process. (TALLIP) 17(1), 2 (2017)
Article MathSciNet Google Scholar
Malik, M.K., Sarwar, S.M.: Urdu named entity recognition system using hidden Markov model. Pak. J. Eng. Appl. Sci. (2017)
Google Scholar
Malik, Muhammad Kamran, Sarwar, Syed Mansoor: Named entity recognition system for postpositional languages: urdu as a case study. Int. J. Adv. Comput. Sci. Appl. 7(10), 141–147 (2016)
Google Scholar
Usman, Muhammad, Shafique, Zunaira, Ayub, Saba, Malik, Kamran: Urdu text classification using majority voting. Int. J. Adv. Comput. Sci. Appl. 7(8), 265–273 (2016)
Google Scholar
Ali, A., Hussain, A., Malik, M.K.: Model for english-urdu statistical machine translation. World Appl. Sci. 24, 1362–1367 (2013)
Google Scholar
Shahzadi, S., Fatima, B., Malik, K., Sarwar, S.M.: Urdu word prediction system for mobile phones. World Appl. Sci. J. 22(1), 113–120 (2013)
Google Scholar
Karamat, N., Malik, K., Hussain, S.: Improving generation in machine translation by separating syntactic and morphological processes. In: Frontiers of Information Technology (FIT), pp. 195–200. IEEE (2011)
Google Scholar
Siddiq, S., Hussain, S., Ali, A., Malik, K., Ali, W.: Urdu noun phrase chunking-hybrid approach. In: 2010 International Conference on Asian Language Processing (IALP), pp. 69–72. IEEE (2010)
Google Scholar
Malik, M.K., Ali, A., Siddiq, S.: Behavior of Word ‘kaa’ in Urdu language. In: 2010 International Conference on Asian Language Processing (IALP), pp. 23–26. IEEE (2010)
Google Scholar
Ali, W., Malik, M.K., Hussain, S., Siddiq, S., Ali, A.: Urdu noun phrase chunking: HMM based approach. In: 2010 International Conference on Educational and Information Technology (ICEIT), vol. 2, pp. V2-494. IEEE (2010)
Google Scholar
Ali, A., Siddiq, S., Malik, M.K.: Development of parallel corpus and english to urdu statistical machine translation. Int. J. Eng. Technol. IJET-IJENS 10, 31–33 (2010)
Google Scholar
Malik, K., Ahmed, T., Sulger, S., Bögel, T., Gulzar, A., Raza, G., Hussain, S., Butt, M.: Transliterating Urdu for a broad-coverage Urdu/Hindi LFG grammar. In: Seventh International Conference on Language Resources and Evaluation, LREC 2010, pp. 2921–2927 (2010)
Google Scholar
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
Google Scholar

Download references

Author information

Authors and Affiliations

University of New South Wales (UNSW), Kensington, Australia
Khawar Mehmood, Daryl Essam & Kamran Shafi

Authors

Khawar Mehmood
View author publications
You can also search for this author in PubMed Google Scholar
Daryl Essam
View author publications
You can also search for this author in PubMed Google Scholar
Kamran Shafi
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Khawar Mehmood .

Editor information

Editors and Affiliations

Faculty of Science and Engineering, Department of Information Science, Saga University, Honjo, Saga, Japan
Kohei Arai
The Science and Information (SAI) Organization, Bradford, West Yorkshire, UK
Supriya Kapoor
The Science and Information (SAI) Organization, Bradford, West Yorkshire, UK
Rahul Bhatia

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Mehmood, K., Essam, D., Shafi, K. (2019). Sentiment Analysis System for Roman Urdu. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Intelligent Computing. SAI 2018. Advances in Intelligent Systems and Computing, vol 858. Springer, Cham. https://doi.org/10.1007/978-3-030-01174-1_3

Download citation

DOI: https://doi.org/10.1007/978-3-030-01174-1_3
Published: 02 November 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01173-4
Online ISBN: 978-3-030-01174-1
eBook Packages: Intelligent Technologies and RoboticsIntelligent Technologies and Robotics (R0)

Publish with us

Policies and ethics