Sentiment Analysis System for Roman Urdu

  • Conference paper
Intelligent Computing (SAI 2018)

Part of the book series: Advances in Intelligent Systems and Computing ((AISC,volume 858))

Abstract

Sentiment analysis is the computational task of identifying positive or negative sentiment expressed in a piece of text. In this paper, we present a sentiment analysis system for Roman Urdu. We gathered 779 Roman Urdu reviews from five domains: Drama, Movie/Telefilm, Mobile Reviews, Politics, and Miscellaneous (Misc). We selected unigram, bigram, and uni-bigram (unigram + bigram) features and used five different classifiers to compute accuracies before and after feature reduction. In total, thirty-six (36) experiments were performed; they showed that Naïve Bayes (NB) and Logistic Regression (LR) performed better than the other classifiers on this task. Overall results also improved after feature reduction.
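
The abstract names the feature sets (unigram, bigram, uni-bigram) and one of the better-performing classifiers (Naïve Bayes), but it does not describe the tokenisation, the feature-reduction method, or the implementation. The following is a minimal sketch under stated assumptions, not the authors' system: it assumes whitespace tokenisation, uses a document-frequency cutoff as a stand-in for feature reduction, and implements a small multinomial Naïve Bayes with add-one smoothing from scratch. The names `ngram_features`, `reduce_features`, `NaiveBayes`, and the four toy reviews in `TRAIN` are invented for illustration.

```python
import math
from collections import Counter


def ngram_features(text, mode="uni-bigram"):
    """Count n-grams in a lower-cased, whitespace-tokenised review (assumed tokenisation)."""
    tokens = text.lower().split()
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    if mode == "unigram":
        grams = tokens
    elif mode == "bigram":
        grams = bigrams
    elif mode == "uni-bigram":
        grams = tokens + bigrams
    else:
        raise ValueError(f"unknown mode: {mode}")
    return Counter(grams)


def reduce_features(docs, min_df=2):
    """Assumed reduction rule: keep features found in at least min_df documents."""
    doc_freq = Counter(feat for doc in docs for feat in doc)
    keep = {feat for feat, n in doc_freq.items() if n >= min_df}
    reduced = [Counter({f: c for f, c in doc.items() if f in keep}) for doc in docs]
    return reduced, keep


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.vocab = {feat for doc in docs for feat in doc}
        self.counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.counts[label].update(doc)
        self.totals = {c: sum(self.counts[c].values()) for c in self.classes}
        self.log_prior = {c: math.log(labels.count(c) / len(labels)) for c in self.classes}
        return self

    def predict(self, doc):
        vocab_size = len(self.vocab)

        def score(c):
            total = self.log_prior[c]
            for feat, k in doc.items():
                if feat in self.vocab:  # features unseen in training are ignored
                    p = (self.counts[c][feat] + 1) / (self.totals[c] + vocab_size)
                    total += k * math.log(p)
            return total

        return max(self.classes, key=score)


# Invented toy reviews, for illustration only (not from the paper's 779-review dataset).
TRAIN = [
    ("yeh drama bohat acha hai", "pos"),
    ("mobile bohat acha hai", "pos"),
    ("yeh movie bohat buri hai", "neg"),
    ("drama bohat bura tha", "neg"),
]

if __name__ == "__main__":
    docs = [ngram_features(text) for text, _ in TRAIN]
    labels = [label for _, label in TRAIN]
    reduced, kept = reduce_features(docs, min_df=2)
    model = NaiveBayes().fit(reduced, labels)
    print(len(kept), model.predict(ngram_features("movie acha hai")))
```

Switching `mode` between `"unigram"`, `"bigram"`, and `"uni-bigram"`, and training with or without `reduce_features`, mirrors the before/after comparison the abstract describes, though on real data the reduction criterion and classifier settings would need to follow the paper itself.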

Author information

Corresponding author

Correspondence to Khawar Mehmood.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Mehmood, K., Essam, D., Shafi, K. (2019). Sentiment Analysis System for Roman Urdu. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Intelligent Computing. SAI 2018. Advances in Intelligent Systems and Computing, vol 858. Springer, Cham. https://doi.org/10.1007/978-3-030-01174-1_3
