A Novel Feature Selection Method Based on Genetic Algorithm for Opinion Mining of Social Media Reviews

  • Savita Sangam
  • Subhash Shinde
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 835)


Individuals and business organizations now routinely use social media to share opinions about products and services. Consumers are keen to share their views on particular products or commodities, which generates a large amount of unstructured social media data. Such text data is accumulating steadily in many areas, including automated business, education, health care, and show business. Opinion mining, a subfield of text mining, deals with mining review text and classifying the opinions or sentiments it expresses as positive or negative. This paper develops a framework for opinion mining. The framework includes a novel feature selection method called Most Persistent Feature Selection (MPFS) and a genetic algorithm (GA) based technique for optimizing the resulting feature set. MPFS uses the information gain of the features in the review documents. The feature set it produces is then optimized with the GA to obtain the most effective feature set for sentiment classification. Finally, a Support Vector Machine (SVM) classifier, trained on features from the proposed selection and optimization method, classifies the sentiment of the review text. The resulting classifier models show acceptable accuracy when compared with other existing models.
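The abstract names the building blocks (information gain, a GA over the feature set, and an SVM classifier) but does not give the details of MPFS. The following is therefore only a minimal sketch of the general idea, not the authors' method. It ranks terms by standard information gain and then runs a generic GA over bit masks of those terms. The toy reviews, the function names, the GA settings, and the fitness function are all assumptions made for illustration. In particular, a simple term-voting classifier scored on the training data stands in for the SVM so that the example needs only the Python standard library.

```python
import math
import random

# Toy labelled reviews (hypothetical data, not from the paper): 1 = positive, 0 = negative.
DOCS = [
    ("great phone love it", 1), ("love the great battery", 1), ("great screen", 1),
    ("bad phone hate it", 0), ("hate the bad battery", 0), ("bad screen", 0),
]

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    h = 0.0
    for c in set(labels):
        p = labels.count(c) / n
        h -= p * math.log2(p)
    return h

def information_gain(term, docs):
    """Information gain of the binary 'term is present' feature w.r.t. the label."""
    labels = [y for _, y in docs]
    present = [y for text, y in docs if term in text.split()]
    absent = [y for text, y in docs if term not in text.split()]
    n = len(docs)
    conditional = len(present) / n * entropy(present) + len(absent) / n * entropy(absent)
    return entropy(labels) - conditional

def rank_terms(docs):
    """All vocabulary terms, highest information gain first."""
    vocab = sorted({w for text, _ in docs for w in text.split()})
    return sorted(vocab, key=lambda t: information_gain(t, docs), reverse=True)

def fitness(mask, terms, docs):
    """Stand-in fitness (assumption): training accuracy of a term-voting
    classifier on the selected terms, minus a small penalty per selected term.
    The paper uses an SVM here instead."""
    chosen = [t for t, bit in zip(terms, mask) if bit]
    if not chosen:
        return 0.0
    polarity = {}
    for t in chosen:
        pos = sum(1 for text, y in docs if y == 1 and t in text.split())
        neg = sum(1 for text, y in docs if y == 0 and t in text.split())
        polarity[t] = pos - neg
    correct = 0
    for text, y in docs:
        score = sum(polarity.get(w, 0) for w in text.split())
        correct += int((score > 0) == (y == 1))
    return correct / len(docs) - 0.01 * len(chosen)

def genetic_search(terms, docs, pop_size=20, generations=30, p_mut=0.1, seed=0):
    """Generic GA over feature bit masks; assumes at least two candidate terms."""
    rng = random.Random(seed)
    n = len(terms)
    # Start from the full feature set plus random masks.
    pop = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, terms, docs), reverse=True)
        parents = pop[: pop_size // 2]      # keep the better half unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)       # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]  # bit-flip mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda m: fitness(m, terms, docs))

if __name__ == "__main__":
    terms = rank_terms(DOCS)
    best = genetic_search(terms, DOCS)
    print([t for t, bit in zip(terms, best) if bit])
```

Because the better half of each generation is carried over unchanged and the initial population contains the full feature set, the returned mask never scores worse than using every candidate term. In a realistic setting the fitness would more plausibly be the cross-validated accuracy of an SVM trained on the selected features.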


Feature selection, Genetic algorithm, Opinion mining, Sentiment classification



Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. University of Mumbai, Mumbai, India
