Cultivar Prediction of Target Consumer Class Using Feature Selection with Machine Learning Classification

  • Shyamala Devi Munisamy
  • Suguna Ramadass
  • Aparna Shashikant Joshi
  • Mahesh B. Lonare
Conference paper
Part of the Learning and Analytics in Intelligent Systems book series (LAIS, volume 3)


Industries increasingly rely on cultivar prediction of customer classes to promote their products and increase profit. Predicting the customer class manually is time-consuming and may be inaccurate. This paper therefore proposes the use of machine learning algorithms to predict the customer cultivar of Wine Access. It uses the multivariate Wine data set from the UCI Machine Learning Repository, which is subjected to three feature selection methods: Random Forest, forward feature selection, and backward elimination. The dimensionality-reduced dataset produced by each method is then processed with several classifiers: Logistic Regression, K-Nearest Neighbor (KNN), Random Forest, Support Vector Machine (SVM), Naive Bayes, Decision Tree, and Kernel SVM. Accurate cultivar prediction is achieved in two steps. First, dimensionality reduction with the three feature selection methods yields a reasonable set of components for predicting the dependent variable, cultivar. Second, the customer class is predicted with each classifier so that their accuracies can be compared. The performance analysis is implemented as Python scripts in the Spyder IDE under Anaconda Navigator. Prediction quality is assessed using Precision, Recall, F-score, and Accuracy. Experimental results show that a maximum accuracy of 97.2% is obtained for Random Projection with the SVM, Decision Tree, and Random Forest classifiers.
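The four evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' script: the function names `per_class_metrics` and `accuracy` are hypothetical, and the label vectors are made-up values chosen only to show the arithmetic. They assume a 36-sample test set with a single misclassification, which would give 35/36, or about 97.2% accuracy. The abstract does not state the test-set size, so that figure is an assumption.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f_score)} for every label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    tp = Counter()  # predicted c and actually c
    fp = Counter()  # predicted c but actually another class
    fn = Counter()  # actually c but predicted as another class
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1
            fn[t] += 1
    metrics = {}
    for c in labels:
        precision = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        recall = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f_score = (2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
        metrics[c] = (precision, recall, f_score)
    return metrics


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels, NOT data from the paper: 36 test samples over
# three cultivars, with one class-2 sample misclassified as class 3.
y_true = [1] * 12 + [2] * 14 + [3] * 10
y_pred = [1] * 12 + [2] * 13 + [3] * 11

scores = per_class_metrics(y_true, y_pred)
acc = accuracy(y_true, y_pred)

for label, (p, r, f) in scores.items():
    print(f"cultivar {label}: precision={p:.3f} recall={r:.3f} f_score={f:.3f}")
print(f"accuracy={acc:.3f}")
```

In this sketch the single error lowers recall for cultivar 2 and precision for cultivar 3, while overall accuracy stays high. That is why per-class Precision, Recall, and F-score are worth reporting alongside Accuracy.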


Machine learning, Dimensionality reduction, Feature selection, KNN, SVM, Naïve Bayes, Decision Tree and Random Forest



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Shyamala Devi Munisamy (1, corresponding author)
  • Suguna Ramadass (1)
  • Aparna Shashikant Joshi (1)
  • Mahesh B. Lonare (1)
  1. Vel Tech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology, Chennai, India
