Prediction of Different Types of Wine Using Nonlinear and Probabilistic Classifiers

  • Satyabrata Aich
  • Mangal Sain
  • Jin-Han Yoon
Part of the Studies in Computational Intelligence book series (SCI, volume 771)


In the past few years, machine-learning techniques have attracted considerable attention across disciplines. Many of these techniques produce highly accurate results, which has led a majority of scientists to adopt them for predictive analytics. Few studies have applied different classifiers to wine data, and so far none has compared the performance metrics of different classifiers on different feature sets for predicting quality among types of wine. This chapter proposes an approach that combines a recursive feature elimination (RFE) algorithm for feature selection with nonlinear and probabilistic classifiers. Accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) are compared for each classifier on the original feature sets (OFS) and on the reduced feature sets (RFS). Accuracy ranges from 97.61% to 99.69% across the different feature sets. This analysis will help wine experts differentiate wines according to their features.
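The five performance metrics named in the abstract all derive from the counts in a confusion matrix. The following is a minimal sketch, not the chapter's implementation: the function name `binary_metrics`, the "red"/"white" labels, and the toy predictions are assumptions made purely for illustration.

```python
def binary_metrics(y_true, y_pred, positive):
    """Return accuracy, sensitivity, specificity, PPV and NPV,
    treating `positive` as the positive class and every other label as negative."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # true positive
            else:
                fp += 1  # false positive
        else:
            if truth == positive:
                fn += 1  # false negative
            else:
                tn += 1  # true negative

    def ratio(num, den):
        # Guard against empty denominators (e.g. no positive predictions).
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "ppv": ratio(tp, tp + fp),          # positive predictive value
        "npv": ratio(tn, tn + fn),          # negative predictive value
    }


# Hypothetical labels and predictions, invented for this example only.
y_true = ["red", "red", "red", "white", "white", "white", "white", "white"]
y_pred = ["red", "red", "white", "white", "white", "white", "white", "red"]

m = binary_metrics(y_true, y_pred, positive="red")
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

Comparing an original and a reduced feature set would then amount to calling such a function once on each model's predictions and placing the two result dictionaries side by side.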


Keywords: Machine learning, Feature selection, Classifiers, Performance metrics, Prediction



Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. Department of Computer Engineering, Inje University, Gimhae, South Korea
  2. Department of Computer Engineering, Dongseo University, Busan, South Korea
  3. Daedong College, Busan, South Korea
