Abstract
Wine is a broad and increasingly popular field of study. However, little data science and data mining research has been applied to this topic to benefit wine producers, distributors, and consumers. According to the American Association of Wine Economists, “Who is a reliable wine judge?” and “Are wine judges consistent?” are typical questions that beg for formal statistical answers.
This paper proposes using white-box classification algorithms to understand wine judges and to evaluate their consistency when they score a wine as 90+ or 90−. Three white-box classification algorithms, Naïve Bayes, Decision Tree, and K-nearest neighbors, are applied to wine sensory data derived from professional wine reviews. Each algorithm can reveal how the judges make their decisions. The extracted information is also useful to wine producers, distributors, and consumers. The data set includes 1000 wines, 500 scored as 90+ points (positive class) and 500 scored as 90− points (negative class). Five-fold cross-validation is used to assess the performance of the classification algorithms. Higher prediction accuracy indicates higher consistency of the wine judge. The best prediction accuracy we obtained from a white-box classification algorithm is 85.7%, from a modified version of the Naïve Bayes algorithm.
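The evaluation protocol described above (a balanced two-class data set, a Naïve Bayes classifier, and 5-fold cross-validation) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the data are synthetic binary attribute vectors, not the wine review data; the classifier is a plain Bernoulli Naïve Bayes with add-one smoothing, not the modified version reported in the paper; and the function names (`make_synthetic_wines`, `train_nb`, `predict_nb`, `stratified_folds`, `cross_validate`) are hypothetical.

```python
import math
import random


def make_synthetic_wines(n_per_class=500, n_attrs=20, seed=0):
    """Hypothetical stand-in data: binary attribute vectors, label 1 (90+) or 0 (90-)."""
    rng = random.Random(seed)
    # Assumed per-class probabilities that each attribute appears in a review.
    p_pos = [rng.uniform(0.2, 0.8) for _ in range(n_attrs)]
    p_neg = [rng.uniform(0.2, 0.8) for _ in range(n_attrs)]
    data = []
    for label, probs in ((1, p_pos), (0, p_neg)):
        for _ in range(n_per_class):
            x = [1 if rng.random() < p else 0 for p in probs]
            data.append((x, label))
    return data


def train_nb(rows, n_attrs):
    """Fit a Bernoulli Naive Bayes model with add-one (Laplace) smoothing."""
    counts = {0: [0] * n_attrs, 1: [0] * n_attrs}
    totals = {0: 0, 1: 0}
    for x, y in rows:
        totals[y] += 1
        for j, v in enumerate(x):
            counts[y][j] += v
    n = len(rows)
    model = {}
    for y in (0, 1):
        prior = (totals[y] + 1) / (n + 2)
        # P(attribute j present | class y), smoothed so it is never 0 or 1.
        likes = [(counts[y][j] + 1) / (totals[y] + 2) for j in range(n_attrs)]
        model[y] = (prior, likes)
    return model


def predict_nb(model, x):
    """Return the class with the highest log posterior score."""
    best_label, best_score = None, -math.inf
    for y, (prior, likes) in model.items():
        score = math.log(prior)
        for v, p in zip(x, likes):
            score += math.log(p if v else 1.0 - p)
        if score > best_score:
            best_label, best_score = y, score
    return best_label


def stratified_folds(data, k=5, seed=0):
    """Split the data into k folds, keeping the class balance in each fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in (0, 1):
        rows = [r for r in data if r[1] == label]
        rng.shuffle(rows)
        for i, r in enumerate(rows):
            folds[i % k].append(r)
    return folds


def cross_validate(data, n_attrs, k=5):
    """Train on k-1 folds, test on the held-out fold, and return per-fold accuracy."""
    folds = stratified_folds(data, k)
    accs = []
    for i in range(k):
        test = folds[i]
        train = [r for j, f in enumerate(folds) if j != i for r in f]
        model = train_nb(train, n_attrs)
        correct = sum(predict_nb(model, x) == y for x, y in test)
        accs.append(correct / len(test))
    return accs


if __name__ == "__main__":
    wines = make_synthetic_wines()
    accs = cross_validate(wines, n_attrs=20)
    print("fold accuracies:", [round(a, 3) for a in accs])
    print("mean accuracy:", round(sum(accs) / len(accs), 3))
```

The smoothed per-class attribute probabilities stored in the model are what make this kind of classifier "white box": they can be read directly to see which attributes push a prediction toward 90+ or 90−. The accuracies printed for the synthetic data say nothing about the paper's results.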
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Chen, B., Le, H., Rhodes, C., Che, D. (2016). Understanding the Wine Judges and Evaluating the Consistency Through White-Box Classification Algorithms. In: Perner, P. (ed.) Advances in Data Mining. Applications and Theoretical Aspects. ICDM 2016. Lecture Notes in Computer Science, vol 9728. Springer, Cham. https://doi.org/10.1007/978-3-319-41561-1_18
DOI: https://doi.org/10.1007/978-3-319-41561-1_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41560-4
Online ISBN: 978-3-319-41561-1
eBook Packages: Computer Science, Computer Science (R0)