Comparison of different classification algorithms to identify geographic origins of olive oils

  • Ozgur Gumus
  • Erkan Yasar
  • Z. Pinar Gumus (corresponding author)
  • Hasan Ertas
Original Article


Research on determining the geographic origins of olive oils has increased with consumer demand for authenticated olive oils. Classification algorithms, which are machine learning methods, can be employed for the authentication of olive oils. In this study, different classification algorithms were evaluated to identify the most accurate one for the authentication of Turkish olive oils. The BayesNet, Naive Bayes, Multilayer Perceptron, IBK, Kstar, SMO, Random Forest, J48, LWL, Logistic Regression, Simple Logistic, and LogitBoost algorithms were applied to 61 chemical analysis parameters of 49 olive oil samples from 6 different locations in Western Turkey. These 61 parameters were obtained from five different chemical analyses: stable carbon isotope ratio, trace elements, sterol composition, fatty acid methyl esters (FAMEs), and triacylglycerols (TAGs). In terms of these features, this is the most comprehensive study to determine the geographical origin of Turkish olive oils. The classification performance of the algorithms was compared using accuracy, specificity, and sensitivity metrics. Random Forest, BayesNet, and LogitBoost were found to be the best classification algorithms for the authentication of Turkish olive oils. Using the classification model developed in this study, the geographic origin of an unknown olive oil can be predicted with high accuracy. In addition, similar models can be developed to obtain useful information for the authentication of other food products.
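The three metrics named above can be illustrated with a short sketch. It shows one common way to derive overall accuracy, plus per-class sensitivity and specificity, from a multiclass confusion matrix by treating each class one-vs-rest. The function name `per_class_metrics` and the small 3-class matrix are hypothetical and chosen only for illustration; they are not the study's data, results, or software.

```python
from typing import List, Tuple


def per_class_metrics(
    confusion: List[List[int]],
) -> Tuple[float, List[float], List[float]]:
    """Return (accuracy, sensitivity per class, specificity per class).

    confusion[i][j] counts samples whose true class is i and whose
    predicted class is j. Each class is scored one-vs-rest.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n_classes))

    sensitivity: List[float] = []
    specificity: List[float] = []
    for k in range(n_classes):
        tp = confusion[k][k]                                      # class k, predicted k
        fn = sum(confusion[k]) - tp                               # class k, predicted other
        fp = sum(confusion[i][k] for i in range(n_classes)) - tp  # other, predicted k
        tn = total - tp - fn - fp                                 # other, predicted other
        sensitivity.append(tp / (tp + fn) if (tp + fn) else 0.0)
        specificity.append(tn / (tn + fp) if (tn + fp) else 0.0)

    return correct / total, sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical confusion matrix for three origins (illustrative values only).
    example = [
        [8, 1, 0],
        [1, 7, 1],
        [0, 0, 9],
    ]
    accuracy, sens, spec = per_class_metrics(example)
    print(f"accuracy = {accuracy:.3f}")
    for k, (se, sp) in enumerate(zip(sens, spec)):
        print(f"class {k}: sensitivity = {se:.3f}, specificity = {sp:.3f}")
```

With several origin classes, sensitivity and specificity are per-class quantities, so a single reported value is usually an average over classes. Whether the study used a plain or a class-weighted average is not stated in the abstract.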


Machine learning, Classification algorithms, Authentication, Geographic origin, Olive oil



This study was supported by Ege University, Council of Scientific Research Projects (Project No. 14-MUH-063 BAP project). The chemical analyses in this work were supported by the Ege University Drug Research and Pharmacokinetic Development and Applied Center (ARGEFAR).

Compliance with ethical standards

Conflict of interest

The authors have declared that they have no conflict of interest.

Supplementary material

Supplementary material 1: 13197_2019_4189_MOESM1_ESM.docx (DOCX, 155 kb)
Supplementary material 2: 13197_2019_4189_MOESM2_ESM.docx (DOCX, 43 kb)



Copyright information

© Association of Food Scientists & Technologists (India) 2019

Authors and Affiliations

  1. Department of Computer Engineering, Faculty of Engineering, Ege University, Bornova, Izmir, Turkey
  2. Central Research Testing and Analysis Laboratory Research and Application Center (EGE-MATAL), Ege University, Bornova, Izmir, Turkey
  3. Department of Chemistry, Faculty of Science, Ege University, Bornova, Izmir, Turkey
