Abstract
There is a general consensus that organic food can contribute to a healthy diet. Nevertheless, large-scale production of organic food is difficult: crops require intensive care because of the many pests, fungi, and diseases that can wipe out an entire harvest. Researchers evaluating food quality are often concerned with the use of pesticides, antibiotics, and hormones in agriculture, as well as genetically modified organisms (GMOs) and additives in food processing. A major challenge in this context is therefore how to obtain products that are free of these toxic elements. In this review, we give an overview of research on chemometric tools for extracting variables from several types of food, and on the use of data mining techniques and statistical analysis to classify samples grown in organic and conventional systems. The expansion of the organic sector, driven by growing demand and high prices, could lead to fraud. Mechanisms that regulators and supervisory bodies can use, or that can even be installed in supermarkets so that customers can verify products themselves, may deter this type of deception. Recent research has shown that chemometric methods combined with data mining algorithms or statistical methods can successfully classify products grown in organic and conventional systems.
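As a rough illustration of the kind of workflow the abstract describes, the sketch below classifies a sample as organic or conventional from a few chemometric variables using a k-nearest-neighbor vote. It is only a minimal sketch under stated assumptions: the three features stand for hypothetical element concentrations, the numbers in `TRAINING` are made up for illustration and are not measurements, and the names `column_stats`, `scale`, and `knn_classify` are introduced here and do not come from the reviewed studies.

```python
import math
from collections import Counter

# Made-up concentrations of three hypothetical elements (arbitrary units).
# These numbers are illustrative only; they are not measured data.
TRAINING = [
    ((0.9, 4.8, 0.20), "organic"),
    ((1.1, 5.2, 0.25), "organic"),
    ((1.0, 5.0, 0.22), "organic"),
    ((2.1, 3.1, 0.60), "conventional"),
    ((1.9, 2.9, 0.55), "conventional"),
    ((2.0, 3.0, 0.58), "conventional"),
]


def column_stats(rows):
    """Return a (mean, standard deviation) pair for each feature column."""
    stats = []
    for column in zip(*rows):
        mean = sum(column) / len(column)
        variance = sum((v - mean) ** 2 for v in column) / len(column)
        # Fall back to 1.0 so a constant column does not divide by zero.
        stats.append((mean, math.sqrt(variance) or 1.0))
    return stats


def scale(row, stats):
    """Z-score one sample using statistics taken from the training set."""
    return tuple((v - mean) / std for v, (mean, std) in zip(row, stats))


def knn_classify(sample, training=TRAINING, k=3):
    """Label a sample by majority vote among its k nearest training samples."""
    stats = column_stats([features for features, _ in training])
    scaled_sample = scale(sample, stats)
    neighbours = sorted(
        training,
        key=lambda item: math.dist(scale(item[0], stats), scaled_sample),
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    print(knn_classify((1.0, 5.1, 0.21)))
    print(knn_classify((2.0, 3.0, 0.60)))
```

Scaling each variable before computing distances keeps a variable with a large numeric range from dominating the vote. In practice, a classifier like this would be evaluated on held-out samples rather than on the data used to fit it.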
Acknowledgements
The authors would like to thank the editor and anonymous reviewers whose valuable comments and feedback have helped us to improve the content and presentation of the paper.
Ethics declarations
Conflict of Interest
Author Márcio Lima declares that he has no conflict of interest. Author Rommel Barbosa declares that he has no conflict of interest.
Ethical Approval
This article does not contain any studies with human participants or animals performed by either of the authors.
Informed Consent
Not applicable.
Cite this article
de Lima, M.D., Barbosa, R. Methods of Authentication of Food Grown in Organic and Conventional Systems Using Chemometrics and Data Mining Algorithms: a Review. Food Anal. Methods 12, 887–901 (2019). https://doi.org/10.1007/s12161-018-01413-3
DOI: https://doi.org/10.1007/s12161-018-01413-3