Bio-Chemical Data Classification by Dissimilarity Representation and Template Selection
The identification and classification of bio-chemical substances are important tasks in chemical, biological and forensic analysis. In this work we present a new strategy to improve the accuracy of supervised classification of this type of data, obtained from different analytical techniques. The strategy combines two processes: first, a dissimilarity representation of the data and, second, the selection of templates to refine the representative samples of each class.
To evaluate the performance of our proposal, a comparative study of three approaches is presented. As a baseline, entropy template selection (ETS) is performed in the original feature space and the selected templates are used for training. The other two alternatives combine dissimilarity representations with ETS. The first alternative performs ETS in the original feature space and uses the selected templates both as prototypes for generating the dissimilarity space and as the training set. The second represents the data in the dissimilarity space first and then performs ETS.
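As a rough illustration of the dissimilarity representation shared by the two alternatives, the sketch below maps each sample to its vector of distances to a set of prototypes. It is a minimal sketch under stated assumptions: the Euclidean distance, the toy data and the names `euclidean` and `to_dissimilarity_space` are chosen for illustration only, and the prototypes are picked by hand as a stand-in for ETS, whose criterion is not specified here.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance (an assumed dissimilarity measure, for illustration)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def to_dissimilarity_space(
    samples: Sequence[Vector],
    prototypes: Sequence[Vector],
    dissimilarity: Callable[[Vector, Vector], float] = euclidean,
) -> List[List[float]]:
    """Represent each sample by its dissimilarities to every prototype."""
    return [[dissimilarity(s, p) for p in prototypes] for s in samples]


# Toy "spectra" invented for this example; not data from the study.
spectra = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]

# Hand-picked prototypes standing in for the templates that ETS would select.
prototypes = [spectra[0], spectra[2]]

# Each row now has one coordinate per prototype instead of one per feature.
embedded = to_dissimilarity_space(spectra, prototypes)
print(embedded)
```

In this representation the dimensionality of the new space equals the number of prototypes, which is why the choice of templates affects both the accuracy and the cost of the classifier trained on it.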
The experimental results showed that an adequate combination of the representation in the dissimilarity space and entropy-based template selection outperformed the baseline in accuracy and/or efficiency for the majority of the problems studied.
Keywords: Dissimilarity representation, Entropy, Template selection, Classification, Bio-chemical data