Bio-Chemical Data Classification by Dissimilarity Representation and Template Selection

  • Victor Mendiola-Lau
  • Francisco José Silva Mata
  • Yenisel Plasencia Calaña
  • Isneri Talavera Bustamante
  • Maria de Marsico
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10657)

Abstract

The identification and classification of bio-chemical substances are very important tasks in chemical, biological and forensic analysis. In this work we present a new strategy to improve the accuracy of the supervised classification of this type of data, which is obtained from different analytical techniques. The strategy combines two processes: first, a dissimilarity representation of the data and, second, the selection of templates to refine the representative samples of each class.

To evaluate the performance of our proposal, a comparative study of three approaches is presented. As a baseline, entropy template selection (ETS) is performed in the original feature space and the selected templates are used for training. The other two alternatives both combine dissimilarity representations with ETS. The first alternative performs ETS in the original feature space and uses the selected templates both as prototypes for generating the dissimilarity space and as the training set. The second alternative first represents the data in the dissimilarity space and then performs ETS in that space.
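
The dissimilarity space mentioned above can be illustrated with a minimal sketch: each sample is re-described by its vector of distances to a set of prototypes. The Euclidean distance, the toy data and the function name `dissimilarity_space` are assumptions made for illustration; the abstract does not state which dissimilarity measure is used. The prototypes are picked here by fixed indices, which is only a stand-in for the template selection step and is not ETS itself.

```python
import numpy as np


def dissimilarity_space(samples, prototypes):
    """Map each sample to its vector of distances to the prototypes.

    Hypothetical helper: Euclidean distance is an assumption, not
    necessarily the measure used in the paper.
    """
    samples = np.asarray(samples, dtype=float)
    prototypes = np.asarray(prototypes, dtype=float)
    # Pairwise differences, shape (n_samples, n_prototypes, n_features).
    diff = samples[:, None, :] - prototypes[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))


# Toy feature vectors (illustrative values only).
X_train = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])

# Stand-in for template selection: take the first and last samples.
prototypes = X_train[[0, 2]]

# Each row now holds one sample's distances to the two prototypes.
D_train = dissimilarity_space(X_train, prototypes)
print(D_train)
```

In the first alternative the prototypes would be the templates selected in the original feature space. In the second, all training samples would serve as prototypes and the selection would run on the rows of the resulting matrix.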

The experimental results showed that an adequate combination of the dissimilarity space representation and entropy-based template selection outperformed the baseline in accuracy and/or efficiency for the majority of the problems studied.

Keywords

Dissimilarity representation · Entropy · Template selection · Classification · Bio-chemical data

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Victor Mendiola-Lau (1)
  • Francisco José Silva Mata (1)
  • Yenisel Plasencia Calaña (1)
  • Isneri Talavera Bustamante (1)
  • Maria de Marsico (2)

  1. Advanced Technologies Application Center, Havana, Cuba
  2. Universita degli Studi de Roma, Rome, Italy
