Feature Space Reduction for Graph-Based Image Classification

  • Niusvel Acosta-Mendoza
  • Andrés Gago-Alonso
  • Jesús Ariel Carrasco-Ochoa
  • José Francisco Martínez-Trinidad
  • José E. Medina-Pagola
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8258)


Feature selection is an essential preprocessing step for classifiers trained on high-dimensional data. In pattern recognition, feature selection improves classification performance by reducing the feature space while preserving the classification capabilities of the original feature space. Image classification using frequent approximate subgraph mining (FASM) is an example where the benefits of feature selection are needed, because using frequent approximate subgraphs (FASs) as features leads to high-dimensional representations. In this paper, we explore the use of feature selection algorithms to reduce the representation of an image collection represented through FASs. In our results, we report a dimensionality reduction of over 50% of the original features while obtaining classification results similar to those reported using all the features.
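As a rough illustration of the idea, the following minimal sketch ranks binary pattern-occurrence features with an information-gain filter and keeps the top half. It is not the method evaluated in the paper: the abstract does not name the selection algorithms used, so the information-gain criterion, the toy occurrence matrix, and every name here (`select_features`, `project`, `keep_ratio`) are assumptions made only for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(column, labels):
    """Information gain of one binary feature (pattern present or absent)."""
    present = [y for x, y in zip(column, labels) if x]
    absent = [y for x, y in zip(column, labels) if not x]
    n = len(labels)
    conditional = ((len(present) / n) * entropy(present)
                   + (len(absent) / n) * entropy(absent))
    return entropy(labels) - conditional


def select_features(matrix, labels, keep_ratio=0.5):
    """Return the sorted column indices of the highest-scoring features."""
    n_features = len(matrix[0])
    scores = [information_gain([row[j] for row in matrix], labels)
              for j in range(n_features)]
    k = max(1, int(n_features * keep_ratio))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def project(matrix, kept):
    """Keep only the selected columns of the occurrence matrix."""
    return [[row[j] for j in kept] for row in matrix]


# Toy data (assumed): 6 images (rows) x 4 hypothetical subgraph patterns
# (columns); a 1 means the pattern occurs in the image's graph.
X = [
    [1, 0, 1, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 1, 0, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
]
y = ["a", "a", "a", "b", "b", "b"]

kept = select_features(X, y, keep_ratio=0.5)
X_reduced = project(X, kept)
```

In this toy matrix the first pattern separates the two classes perfectly, while the last occurs in every image and carries no class information, so a filter of this kind keeps the former and discards the latter. A classifier would then be trained on `X_reduced` instead of `X`.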


Approximate graph mining, approximate graph matching, feature selection, graph-based classification



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Niusvel Acosta-Mendoza (1, 2)
  • Andrés Gago-Alonso (1)
  • Jesús Ariel Carrasco-Ochoa (2)
  • José Francisco Martínez-Trinidad (2)
  • José E. Medina-Pagola (1)

  1. Advanced Technologies Application Center (CENATAV), Havana, Cuba
  2. National Institute of Astrophysics, Optics and Electronics (INAOE), Puebla, Mexico
