Algorithmic Optimizations in the HMAX Model Targeted for Efficient Object Recognition

  • Ahmad W. Bitar
  • Mohamad M. Mansour
  • Ali Chehab
Part of the Communications in Computer and Information Science book series (CCIS, volume 598)


In this paper, we propose various approximations aimed at increasing the accuracy of the S1, C1 and S2 layers of the original Gray HMAX model of the visual cortex. At layer S1, an image is convolved with 64 separable Gabor filters in the spatial domain after removing some irrelevant information such as illumination and expression variations. At layer C1, some of the minimum scale values are exploited in addition to the maximum ones in order to increase the model's accuracy. Using an embedding in the additive domain, these minimum scale values are embedded into their corresponding maximum values according to a weight between 0 and 1. At layer S2, we apply clustering, which is considered one of the most interesting research areas in the field of data mining, to improve how the prototypes are selected during the feature learning stage. This is achieved with the Partitioning Around Medoids (PAM) clustering algorithm. The impact of these approximations on accuracy and computational complexity was evaluated on the Caltech101 dataset, which contains 9,145 images split between 101 distinct object categories plus a background category, and compared with the baseline performance using support vector machine (SVM) and nearest neighbor (NN) classifiers. The results show that our model improves accuracy at the S1 layer by more than 10% while also reducing computational complexity. The accuracy increases slightly for the approximations at the C1 and S2 layers.
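The C1 and S2 ideas described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the embedding formula, so the additive form `max + weight * min` is an assumption, as are the function names `embed_min_into_max` and `pam_select_prototypes`, the Euclidean distance, and the random initialization (classical PAM uses a greedy BUILD step instead). The S1 Gabor filtering stage is not covered.

```python
from math import dist
from random import Random


def embed_min_into_max(scale_maps, weight):
    """Pool one band of response maps over scale.

    scale_maps: list of 2-D maps (lists of rows), one per scale, same size.
    weight: value in [0, 1]; 0 reproduces plain max pooling over scales.
    ASSUMPTION: the additive form max + weight * min is illustrative only.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    rows, cols = len(scale_maps[0]), len(scale_maps[0][0])
    pooled = []
    for r in range(rows):
        row = []
        for c in range(cols):
            values = [m[r][c] for m in scale_maps]  # same position, all scales
            row.append(max(values) + weight * min(values))
        pooled.append(row)
    return pooled


def pam_select_prototypes(patches, k, max_iter=100, seed=0):
    """Choose k medoid patches with a PAM-style swap search.

    patches: list of equal-length numeric sequences (flattened candidates).
    Returns the sorted indices of the selected medoids.
    ASSUMPTION: random start and Euclidean distance, chosen for brevity.
    """
    n = len(patches)
    # Pairwise distances between all candidate prototypes.
    d = [[dist(p, q) for q in patches] for p in patches]
    medoids = Random(seed).sample(range(n), k)

    def cost(ms):
        # Total distance from every candidate to its nearest medoid.
        return sum(min(d[i][m] for m in ms) for i in range(n))

    best = cost(medoids)
    for _ in range(max_iter):
        improved = False
        for slot in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                # Try replacing one medoid with a non-medoid candidate.
                trial = medoids[:slot] + [cand] + medoids[slot + 1:]
                trial_cost = cost(trial)
                if trial_cost < best:
                    medoids, best, improved = trial, trial_cost, True
        if not improved:
            break  # no single swap lowers the cost any further
    return sorted(medoids)
```

For example, `embed_min_into_max([[[1.0, 5.0]], [[3.0, 2.0]]], 0.5)` returns `[[3.5, 6.0]]`, and `pam_select_prototypes([[0.0], [1.0], [2.0], [100.0], [101.0], [102.0]], 2)` returns `[1, 4]`, the central point of each group. Unlike k-means centroids, the medoids are always actual candidates, so each selected prototype is a real patch. For real images an array library would be the more natural choice than nested lists.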


HMAX, Support vector machine, Nearest neighbor, Caltech101



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Ahmad W. Bitar (1)
  • Mohamad M. Mansour (1)
  • Ali Chehab (1, corresponding author)

  1. Department of Electrical and Computer Engineering, American University of Beirut, Beirut, Lebanon
