Off-the-Shelf CNN Features for Fine-Grained Classification of Vessels in a Maritime Environment

  • Fouad Bousetouane
  • Brendan Morris
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9475)


Convolutional Neural Networks (CNNs) have recently achieved spectacular performance on standard image classification benchmarks. Moreover, CNNs trained on large datasets such as ImageNet have performed effectively on other recognition tasks and have been used as generic feature extractors for off-the-shelf classifiers. This paper presents an experimental study investigating the ability of off-the-shelf CNN features to capture the discriminative details of maritime vessels for fine-grained classification. An off-the-shelf classification scheme utilizing a linear support vector machine is applied to the high-level convolutional features that come before the fully connected layers in popular deep learning architectures. An extensive experimental evaluation compared the OverFeat, GoogLeNet, VGG, and AlexNet architectures for feature extraction. Results showed that OverFeat features outperform the other architectures, with mAP = 0.7021 on the nine-class fine-grained problem. This was almost 0.02 better than the closest competitor, GoogLeNet, which performed best on smaller vessel types.
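The scheme described in the abstract (pre-extracted features, a linear SVM per class, mAP as the score) can be illustrated with a small sketch. This is not the authors' implementation. It assumes synthetic 2-D vectors standing in for CNN features, three placeholder class names instead of the nine vessel classes, a simple one-vs-rest hinge-loss trainer written with the Python standard library, and one common definition of average precision (mean precision at the rank of each positive example). The paper's exact evaluation protocol is not stated here.

```python
import random

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_binary_svm(X, y, lam=0.01, lr=0.05, epochs=30, seed=0):
    """Linear SVM (labels +1/-1) fit by stochastic subgradient descent
    on the L2-regularized hinge loss. Hyperparameters are illustrative."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            # The hinge term contributes only when the margin is below 1.
            violated = y[i] * (dot(w, X[i]) + b) < 1.0
            for j in range(len(w)):
                grad = lam * w[j] - (y[i] * X[i][j] if violated else 0.0)
                w[j] -= lr * grad
            if violated:
                b += lr * y[i]
    return w, b

def average_precision(scores, is_positive):
    """Mean of the precision values at the rank of each positive example."""
    ranked = sorted(zip(scores, is_positive), key=lambda p: -p[0])
    hits, precisions = 0, []
    for rank, (_, pos) in enumerate(ranked, start=1):
        if pos:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0

def mean_average_precision(models, X, labels):
    """Average the per-class AP over all one-vs-rest classifiers."""
    aps = []
    for cls, (w, b) in models.items():
        scores = [dot(w, x) + b for x in X]
        aps.append(average_precision(scores, [l == cls for l in labels]))
    return sum(aps) / len(aps)

def make_toy_features(n_per_class, seed):
    """Hypothetical stand-in for CNN features: one Gaussian blob per class."""
    rng = random.Random(seed)
    centers = {"class_a": (4.0, 0.0), "class_b": (-2.0, 3.5), "class_c": (-2.0, -3.5)}
    X, labels = [], []
    for name, (cx, cy) in centers.items():
        for _ in range(n_per_class):
            X.append([rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)])
            labels.append(name)
    return X, labels

train_X, train_y = make_toy_features(40, seed=1)
test_X, test_y = make_toy_features(20, seed=2)

# One-vs-rest: one binary linear SVM per class.
models = {
    c: train_binary_svm(train_X, [1 if l == c else -1 for l in train_y])
    for c in sorted(set(train_y))
}
toy_map = mean_average_precision(models, test_X, test_y)
print(f"toy mAP = {toy_map:.4f}")
```

In a real pipeline the toy features would be replaced by activations taken from a pretrained network, and a library SVM would normally be preferred over this hand-written trainer.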


Support Vector Machine · Convolutional Neural Network · Convolutional Layer · Deformable Part Model · Fine Grained Classification



Thanks to ONR 311 and NRL for supporting this research.



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Electrical and Computer Engineering Department, University of Nevada, Las Vegas, USA
