
Training Deep Models and Deriving Fisher Kernels: A Step Wise Approach

Chapter in: Composing Fisher Kernels from Deep Neural Models

Part of the book series: SpringerBriefs in Computer Science

Abstract

Deep networks and Fisher kernels are two competitive approaches that have shown steady progress on computer vision tasks, in particular the large-scale object categorisation problem. A recent development in this regard is a hybrid approach that encodes the higher-order statistics of deep models into Fisher vector encodings. In this chapter we discuss how to train a deep model from which a Fisher kernel can be derived. The tips discussed here are validated by industrial practice and by the research community through the mathematical analyses of LeCun et al. (Neural Networks: Tricks of the Trade. Springer, pp. 9–50 (1998), [1]) and Bengio (Neural Networks: Tricks of the Trade. Springer, pp. 437–478 (2012), [2]), as well as through case studies.
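To make the hybrid encoding above concrete, the sketch below computes the classic improved Fisher vector (in the style of Perronnin et al.) from a Gaussian mixture model fitted to feature descriptors; in the hybrid setting those descriptors would be activations extracted from a trained deep model. This is a minimal Python sketch under that assumption, not the chapter's exact derivation, and all names (fisher_vector, feats, gmm) are illustrative.

import numpy as np
from sklearn.mixture import GaussianMixture

def fisher_vector(descriptors, gmm):
    # Improved Fisher vector of a descriptor set under a diagonal-covariance
    # GMM: gradients of the log-likelihood w.r.t. component means and standard
    # deviations, followed by power and L2 normalisation.
    T, _ = descriptors.shape
    gamma = gmm.predict_proba(descriptors)              # (T, K) posteriors
    w, mu = gmm.weights_, gmm.means_                    # (K,), (K, D)
    sigma = np.sqrt(gmm.covariances_)                   # (K, D) diagonal stds

    diff = (descriptors[:, None, :] - mu[None]) / sigma[None]  # (T, K, D)
    g_mu = np.einsum('tk,tkd->kd', gamma, diff) / (T * np.sqrt(w)[:, None])
    g_sigma = np.einsum('tk,tkd->kd', gamma, diff ** 2 - 1.0) / (
        T * np.sqrt(2 * w)[:, None])

    fv = np.concatenate([g_mu.ravel(), g_sigma.ravel()])
    fv = np.sign(fv) * np.sqrt(np.abs(fv))              # power normalisation
    return fv / (np.linalg.norm(fv) + 1e-12)            # L2 normalisation

# Toy usage: random features stand in for real deep-model activations.
rng = np.random.default_rng(0)
feats = rng.normal(size=(500, 64))                      # 500 descriptors, 64-D
gmm = GaussianMixture(n_components=8, covariance_type='diag',
                      random_state=0).fit(feats)
print(fisher_vector(feats, gmm).shape)                  # 2 * 8 * 64 = (1024,)

The gradient statistics with respect to the mixture means and variances are what give the encoding its higher-order character; swapping the GMM for a deep generative model yields the deep Fisher kernels this chapter is concerned with.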


References

  1. LeCun, Y., Bottou, L., Orr, G.B., et al.: Efficient backprop. In: Neural Networks: Tricks of the Trade, pp. 9–50. Springer (1998)

  2. Bengio, Y.: Practical recommendations for gradient-based training of deep architectures. In: Neural Networks: Tricks of the Trade, pp. 437–478. Springer (2012)

  3. Marchesi, M.: Megapixel size image creation using generative adversarial networks (2017). arXiv preprint arXiv:1706.00082

  4. Wang, J., Perez, L.: The effectiveness of data augmentation in image classification using deep learning (2017). arXiv preprint arXiv:1712.04621

  5. Ahmed, S., Azim, T.: Compression techniques for deep Fisher vectors. In: ICPRAM, pp. 217–224 (2017)

  6. Collobert, R., Bengio, S.: Links between perceptrons, MLPs and SVMs. In: Proceedings of the Twenty-First International Conference on Machine Learning. ACM (2004)

  7. Jarrett, K., Kavukcuoglu, K., LeCun, Y., et al.: What is the best multi-stage architecture for object recognition? In: IEEE 12th International Conference on Computer Vision, pp. 2146–2153. IEEE (2009)

  8. Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, pp. 249–256 (2010)

  9. Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, pp. 315–323 (2011)

  10. Louizos, C., Welling, M., Kingma, D.: Learning sparse neural networks through L0 regularization (2017). arXiv preprint arXiv:1712.01312

  11. Theis, L., Korshunova, I., Tejani, A., et al.: Faster gaze prediction with dense networks and Fisher pruning (2018). arXiv preprint arXiv:1801.05787

  12. Blum, A.: Neural Networks in C++, vol. 697. Wiley, NY (1992)

  13. Berry, M., Linoff, G.: Data Mining Techniques: For Marketing, Sales, and Customer Support. Wiley (1997)

  14. Boger, Z., Guterman, H.: Knowledge extraction from artificial neural network models. In: IEEE International Conference on Systems, Man, and Cybernetics Computational Cybernetics and Simulation, vol. 4, pp. 3030–3035. IEEE (1997)

  15. Ruder, S.: An overview of gradient descent optimization algorithms (2016). arXiv preprint arXiv:1609.04747

  16. Bengio, Y.: Practical recommendations for gradient-based training of deep architectures (2012). arXiv preprint arXiv:1206.5533

  17. Singh, S., Hoiem, D., Forsyth, D.: Swapout: learning an ensemble of deep architectures. In: Advances in Neural Information Processing Systems, pp. 28–36 (2016)

  18. Srivastava, N., Hinton, G., Krizhevsky, A., et al.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958 (2014)

  19. Huang, G., Sun, Y., Liu, Z., et al.: Deep networks with stochastic depth. In: European Conference on Computer Vision, pp. 646–661. Springer (2016)

  20. Hinton, G.: Training products of experts by minimizing contrastive divergence. Neural Comput. 14(8), 1771–1800 (2002)

  21. Lin, J., Zhang, J.: A fast parameters selection method of support vector machine based on coarse grid search and pattern search. In: Fourth Global Congress on Intelligent Systems, pp. 77–81 (2013)

  22. Staelin, C.: Parameter selection for support vector machines. Technical report, Hewlett-Packard Company, HPL-2002-354R1 (2003)

  23. Salakhutdinov, R., Larochelle, H.: Efficient learning of deep Boltzmann machines. In: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, pp. 693–700 (2010)

Author information

Correspondence to Tayyaba Azim.

Copyright information

© 2018 The Author(s), under exclusive licence to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Azim, T., Ahmed, S. (2018). Training Deep Models and Deriving Fisher Kernels: A Step Wise Approach. In: Composing Fisher Kernels from Deep Neural Models. SpringerBriefs in Computer Science. Springer, Cham. https://doi.org/10.1007/978-3-319-98524-4_3

  • DOI: https://doi.org/10.1007/978-3-319-98524-4_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-98523-7

  • Online ISBN: 978-3-319-98524-4

  • eBook Packages: Computer Science, Computer Science (R0)
