Convolutional Neural Network with Discriminant Criterion for Input of Each Neuron in Output Layer

  • Hidenori Ide
  • Takio Kurita
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11301)

Abstract

Deep convolutional neural networks (CNNs) achieve state-of-the-art performance in image classification. When a network is trained for a multi-class classification problem, each neuron in the output layer is trained to solve a two-class problem: separating its target class from all the other classes. The posterior probability of a class can be expressed as a soft-max function when the class-conditional densities are Gaussian with different means and a common variance. A classifier with a soft-max activation function is therefore ideal when the inputs to the output-layer neurons follow such Gaussian distributions, and the earlier layers of the CNN are expected to construct good inputs for each output neuron during learning. To improve or accelerate discrimination at the output-layer neurons, this paper proposes applying a discriminant criterion to the input value of each neuron in the output layer. The proposed objective function of the deep CNN is the sum of the discriminant criterion and the standard cross-entropy loss. Experiments on MNIST, CIFAR-10, and CIFAR-100 show that the proposed method improves classification performance and separates the classes clearly.
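The link between Gaussian class-conditional densities and the soft-max function can be checked numerically. The sketch below is not code from the paper. It assumes a one-dimensional input, and the means, variance, priors, and function names are illustrative values chosen here. It computes the posterior once by Bayes' rule and once as a soft-max over logits that are linear in the input, then compares the two.

```python
import math

def gaussian_posterior(x, means, var, priors):
    """Posterior p(k | x) by Bayes' rule for 1-D Gaussians sharing one variance."""
    joint = [
        p * math.exp(-(x - m) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)
        for m, p in zip(means, priors)
    ]
    total = sum(joint)
    return [j / total for j in joint]

def linear_logits(x, means, var, priors):
    """Logits a_k = (mu_k / var) * x - mu_k^2 / (2 var) + log(prior_k).

    The x^2 term of the Gaussian exponent is the same for every class
    (shared variance), so it cancels in the normalisation and the logits
    are linear in x.
    """
    return [(m / var) * x - m * m / (2.0 * var) + math.log(p)
            for m, p in zip(means, priors)]

def softmax(logits):
    """Numerically stable soft-max."""
    top = max(logits)
    exps = [math.exp(a - top) for a in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative values (assumptions, not taken from the paper).
MEANS = [-1.0, 0.5, 2.0]
VAR = 0.7
PRIORS = [0.2, 0.5, 0.3]

if __name__ == "__main__":
    for x in (-2.0, 0.3, 1.5):
        by_bayes = gaussian_posterior(x, MEANS, VAR, PRIORS)
        by_softmax = softmax(linear_logits(x, MEANS, VAR, PRIORS))
        print(x, by_bayes, by_softmax)
```

The two lists agree up to floating-point error for every input. If the classes had different variances, the quadratic term would no longer cancel and the logits would not be linear in the input, which is why the shared-variance assumption matters for the argument.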

Notes

Acknowledgement

This work was partly supported by JSPS KAKENHI Grant Number 16K00239.

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. The Department of Information Engineering, Graduate School of Engineering, Hiroshima University, Higashi-Hiroshima City, Japan
