Convolutional Neural Network with Discriminant Criterion for Input of Each Neuron in Output Layer
Deep convolutional neural networks (CNNs) achieve state-of-the-art performance on image classification problems. When a network is trained for a multi-class classification problem, each neuron in the output layer is trained to solve a two-class problem: separating its target class from all remaining classes. Since the posterior probability of a class can be expressed as a softmax function when the class-conditional probabilities are Gaussian with different means and a common variance, a classifier with softmax activation is ideal when the inputs to the output-layer neurons follow such Gaussian distributions. It is therefore expected that the earlier layers of the CNN learn to construct good inputs for each output-layer neuron during training. To improve and accelerate the discrimination performed at the output-layer neurons, this paper proposes applying a discriminant criterion to the input value of each neuron in the output layer. The proposed objective function of the deep CNN is the sum of the discriminant criterion and the standard cross-entropy loss. Experiments on MNIST, CIFAR-10, and CIFAR-100 show that the proposed method improves classification performance and separates the classes more clearly.
This work was partly supported by JSPS KAKENHI Grant Number 16K00239.