Direct Error Driven Learning for Classification in Applications Generating Big-Data

  • Chapter
Development and Analysis of Deep Learning Architectures

Part of the book series: Studies in Computational Intelligence (SCI, volume 867)


Abstract

In this chapter, a comprehensive methodology is presented to address important data-driven challenges within the context of classification. First, it is demonstrated that challenges such as the heterogeneity and noise observed in big/large data sets degrade the efficiency of deep neural network (DNN)-based classifiers. To obviate these issues, a two-step classification framework is introduced in which unwanted attributes (variables) are systematically removed through a preprocessing step and a DNN-based classifier is then employed to address heterogeneity in the learning process. Specifically, a multi-stage nonlinear dimensionality reduction (NDR) approach is described to remove unwanted variables, and a novel optimization framework is presented to address heterogeneity. In NDR, the dimensions are first divided into groups (grouping stage) and redundancies are then systematically removed within each group (transformation stage). The two-stage NDR procedure is repeated until a user-defined criterion controlling information loss is satisfied. The reduced-dimensional data are finally used for classification within a DNN-based framework where a direct error-driven learning regime is introduced. Within this framework, an approximation of the generalization error is obtained by generating additional samples from the data. An overall error, comprising the learning error and the approximated generalization error, is determined and utilized to derive a performance measure for each layer in the DNN. A novel layer-wise weight-tuning law is finally obtained through the gradient of this layer-wise performance measure, where the overall error is directly utilized for learning. The efficiency of this two-step classification approach is demonstrated using various data sets.
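Below is a minimal, hypothetical sketch of the two-stage NDR loop described in the abstract. The random split-based grouping rule and per-group PCA are assumed stand-ins for the chapter's actual grouping and transformation criteria, and the names `ndr_pass`, `multi_stage_ndr`, and `loss_tol` are illustrative.

```python
# A minimal sketch (not the chapter's exact algorithm) of one NDR iteration:
# dimensions are split into groups, then within-group redundancy is removed.
import numpy as np
from sklearn.decomposition import PCA

def ndr_pass(X, n_groups=4, var_keep=0.95, rng=None):
    """Grouping stage + transformation stage for one NDR pass."""
    rng = rng or np.random.default_rng(0)
    d = X.shape[1]
    # Grouping stage: an assumed random partition of the dimensions.
    groups = np.array_split(rng.permutation(d), n_groups)
    reduced = []
    for g in groups:
        if len(g) < 2:                         # nothing to compress here
            reduced.append(X[:, g])
            continue
        # Transformation stage: keep components explaining var_keep variance.
        pca = PCA(n_components=var_keep, svd_solver="full")
        reduced.append(pca.fit_transform(X[:, g]))
    return np.hstack(reduced)

def multi_stage_ndr(X, loss_tol=0.05, max_stages=10):
    """Repeat the two-stage pass until the user-defined information-loss
    criterion (here: fraction of total variance lost) would be violated."""
    total_var = X.var(axis=0).sum()
    for _ in range(max_stages):
        X_new = ndr_pass(X)
        info_loss = 1.0 - X_new.var(axis=0).sum() / total_var
        if info_loss > loss_tol or X_new.shape[1] >= X.shape[1]:
            break                              # stop before losing too much
        X = X_new
    return X
```

In the same spirit, a hedged sketch of the direct error-driven, layer-wise tuning step: an overall error (the learning error plus a generalization-error proxy computed on additionally generated, noise-perturbed samples) is fed directly to every layer through fixed random projections, akin to direct feedback alignment. The weighting `lam`, the perturbation scale `sigma`, and the projections `B` are assumptions, not the chapter's exact tuning law.

```python
# A hedged sketch of direct error-driven, layer-wise weight tuning: the
# overall error reaches each layer via fixed random maps B[l], rather than
# being backpropagated layer by layer.
import numpy as np

rng = np.random.default_rng(0)
sizes = [20, 64, 64, 3]                                    # assumed layer widths
W = [rng.normal(0, 0.1, (m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
B = [rng.normal(0, 0.1, (sizes[-1], n)) for n in sizes[1:-1]]

def forward(x):
    acts = [x]
    for l, w in enumerate(W):
        z = acts[-1] @ w
        acts.append(np.tanh(z) if l < len(W) - 1 else z)   # linear output layer
    return acts

def train_step(x, y, lam=0.5, lr=1e-2, sigma=0.05):
    acts = forward(x)
    e_learn = acts[-1] - y                                 # learning error
    # Generalization-error proxy from additionally generated samples.
    e_gen = forward(x + sigma * rng.normal(size=x.shape))[-1] - y
    e = e_learn + lam * e_gen                              # overall error
    # Layer-wise tuning law: every hidden layer receives e directly.
    for l in range(len(W) - 1):
        delta = (e @ B[l]) * (1.0 - acts[l + 1] ** 2)      # tanh derivative
        W[l] -= lr * acts[l].T @ delta / len(x)
    W[-1] -= lr * acts[-2].T @ e / len(x)
```

A call such as `train_step(X_batch, Y_onehot)`, with `X_batch` of shape `(batch, 20)` and `Y_onehot` of shape `(batch, 3)`, performs one layer-wise update; the perturbed batch stands in for the "additional samples" the abstract generates from the data.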

This research was supported in part by NSF I/UCRC award IIP 1134721 and the Intelligent Systems Center.



Author information

Correspondence to R. Krishnan.


Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter


Cite this chapter

Krishnan, R., Jagannathan, S., Samaranayake, V.A. (2020). Direct Error Driven Learning for Classification in Applications Generating Big-Data. In: Pedrycz, W., Chen, SM. (eds) Development and Analysis of Deep Learning Architectures. Studies in Computational Intelligence, vol 867. Springer, Cham. https://doi.org/10.1007/978-3-030-31764-5_1
