Abstract
This chapter describes the application of neural networks to the prediction of “paper curl”, an important quality metric in the papermaking industry. In particular, we address the issue of reliability in neural network training and prediction. Model combination is used to compensate for the limitations of the non-linear optimization algorithms used in neural network training, while confidence measures are used to characterize prediction uncertainty. Enhancing reliability through model combination enables training to be automated, and providing a confidence measure alongside each prediction helps the user decide whether or not to trust it.
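The two ideas in the abstract — combining several independently trained models to smooth over the vagaries of non-linear optimization, and reporting a confidence measure alongside each prediction — can be sketched together. The snippet below is a minimal illustration, not the chapter's method: it uses bootstrap resampling (bagging) to train a committee, with simple polynomial least-squares fits standing in for trained networks so the example stays self-contained, and it uses the spread of the committee's predictions as a crude confidence measure. All names and data here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: a noisy sine curve stands in for paper-curl measurements.
x = np.linspace(0.0, 1.0, 80)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.15, x.size)

def fit_member(xs, ys, deg=5):
    """Least-squares polynomial fit; a stand-in for one trained network."""
    return np.polyfit(xs, ys, deg)

# Bagging: each committee member is trained on a bootstrap resample,
# so members differ both in their data and (for real networks) in the
# local optimum their training run happens to find.
n_members = 25
committee = []
for _ in range(n_members):
    idx = rng.integers(0, x.size, x.size)   # sample with replacement
    committee.append(fit_member(x[idx], y[idx]))

# Combined prediction = committee mean; disagreement between members
# serves as a simple confidence measure for each query point.
x_new = np.array([0.25, 0.50, 0.75])
preds = np.stack([np.polyval(c, x_new) for c in committee])
mean = preds.mean(axis=0)
spread = preds.std(axis=0)

for xi, m, s in zip(x_new, mean, spread):
    print(f"x={xi:.2f}  prediction={m:+.3f}  +/- {s:.3f}")
```

A user seeing a large spread at some query point would know the committee disagrees there and treat the prediction with caution, which is the practical point of pairing predictions with confidence measures.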
© 2001 Springer Science+Business Media New York
Cite this chapter
Edwards, P.J., Papadopoulos, G., Murray, A.F. (2001). Neural Prediction in Industry: Increasing Reliability through Use of Confidence Measures and Model Combination. In: Jain, L., De Wilde, P. (eds) Practical Applications of Computational Intelligence Techniques. International Series in Intelligent Technologies, vol 16. Springer, Dordrecht. https://doi.org/10.1007/978-94-010-0678-1_5
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-010-3868-3
Online ISBN: 978-94-010-0678-1