Abstract
This chapter describes the application of neural networks to the prediction of “paper curl”, an important quality metric in the papermaking industry. In particular, we address the issue of reliability in neural network training and prediction. Model combination is used to compensate for the limitations of the non-linear optimization algorithms used in neural network training, while confidence measures are used to characterize prediction uncertainty. Enhancing reliability through model combination enables training to be automated, and providing a confidence measure alongside each prediction helps the user decide whether or not to trust it.
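The two ideas in the abstract — combining several independently trained models to smooth over the vagaries of non-linear optimization, and reporting a confidence measure alongside each prediction — can be sketched together. The snippet below is a minimal illustration, not the chapter's method: it uses bootstrap resampling (bagging) to train a committee, with simple polynomial least-squares fits standing in for trained networks so the example stays self-contained, and it uses the spread of the committee's predictions as a crude confidence measure. All names and data here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: a noisy sine curve stands in for paper-curl measurements.
x = np.linspace(0.0, 1.0, 80)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.15, x.size)

def fit_member(xs, ys, deg=5):
    """Least-squares polynomial fit; a stand-in for one trained network."""
    return np.polyfit(xs, ys, deg)

# Bagging: each committee member is trained on a bootstrap resample,
# so members differ both in their data and (for real networks) in the
# local optimum their training run happens to find.
n_members = 25
committee = []
for _ in range(n_members):
    idx = rng.integers(0, x.size, x.size)   # sample with replacement
    committee.append(fit_member(x[idx], y[idx]))

# Combined prediction = committee mean; disagreement between members
# serves as a simple confidence measure for each query point.
x_new = np.array([0.25, 0.50, 0.75])
preds = np.stack([np.polyval(c, x_new) for c in committee])
mean = preds.mean(axis=0)
spread = preds.std(axis=0)

for xi, m, s in zip(x_new, mean, spread):
    print(f"x={xi:.2f}  prediction={m:+.3f}  +/- {s:.3f}")
```

A user seeing a large spread at some query point would know the committee disagrees there and treat the prediction with caution, which is the practical point of pairing predictions with confidence measures.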
© 2001 Springer Science+Business Media New York
Cite this chapter
Edwards, P.J., Papadopoulos, G., Murray, A.F. (2001). Neural Prediction in Industry: Increasing Reliability through Use of Confidence Measures and Model Combination. In: Jain, L., De Wilde, P. (eds) Practical Applications of Computational Intelligence Techniques. International Series in Intelligent Technologies, vol 16. Springer, Dordrecht. https://doi.org/10.1007/978-94-010-0678-1_5
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-010-3868-3
Online ISBN: 978-94-010-0678-1