New Learning Paradigms in Soft Computing, pp. 368–403

# Using Unlabeled Data for Learning Classification Problems

## Abstract

This chapter presents an approach to using unlabeled data for learning classification problems. The chapter consists of two parts. The first part presents an approach to training a multilayer perceptron on both labeled and unlabeled data. The approach rests on the assumption that data classes are usually separated by regions of low pattern density. The unlabeled data are iteratively preprocessed by the perceptron being trained in order to obtain soft class-label estimates. It is demonstrated that substantial gains in classification performance can be achieved with this approach when the labeled data do not adequately represent the entire class distributions. In the second part, we propose a quality function, based on third-order polynomials, for learning a decision boundary between data clusters from unlabeled data. The objective of the quality function is to find a region of the input space that is sparse in data points; by maximizing the quality function, we obtain a decision boundary between data clusters. In the experiments, the proposed quality function outperformed similar functions as well as the conventional clustering algorithms tested.
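The first idea above — iteratively letting the classifier being trained assign soft class-label estimates to the unlabeled data — can be illustrated with a minimal self-training sketch. This is a hypothetical illustration, not the chapter's exact algorithm: the network architecture, the confidence weighting, and the data are all assumptions made for the example.

```python
# Hypothetical sketch of self-training with soft label estimates.
# A tiny one-hidden-layer MLP is trained on a few labeled points;
# each iteration it also takes a confidence-weighted step on the
# unlabeled pool, using its own outputs as soft class labels.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class TinyMLP:
    """One-hidden-layer perceptron for binary classification."""
    def __init__(self, n_in, n_hidden=8, lr=0.5):
        self.W1 = rng.normal(0, 0.5, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0, 0.5, n_hidden)
        self.b2 = 0.0
        self.lr = lr

    def forward(self, X):
        self.h = np.tanh(X @ self.W1 + self.b1)
        return sigmoid(self.h @ self.W2 + self.b2)

    def step(self, X, y, w):
        """One gradient step on per-sample-weighted cross-entropy."""
        p = self.forward(X)
        g = (p - y) * w / len(X)                 # dL/dlogit
        gh = np.outer(g, self.W2) * (1 - self.h ** 2)
        self.W2 -= self.lr * (self.h.T @ g)
        self.b2 -= self.lr * g.sum()
        self.W1 -= self.lr * (X.T @ gh)
        self.b1 -= self.lr * gh.sum(axis=0)

# Two well-separated Gaussian blobs; only three points per class are labeled.
X0 = rng.normal([-2.0, 0.0], 0.5, (100, 2))
X1 = rng.normal([2.0, 0.0], 0.5, (100, 2))
Xl = np.vstack([X0[:3], X1[:3]])
yl = np.array([0, 0, 0, 1, 1, 1], dtype=float)
Xu = np.vstack([X0[3:], X1[3:]])                 # unlabeled pool

net = TinyMLP(2)
for _ in range(200):
    net.step(Xl, yl, np.ones(len(yl)))           # labeled data, full weight
    soft = net.forward(Xu)                       # soft class-label estimates
    conf = np.abs(soft - 0.5) * 2                # confidence in [0, 1]
    net.step(Xu, soft.round(), conf)             # unlabeled data, down-weighted

true_u = np.r_[np.zeros(97), np.ones(97)]
acc = ((net.forward(Xu) > 0.5).astype(float) == true_u).mean()
```

Because the confidence weight vanishes near the decision boundary, points in the low-density gap between the clusters contribute little early on, which is in the spirit of the chapter's assumption that class boundaries pass through sparse regions.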

## Keywords

Quality Function · Decision Boundary · Multilayer Perceptron · Labeled Data · Unlabeled Data

