Adaptive Feature Selection and Feature Fusion for Semi-supervised Classification

  • Wei Du
  • Ronald Phlypo
  • Tülay Adalı


Labeling data is often difficult, expensive, and time-consuming: it requires the effort of experienced human annotators, and the data sets are often large and noisy. Co-training is a practical and powerful semi-supervised learning method, as it yields high classification accuracy with a training data set containing only a small set of labeled data. For co-training to succeed, the features must satisfy two important conditions: diversity and sufficiency. In this paper, we propose a novel mutual-information-based approach, inspired by the idea of dependent component analysis, to achieve feature splits that are maximally independent between-subsets (diverse) or within-subsets (sufficient). In addition, we demonstrate the application of the method to a real-world problem, the classification of laser tread mapping (LTM) tire data. We introduce several features that are designed to highlight physical characteristics of the tire data, as well as local or global descriptors, such as histograms, gradients, or representations in other domains. Results from both simulations and tire image classification confirm that co-training with the proposed feature set and feature splits consistently yields higher accuracy than supervised classification when only a small set of labeled training data is available. The proposed method is a very promising complement to time-consuming and subjective expert labeling of data, reducing expert effort to a minimum. Further results show that, by using a probabilistic multi-layer perceptron classifier as the base learner in co-training, our method yields meaningful continuous measures of the progression of irregular wear on the tire surface.
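To make the co-training idea concrete, the following is a minimal sketch of a generic co-training loop over two feature views. It is an illustration under stated assumptions, not the method of the paper: the base learner here is a simple nearest-centroid classifier rather than a probabilistic multi-layer perceptron, the two views are supplied by hand rather than selected by a mutual information criterion, and all names (`fit_centroids`, `predict`, `co_train`, `rounds`, `per_round`) are hypothetical.

```python
import math


def fit_centroids(X, y):
    """Nearest-centroid base learner (an assumption): one mean vector per class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def predict(centroids, x):
    """Return (label, confidence); confidence is the distance margin
    between the nearest and second-nearest class centroid."""
    dists = sorted((math.dist(x, c), label) for label, c in centroids.items())
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else 0.0
    return dists[0][1], margin


def co_train(view_a, view_b, labels, rounds=5, per_round=2):
    """Co-training sketch: labels[i] is a class label, or None if unlabeled.
    In each round, a learner trained on one view labels its most confident
    unlabeled samples, and those labels join the shared labeled pool that
    the learner on the other view trains on next."""
    labels = list(labels)
    for _ in range(rounds):
        for view in (view_a, view_b):
            known = [i for i, l in enumerate(labels) if l is not None]
            unknown = [i for i, l in enumerate(labels) if l is None]
            if not unknown:
                return labels
            model = fit_centroids([view[i] for i in known],
                                  [labels[i] for i in known])
            scored = sorted(((predict(model, view[i]), i) for i in unknown),
                            key=lambda t: -t[0][1])
            for (label, _), i in scored[:per_round]:
                labels[i] = label
    return labels


# Toy data: two classes that are well separated in both views,
# with only one labeled sample per class.
view_a = [[0.0], [0.2], [0.1], [1.0], [0.9], [1.1]]
view_b = [[5.0], [5.1], [4.9], [9.0], [9.2], [8.8]]
labels = [0, None, None, 1, None, None]
result = co_train(view_a, view_b, labels)
```

On this toy data, the two labeled samples are enough for the views to label the remaining four samples between them. With real data, the quality of the result depends on how diverse and sufficient the two views are, which is the problem that the feature split selection addresses.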


Co-training; Semi-supervised learning; Feature extraction; Feature fusion; Image classification; LTM tire images



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Department of CSEE, University of Maryland, Baltimore County, USA
