Semi-supervised Learning Based on Improved Co-training by Committee

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9243)

Abstract

As a popular machine learning technique, semi-supervised learning uses a large pool of unlabeled samples in addition to a small number of labeled ones to improve the performance of supervised learning. In co-training by committee, a semi-supervised learning algorithm, the class probability values predicted by the committee may repeat across unlabeled samples, which hinders improvement of the classification performance. We propose a method to deal with this problem that assigns different class probability estimates to different unlabeled samples. Naïve Bayes is employed to help estimate the class probabilities of unlabeled samples. To show that our method reduces the introduction of noise, we compare it with a data editing technique. Experimental results verify the effectiveness of both our method and the data editing technique, and indicate that our method is generally better than the data editing technique.
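
The repetition problem can be illustrated with a minimal sketch. It is not the authors' algorithm. It assumes that committee confidence is the fraction of members voting for the majority label, so a three-member committee can only produce the values 2/3 and 1, and many unlabeled samples tie. The tie-breaking rule (sorting by a secondary Naïve Bayes posterior) and the names `committee_probability`, `rank_unlabeled`, `committee_votes`, and `nb_posteriors` are hypothetical, and the votes and posteriors are made-up illustration values.

```python
from collections import Counter


def committee_probability(votes):
    """Return (majority_label, fraction of committee members voting for it)."""
    label, count = Counter(votes).most_common(1)[0]
    return label, count / len(votes)


def rank_unlabeled(committee_votes, nb_posteriors):
    """Rank unlabeled samples from most to least confident.

    Primary key: committee vote fraction (takes few distinct values).
    Secondary key (an assumption of this sketch): a Naive Bayes posterior
    for the committee's predicted label, used only to separate ties.
    """
    scored = []
    for idx, votes in enumerate(committee_votes):
        label, conf = committee_probability(votes)
        secondary = nb_posteriors[idx][label]
        scored.append((conf, secondary, idx, label))
    scored.sort(key=lambda t: (t[0], t[1]), reverse=True)
    return [(idx, label, conf, secondary) for conf, secondary, idx, label in scored]


# Hypothetical votes from a three-member committee on four unlabeled samples.
committee_votes = [
    ["pos", "pos", "neg"],
    ["pos", "pos", "neg"],
    ["neg", "neg", "neg"],
    ["pos", "neg", "pos"],
]
# Hypothetical Naive Bayes posteriors for the same four samples.
nb_posteriors = [
    {"pos": 0.61, "neg": 0.39},
    {"pos": 0.93, "neg": 0.07},
    {"pos": 0.20, "neg": 0.80},
    {"pos": 0.75, "neg": 0.25},
]

ranking = rank_unlabeled(committee_votes, nb_posteriors)
for idx, label, conf, secondary in ranking:
    print(f"sample {idx}: label={label} committee={conf:.2f} nb={secondary:.2f}")
```

Samples 0, 1, and 3 share the same committee confidence, so the committee alone cannot order them; the secondary score gives each a distinct rank. How the paper itself combines the two estimates is not stated in the abstract.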

Acknowledgments

This work was supported by the National Natural Science Foundation of China (No. 61173092, No. 61271302), the Program for New Century Excellent Talents in University (No. NCET-11-0692), the Program for New Scientific and Technological Star of Shaanxi Province (No. 2013KJXX-64), the Fund for Foreign Scholars in University Research and Teaching Programs (No. B07048), and the Program for Cheung Kong Scholars and Innovative Research Team in University (No. IRT1170).

Author information

Corresponding author

Correspondence to Shuang Wang.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Liu, K., Guo, Y., Wang, S., Wu, L., Yue, B., Hou, B. (2015). Semi-supervised Learning Based on Improved Co-training by Committee. In: He, X., et al. Intelligence Science and Big Data Engineering. Big Data and Machine Learning Techniques. IScIDE 2015. Lecture Notes in Computer Science, vol 9243. Springer, Cham. https://doi.org/10.1007/978-3-319-23862-3_41

  • DOI: https://doi.org/10.1007/978-3-319-23862-3_41

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23861-6

  • Online ISBN: 978-3-319-23862-3

  • eBook Packages: Computer Science, Computer Science (R0)