Abstract
Semi-supervised classification methods combine unlabeled data with a smaller set of labeled examples in order to improve classification accuracy over supervised methods, which are trained on labeled data only. In this work, a self-train LogitBoost algorithm is presented. The self-train process improves the results by using the accurate class probability estimates of the LogitBoost regression tree model on the unlabeled instances for which the model is most confident. We compared the presented technique with other well-known semi-supervised classification methods on standard benchmark datasets, and it achieved better accuracy in most cases.
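The self-train process described in the abstract can be sketched as a generic loop. This is only an illustration under stated assumptions, not the authors' implementation. The abstract does not give the selection rule, so the `threshold` and `max_iter` parameters are hypothetical. The `CentroidClassifier` base learner is a simple stand-in that returns class probabilities; it is not LogitBoost with regression trees. Any model exposing `fit` and `predict_proba` could be plugged in through `make_model`.

```python
import math


class CentroidClassifier:
    """Stand-in base learner (NOT LogitBoost): softmax over negative
    squared distances to the class centroids."""

    def fit(self, X, y):
        sums, counts = {}, {}
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for i, value in enumerate(x):
                acc[i] += value
            counts[label] = counts.get(label, 0) + 1
        self.centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
        return self

    def predict_proba(self, x):
        scores = {
            c: -sum((a - b) ** 2 for a, b in zip(x, mu))
            for c, mu in self.centroids.items()
        }
        top = max(scores.values())  # subtract the max for numerical stability
        exps = {c: math.exp(s - top) for c, s in scores.items()}
        total = sum(exps.values())
        return {c: e / total for c, e in exps.items()}


def self_train(make_model, X_lab, y_lab, X_unlab, threshold=0.9, max_iter=10):
    """Generic self-training loop (threshold and max_iter are assumptions).

    Each round, unlabeled instances whose top class probability reaches
    `threshold` are moved into the labeled set with their predicted label,
    and the model is retrained on the enlarged set.
    """
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    model = make_model().fit(X_lab, y_lab)
    for _ in range(max_iter):
        if not pool:
            break
        confident, remaining = [], []
        for x in pool:
            proba = model.predict_proba(x)
            label = max(proba, key=proba.get)
            if proba[label] >= threshold:
                confident.append((x, label))
            else:
                remaining.append(x)
        if not confident:
            break  # nothing confident enough to add; stop early
        for x, label in confident:
            X_lab.append(x)
            y_lab.append(label)
        pool = remaining
        model = make_model().fit(X_lab, y_lab)
    return model, X_lab, y_lab


# Toy usage: two labeled points, three unlabeled points.
model, X_all, y_all = self_train(
    CentroidClassifier,
    X_lab=[(0.0, 0.0), (4.0, 4.0)],
    y_lab=[0, 1],
    X_unlab=[(0.2, 0.1), (3.9, 4.2), (2.0, 2.0)],
)
```

In this toy run the two unlabeled points near the class centroids are pseudo-labeled, while the ambiguous midpoint never reaches the threshold and stays unlabeled.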
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Karlos, S., Fazakis, N., Kotsiantis, S., Sgarbas, K. (2015). Self-Train LogitBoost for Semi-supervised Learning. In: Iliadis, L., Jayne, C. (eds) Engineering Applications of Neural Networks. EANN 2015. Communications in Computer and Information Science, vol 517. Springer, Cham. https://doi.org/10.1007/978-3-319-23983-5_14
DOI: https://doi.org/10.1007/978-3-319-23983-5_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23981-1
Online ISBN: 978-3-319-23983-5
eBook Packages: Computer Science, Computer Science (R0)