Learning techniques have proven their capacity to process large amounts of data. Most statistical learning approaches use a learning set of fixed size and produce static models. However, in some situations, such as incremental or active learning, the learning process must work with only a small amount of data. In such cases, algorithms capable of producing models from only a few examples become necessary. In the literature, classifiers are generally evaluated according to criteria such as their classification performance and their ability to sort data. This taxonomy of classifiers can change markedly, however, when one considers their capabilities in the presence of only a few examples. In our view, few studies have addressed this issue. It is in this sense that this paper studies a wide range of learning algorithms and data sets in order to assess the capabilities of each chosen algorithm. This study also raises the problem of choosing an algorithm to process a small or a large amount of data. To address it, we show that some algorithms are able to generate models from little data; in this case, we seek to select the smallest amount of data that allows the best learning to be achieved. We also show that some algorithms are capable of making good predictions with little data, which is necessary in order to keep the labeling procedure as inexpensive as possible. To make this concrete, we first discuss the learning speed and typology of the tested algorithms, that is, the ability of a classifier to obtain an "interesting" solution to a classification problem using a minimum of training examples, and we present several families of classification models based on parameter learning. We then test all the classifiers mentioned above, both linear and non-linear.
Next, we study the behavior of these algorithms as a function of the learning set's size through an experimental protocol in which various datasets from the classification field are split, manipulated, and evaluated to produce the results that emerge from our experiments. Finally, we discuss the results obtained in a global analysis section and conclude with recommendations.
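The protocol described above can be illustrated with a minimal sketch: train a classifier on nested subsets of increasing size and record its accuracy on a fixed test set at each size. The synthetic data and the nearest-centroid classifier below are illustrative stand-ins for the paper's datasets and algorithms, not the authors' own implementation.

```python
# Hypothetical learning-curve sketch: accuracy as a function of
# learning-set size, using illustrative synthetic data and a
# simple nearest-centroid classifier.
import random

def make_data(n, seed=0):
    """Two 1-D Gaussian classes (illustrative synthetic data)."""
    rng = random.Random(seed)
    data = [(rng.gauss(0.0, 1.0), 0) for _ in range(n // 2)]
    data += [(rng.gauss(3.0, 1.0), 1) for _ in range(n - n // 2)]
    rng.shuffle(data)
    return data

def centroid_classifier(train):
    """Fit per-class means; predict the class with the nearest mean."""
    means = {}
    for label in (0, 1):
        xs = [x for x, y in train if y == label]
        means[label] = sum(xs) / len(xs) if xs else 0.0
    return lambda x: min(means, key=lambda c: abs(x - means[c]))

def learning_curve(train, test, sizes):
    """Accuracy on a fixed test set as the training subset grows."""
    curve = []
    for n in sizes:
        predict = centroid_classifier(train[:n])
        acc = sum(predict(x) == y for x, y in test) / len(test)
        curve.append((n, acc))
    return curve

train, test = make_data(400, seed=1), make_data(200, seed=2)
for n, acc in learning_curve(train, test, [10, 50, 100, 400]):
    print(f"n={n:4d}  accuracy={acc:.2f}")
```

Plotting such curves for several classifiers makes it possible to compare how quickly each one reaches a usable model, which is the question the experimental protocol addresses.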
This article has been retracted. Please see the retraction notice for more detail: https://doi.org/10.1007/s10639-020-10422-x
Korchi, A., Dardor, M. & Mabrouk, E.H. RETRACTED ARTICLE: Impact of the learning set’s size. Educ Inf Technol 25, 4637–4657 (2020). https://doi.org/10.1007/s10639-020-10165-9