Abstract
Class noise is a significant problem in classification: it can reduce overall accuracy and increase the complexity of the induced model. Consequently, finding and eliminating mislabeled instances is an important step in machine learning and data mining. The predictions of classifiers can be used to detect noisy instances, inconsistent data and errors, an approach known as classification filtering. It produces a new, cleaned dataset from which a reliable and precise classification model can be built. In this paper we analyze the effect of class noise on six supervised learning algorithms. To evaluate the performance of the classification filtering algorithms, several experiments were conducted on six real datasets. Finally, the noisy instances were removed and relabeled, and performance was then measured using evaluation criteria. The findings of this study show that classification filtering has the potential to detect class noise.
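The filtering idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the six learners or the validation scheme, so the sketch assumes a k-nearest-neighbour classifier and a simple interleaved k-fold split, and every function name in it is hypothetical. Each instance is predicted by a model that did not see it during training; if the prediction disagrees with the recorded label, the instance is treated as suspected class noise and can then be removed or relabeled.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Majority label among the k training points closest to x (assumed learner)."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def classification_filter(X, y, n_folds=5, k=3):
    """Map each suspected noisy index to the label predicted for it.

    Every instance is predicted by a model trained on the other folds;
    a disagreement with the recorded label marks the instance as suspect.
    """
    suspects = {}
    n = len(X)
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        for i in range(fold, n, n_folds):  # held-out instances of this fold
            predicted = knn_predict(train_X, train_y, X[i], k)
            if predicted != y[i]:
                suspects[i] = predicted
    return suspects


def remove_suspects(X, y, suspects):
    """Drop the suspected instances from the dataset."""
    keep = [i for i in range(len(X)) if i not in suspects]
    return [X[i] for i in keep], [y[i] for i in keep]


def relabel_suspects(y, suspects):
    """Replace the label of each suspected instance with the predicted one."""
    return [suspects.get(i, label) for i, label in enumerate(y)]


# Toy data (invented for illustration): two well-separated clusters,
# with the label at index 2 deliberately flipped from 0 to 1.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
     (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
y = [0, 0, 1, 0, 0, 1, 1, 1, 1, 1]

suspects = classification_filter(X, y)
clean_X, clean_y = remove_suspects(X, y, suspects)
relabeled_y = relabel_suspects(y, suspects)
```

On this toy data only the flipped instance at index 2 is flagged. Removal keeps the dataset smaller but free of the suspect, while relabeling keeps every instance at the risk of trusting a wrong prediction; which repair works better on real data is an empirical question.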
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Nematzadeh, Z., Ibrahim, R., Selamat, A. (2017). Class Noise Detection Using Classification Filtering Algorithms. In: Phon-Amnuaisuk, S., Au, TW., Omar, S. (eds) Computational Intelligence in Information Systems. CIIS 2016. Advances in Intelligent Systems and Computing, vol 532. Springer, Cham. https://doi.org/10.1007/978-3-319-48517-1_11
DOI: https://doi.org/10.1007/978-3-319-48517-1_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-48516-4
Online ISBN: 978-3-319-48517-1
eBook Packages: Engineering, Engineering (R0)