Abstract
A classification problem involves selecting a training dataset with class labels, developing an accurate description or model for each class from the attributes available in the data, and then evaluating the prediction quality of the induced model. In this paper, we focus on supervised classification and on models obtained from datasets with few examples relative to the number of attributes. Specifically, we propose a fuzzy discretization method for the numerical attributes of such datasets. Discretizing numerical attributes can be a crucial step, since some classifiers cannot handle numerical attributes and others perform better when these attributes are discretized. We also show the benefits of the fuzzy discretization method on datasets with few examples through several experiments, whose results are validated by statistical tests.
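To make the notion of fuzzy discretization concrete, the following is a minimal illustrative sketch and not the method proposed in the chapter. It assumes the cut points of a numerical attribute are already known and softens each one into a linear transition, so that a value near a boundary belongs partially to both neighbouring intervals. The function name fuzzy_memberships and the overlap parameter are hypothetical.

```python
from typing import List


def fuzzy_memberships(x: float, cuts: List[float], overlap: float) -> List[float]:
    """Return the membership degree of x in each of len(cuts) + 1 fuzzy intervals.

    Assumption (illustration only): each crisp cut point c is replaced by a
    linear transition over [c - overlap, c + overlap]. The cut points must be
    sorted and at least 2 * overlap apart, so the degrees always sum to 1.
    """
    degrees = [0.0] * (len(cuts) + 1)
    for i, c in enumerate(cuts):
        lo, hi = c - overlap, c + overlap
        if x <= lo:
            # x lies entirely inside interval i, before its right transition
            degrees[i] = 1.0
            return degrees
        if x < hi:
            # x lies in the transition between interval i and interval i + 1
            right = (x - lo) / (hi - lo)
            degrees[i] = 1.0 - right
            degrees[i + 1] = right
            return degrees
    # x lies beyond the last transition, so it belongs fully to the last interval
    degrees[-1] = 1.0
    return degrees


if __name__ == "__main__":
    # Hypothetical attribute with two cut points and a transition half-width of 0.5
    for value in (1.0, 2.0, 4.75, 6.0):
        print(value, fuzzy_memberships(value, [2.0, 5.0], 0.5))
```

A crisp discretization would assign each value to exactly one interval. Here a value inside a transition zone contributes to two intervals with complementary degrees, which is the property that classifiers able to handle fuzzy partitions can exploit.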
Acknowledgments
Supported by the project TIN2011-27696-C02-02 of the Ministry of Economy and Competitiveness of Spain. Thanks also to “Fundación Séneca - Agencia de Ciencia y Tecnología de la Región de Murcia” (Spain) for the support given to Raquel Martínez through the FPI scholarship program.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Cadenas, J.M., Garrido, M.C., Martínez, R. (2016). Fuzzy Discretization Process from Small Datasets. In: Madani, K., Dourado, A., Rosa, A., Filipe, J., Kacprzyk, J. (eds) Computational Intelligence. IJCCI 2013. Studies in Computational Intelligence, vol 613. Springer, Cham. https://doi.org/10.1007/978-3-319-23392-5_15
DOI: https://doi.org/10.1007/978-3-319-23392-5_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23391-8
Online ISBN: 978-3-319-23392-5
eBook Packages: Engineering, Engineering (R0)