Abstract
This chapter presents a new methodology for extracting a Fuzzy System by means of Genetic Algorithms, aimed at classifying imbalanced datasets when the intelligibility of the Fuzzy Rules matters. We propose a method for constructing fuzzy variables by modifying the set of fuzzy variables obtained from the DDA/RecBF clustering algorithm. These variables are then recombined into Fuzzy Rules by a Genetic Algorithm. The method was developed for the prenatal detection of Down’s syndrome during the second trimester of pregnancy, and we present empirical results showing its accuracy on this task. We also report more general experimental results on UCI datasets, which indicate that the method is applicable to imbalanced datasets more widely.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Soler, V., Prim, M. (2009). Extracting a Fuzzy System by Using Genetic Algorithms for Imbalanced Datasets Classification: Application on Down’s Syndrome Detection. In: Zighed, D.A., Tsumoto, S., Ras, Z.W., Hacid, H. (eds) Mining Complex Data. Studies in Computational Intelligence, vol 165. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88067-7_2
DOI: https://doi.org/10.1007/978-3-540-88067-7_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88066-0
Online ISBN: 978-3-540-88067-7
eBook Packages: Engineering, Engineering (R0)