Abstract
With the growing use of credit card transactions, financial fraud has increased drastically, causing huge losses in the finance industry. An efficient fraud detection method has therefore become a necessity for banks seeking to minimize such losses. Credit card fraud detection faces a major challenge: the data sets are highly imbalanced, since fraudulent transactions are far fewer than legitimate ones. As a result, many traditional classifiers fail to detect minority-class instances in these skewed data sets. This paper aims to improve classification performance on the minority class of credit card fraud instances in an imbalanced data set. To that end, we propose a sampling method based on K-means clustering and the genetic algorithm. We use the K-means algorithm to cluster the minority-class samples, and within each cluster we use the genetic algorithm to generate new samples, which are then used to construct an accurate fraud detection classifier.
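The sampling idea described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the fitness function, the genetic operators, or any parameters, so everything below is an assumption made for illustration: fitness is taken as closeness to the cluster centroid, parents are chosen by binary tournament, children are produced by uniform crossover, and mutation adds small Gaussian noise. The function names (`kmeans`, `ga_offspring`, `kmeans_ga_oversample`) and all default values are hypothetical and are not taken from the paper.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain K-means; returns one cluster label per point."""
    rng = random.Random(seed)
    k = min(k, len(points))
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid for every point.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def ga_offspring(cluster, n_new, rng, mutation_rate=0.1, sigma=0.05):
    """Create n_new synthetic samples from one minority cluster.

    Assumed operators (not from the paper): tournament selection with
    fitness = closeness to the cluster centroid, uniform crossover,
    Gaussian mutation.
    """
    centroid = [sum(col) / len(cluster) for col in zip(*cluster)]

    def select():
        if len(cluster) == 1:
            return cluster[0]
        a, b = rng.sample(cluster, 2)
        return a if sq_dist(a, centroid) <= sq_dist(b, centroid) else b

    children = []
    for _ in range(n_new):
        p1, p2 = select(), select()
        # Uniform crossover: each gene comes from either parent.
        child = [x if rng.random() < 0.5 else y for x, y in zip(p1, p2)]
        # Mutation: perturb some genes with small Gaussian noise.
        child = [g + rng.gauss(0, sigma) if rng.random() < mutation_rate else g
                 for g in child]
        children.append(child)
    return children


def kmeans_ga_oversample(minority, n_new, k=2, seed=0):
    """Cluster the minority class, then generate samples per cluster."""
    rng = random.Random(seed)
    labels = kmeans(minority, k, seed=seed)
    clusters = [[p for p, l in zip(minority, labels) if l == c]
                for c in sorted(set(labels))]
    synthetic = []
    for i, cluster in enumerate(clusters):
        # Split n_new as evenly as possible across the clusters.
        share = n_new // len(clusters) + (1 if i < n_new % len(clusters) else 0)
        synthetic.extend(ga_offspring(cluster, share, rng))
    return synthetic


if __name__ == "__main__":
    fraud = [[0.10, 0.20], [0.15, 0.22], [0.12, 0.18], [0.90, 0.80], [0.95, 0.85]]
    for row in kmeans_ga_oversample(fraud, n_new=6):
        print(row)
```

In a full pipeline, the synthetic rows would be appended to the original minority samples before training a classifier. Generating children only from parents in the same cluster keeps new samples close to existing groups of fraud cases rather than interpolating across unrelated regions of the feature space.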
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Benchaji, I., Douzi, S., El Ouahidi, B. (2019). Using Genetic Algorithm to Improve Classification of Imbalanced Datasets for Credit Card Fraud Detection. In: Khoukhi, F., Bahaj, M., Ezziyyani, M. (eds) Smart Data and Computational Intelligence. AIT2S 2018. Lecture Notes in Networks and Systems, vol 66. Springer, Cham. https://doi.org/10.1007/978-3-030-11914-0_24
DOI: https://doi.org/10.1007/978-3-030-11914-0_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-11913-3
Online ISBN: 978-3-030-11914-0
eBook Packages: Intelligent Technologies and Robotics (R0)