Using Genetic Algorithm to Improve Classification of Imbalanced Datasets for Credit Card Fraud Detection

Benchaji, Ibtissam; Douzi, Samira; El Ouahidi, Bouabid

doi:10.1007/978-3-030-11914-0_24

Conference paper
First Online: 01 March 2019

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 66)

Abstract

With the growing use of credit cards, financial fraud has increased drastically, causing heavy losses across the finance industry. An efficient fraud detection method has therefore become a necessity for banks seeking to minimize such losses. Credit card fraud detection faces a major challenge: fraud data sets are highly imbalanced, since fraudulent transactions are far fewer than legitimate ones. As a result, many traditional classifiers fail to detect minority-class instances in these skewed data sets. This paper aims to improve classification performance on the minority class of fraud instances in imbalanced data sets. To that end, we propose a sampling method based on K-means clustering and a genetic algorithm. We use the K-means algorithm to cluster the minority-class samples, and within each cluster we use the genetic algorithm to generate new samples, which are then used to construct an accurate fraud detection classifier.
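
The sampling idea in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the fitness function, the selection scheme, or the crossover and mutation operators, so the tournament selection, the closeness-to-centroid fitness, the uniform crossover, the Gaussian mutation, and every function and parameter name below are hypothetical choices made only for this example. It uses the Python standard library so that it runs without extra packages.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of feature vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns the non-empty clusters as lists of points."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest centroid.
            j = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[j].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = mean(members)
    return [c for c in clusters if c]

def ga_offspring(cluster, n_new, rng, mutation_rate=0.1, mutation_scale=0.05):
    """Create n_new synthetic samples from one cluster with GA-style operators."""
    centroid = mean(cluster)
    lo = [min(col) for col in zip(*cluster)]
    hi = [max(col) for col in zip(*cluster)]

    def select():
        # Assumption: tournament of size 2, fitness = closeness to the centroid.
        a, b = rng.choice(cluster), rng.choice(cluster)
        return a if sq_dist(a, centroid) <= sq_dist(b, centroid) else b

    children = []
    for _ in range(n_new):
        p1, p2 = select(), select()
        # Uniform crossover: each feature is copied from one of the two parents.
        child = [x if rng.random() < 0.5 else y for x, y in zip(p1, p2)]
        for i in range(len(child)):
            if rng.random() < mutation_rate:
                # Gaussian mutation scaled by the feature's range in this cluster.
                child[i] += rng.gauss(0.0, mutation_scale * (hi[i] - lo[i]))
            # Clamp so the child stays inside the cluster's observed range.
            child[i] = min(max(child[i], lo[i]), hi[i])
        children.append(child)
    return children

def oversample_minority(minority, n_new, k=2, seed=0):
    """Cluster the minority class, then generate n_new samples across clusters."""
    rng = random.Random(seed)
    clusters = kmeans(minority, k, seed=seed)
    total = len(minority)
    # Split n_new across clusters in proportion to cluster size.
    quotas = [n_new * len(c) // total for c in clusters]
    quotas[0] += n_new - sum(quotas)  # hand the rounding remainder to one cluster
    synthetic = []
    for cluster, quota in zip(clusters, quotas):
        synthetic.extend(ga_offspring(cluster, quota, rng))
    return synthetic

# Toy minority class with two visible groups (made-up values, not real data).
minority = [[0.10, 0.20], [0.15, 0.22], [0.12, 0.18],
            [0.90, 0.80], [0.85, 0.82], [0.88, 0.79]]
synthetic = oversample_minority(minority, n_new=10, k=2)
```

In a full pipeline the synthetic samples would be appended to the training set before fitting a classifier. Generating offspring per cluster, rather than over the whole minority class, is meant to keep new samples close to existing groups of fraud cases instead of in the empty space between them.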

Author information

Authors and Affiliations

Department of Computer IPSS, Faculty of Sciences, Mohammed V University, Rabat, Morocco
Ibtissam Benchaji, Samira Douzi & Bouabid El Ouahidi

Corresponding author

Correspondence to Ibtissam Benchaji.

Editor information

Editors and Affiliations

Faculty of Sciences and Technologies, Mohammedia, Morocco
Faddoul Khoukhi
Faculty of Sciences and Technologies, Settat, Morocco
Mohamed Bahaj
Faculty of Sciences and Technologies, Boukhalef Tangier, Morocco
Mostafa Ezziyyani

About this paper

Cite this paper

Benchaji, I., Douzi, S., El Ouahidi, B. (2019). Using Genetic Algorithm to Improve Classification of Imbalanced Datasets for Credit Card Fraud Detection. In: Khoukhi, F., Bahaj, M., Ezziyyani, M. (eds) Smart Data and Computational Intelligence. AIT2S 2018. Lecture Notes in Networks and Systems, vol 66. Springer, Cham. https://doi.org/10.1007/978-3-030-11914-0_24

DOI: https://doi.org/10.1007/978-3-030-11914-0_24
Published: 01 March 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-11913-3
Online ISBN: 978-3-030-11914-0
eBook Packages: Intelligent Technologies and Robotics (R0)
