Using Genetic Algorithm to Improve Classification of Imbalanced Datasets for Credit Card Fraud Detection

  • Conference paper

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 66)

Abstract

With the growing use of credit cards, financial fraud has increased drastically, causing large losses in the finance industry. An efficient fraud detection method has therefore become a necessity for banks seeking to minimize such losses. Credit card fraud detection involves a major challenge: the data sets are highly imbalanced, since fraudulent transactions are far fewer than legitimate ones. As a result, many traditional classifiers fail to detect minority-class instances in these skewed data sets. This paper aims to improve classification performance on the minority class of fraud instances in imbalanced data sets. To that end, we propose a sampling method based on K-means clustering and a genetic algorithm. We use K-means to cluster the minority-class samples, and within each cluster we use the genetic algorithm to generate new samples, which are then used to construct an accurate fraud detection classifier.
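
The sampling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not specify the fitness function, the selection scheme, or the genetic operators, so the sketch assumes uniformly random parent selection, uniform crossover, and Gaussian mutation. All function names and parameters (`kmeans`, `make_child`, `kmeans_ga_oversample`, `mutation_rate`, `mutation_scale`) are hypothetical.

```python
import random


def kmeans(points, k, iters=20, rng=None):
    """Plain Lloyd's k-means; returns the non-empty clusters as lists of points."""
    rng = rng or random.Random(0)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign the point to the nearest centroid (squared Euclidean distance).
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            clusters[nearest].append(p)
        # Recompute each centroid; keep the old one if its cluster is empty.
        centroids = [
            [sum(col) / len(cl) for col in zip(*cl)] if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return [cl for cl in clusters if cl]


def make_child(cluster, rng, mutation_rate=0.1, mutation_scale=0.05):
    """Create one synthetic sample from two parents taken from the same cluster.

    Assumption: parents are picked uniformly at random, because the abstract
    does not describe a fitness function or a selection scheme.
    """
    mother, father = rng.choice(cluster), rng.choice(cluster)
    # Uniform crossover: each feature is copied from one of the two parents.
    child = [m if rng.random() < 0.5 else f for m, f in zip(mother, father)]
    # Gaussian mutation: occasionally perturb a feature with small noise.
    return [
        g + rng.gauss(0.0, mutation_scale) if rng.random() < mutation_rate else g
        for g in child
    ]


def kmeans_ga_oversample(minority, k, n_new, seed=0, mutation_rate=0.1):
    """Cluster the minority class, then breed n_new synthetic samples."""
    rng = random.Random(seed)
    clusters = kmeans(minority, k, rng=rng)
    synthetic = []
    for i in range(n_new):
        cluster = clusters[i % len(clusters)]  # round-robin over the clusters
        synthetic.append(make_child(cluster, rng, mutation_rate))
    return synthetic


if __name__ == "__main__":
    # Toy minority class with two visible groups of "fraud" points.
    fraud = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [5.0, 5.1], [5.2, 4.9], [4.8, 5.3]]
    for sample in kmeans_ga_oversample(fraud, k=2, n_new=4):
        print(sample)
```

Breeding only within a cluster keeps synthetic samples close to an existing group of minority points, rather than interpolating between unrelated groups. The synthetic samples would then be appended to the training set before fitting a classifier.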

Author information

Corresponding author

Correspondence to Ibtissam Benchaji.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Benchaji, I., Douzi, S., El Ouahidi, B. (2019). Using Genetic Algorithm to Improve Classification of Imbalanced Datasets for Credit Card Fraud Detection. In: Khoukhi, F., Bahaj, M., Ezziyyani, M. (eds) Smart Data and Computational Intelligence. AIT2S 2018. Lecture Notes in Networks and Systems, vol 66. Springer, Cham. https://doi.org/10.1007/978-3-030-11914-0_24
