Abstract
Fraudulent credit card transactions cost businesses and banks time and money. Detecting fraud before a transaction is finalized would help them save resources. This research compares the fraud detection accuracy of different sampling techniques and classification algorithms, and proposes an efficient machine learning method for detecting fraud. An anonymized data set from Kaggle was used, in which each transaction is labeled as either fraudulent or not. The severe imbalance between fraudulent and non-fraudulent transactions caused the algorithms to underperform; this was addressed by applying sampling techniques. Combining undersampling with SMOTE raised the recall of the classification algorithm. The k-NN algorithm showed the highest recall of the algorithms compared.
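The abstract does not give the implementation, so the following is only a minimal sketch of the general idea: randomly undersample the majority class, oversample the minority class with SMOTE-style interpolation, classify with k-NN, and score by recall. The function names, the two-feature toy data, the resampling sizes, and the values of k are all assumptions made for illustration; they are not taken from the paper, and the Kaggle data set is not used here.

```python
import math
import random


def undersample(majority, n_keep, rng):
    """Random undersampling: keep n_keep majority-class rows chosen at random."""
    return rng.sample(majority, n_keep)


def smote(minority, n_new, k, rng):
    """SMOTE-style oversampling: build n_new synthetic rows, each lying on the
    segment between a minority row and one of its k nearest minority
    neighbours. Needs at least two minority rows."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: math.dist(base, row),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def knn_predict(train, query, k):
    """Majority vote among the k training rows closest to query.
    train is a list of (features, label) pairs with labels 0 or 1."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0


def recall(y_true, y_pred):
    """Fraction of actual fraud rows (label 1) that were predicted as fraud."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    actual_pos = sum(y_true)
    return true_pos / actual_pos if actual_pos else 0.0


# Toy, synthetic data (an assumption): 500 legitimate rows, 10 fraud rows.
rng = random.Random(0)
legit = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(500)]
fraud = [(rng.gauss(3, 1), rng.gauss(3, 1)) for _ in range(10)]

# Rebalance the training data: shrink the majority, grow the minority.
legit_kept = undersample(legit, 100, rng)
fraud_aug = fraud + smote(fraud, 90, 3, rng)
train = [(x, 0) for x in legit_kept] + [(x, 1) for x in fraud_aug]

# Score on fresh toy rows drawn from the same two distributions.
test_rows = [((rng.gauss(0, 1), rng.gauss(0, 1)), 0) for _ in range(50)]
test_rows += [((rng.gauss(3, 1), rng.gauss(3, 1)), 1) for _ in range(50)]
y_true = [label for _, label in test_rows]
y_pred = [knn_predict(train, x, 5) for x, _ in test_rows]
print("recall on toy data:", recall(y_true, y_pred))
```

Recall is the metric of interest here because a missed fraudulent transaction is a false negative; plain accuracy can look high on imbalanced data even when most fraud goes undetected.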
Acknowledgements
We would like to thank the School of Engineering and IT, Charles Darwin University, for providing funding and assistance for this research.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
Cite this paper
Manlangit, S., Azam, S., Shanmugam, B., Kannoorpatti, K., Jonkman, M., Balasubramaniam, A. (2018). An Efficient Method for Detecting Fraudulent Transactions Using Classification Algorithms on an Anonymized Credit Card Data Set. In: Abraham, A., Muhuri, P., Muda, A., Gandhi, N. (eds) Intelligent Systems Design and Applications. ISDA 2017. Advances in Intelligent Systems and Computing, vol 736. Springer, Cham. https://doi.org/10.1007/978-3-319-76348-4_41
DOI: https://doi.org/10.1007/978-3-319-76348-4_41
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-76347-7
Online ISBN: 978-3-319-76348-4
eBook Packages: Engineering, Engineering (R0)