Refined Weighted Random Forest and Its Application to Credit Card Fraud Detection

Xuan, Shiyang; Liu, Guanjun; Li, Zhenchuan

doi:10.1007/978-3-030-04648-4_29

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11280)

Included in the following conference series:

International Conference on Computational Social Networks

Abstract

Random forest (RF) is widely used in many applications because of its good classification performance. However, its voting mechanism assumes that all base classifiers carry the same weight. It is more reasonable to give some trees relatively high weights and others relatively low weights, because the randomization introduced by bootstrap sampling and attribute selection cannot guarantee that all trees have the same decision-making ability. This paper focuses on the weighted voting mechanism and proposes a novel weighted RF. Experiments on 6 public datasets show that our method outperforms both the standard RF and another weighted RF. We also apply our method to credit card fraud detection, where experiments show that it again performs best among the compared methods.
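
To make the contrast between equal and weighted voting concrete, here is a minimal sketch in Python. It is not the weighting scheme proposed in the paper, which the abstract does not specify. The function names `weighted_vote` and `majority_vote`, the example predictions, and the weights are all hypothetical and chosen only for illustration. The sketch shows that standard RF voting is the special case in which every tree has the same weight.

```python
from collections import defaultdict


def weighted_vote(tree_predictions, tree_weights):
    """Return the class label whose supporting trees have the largest total weight."""
    if len(tree_predictions) != len(tree_weights):
        raise ValueError("need exactly one weight per tree")
    totals = defaultdict(float)
    for label, weight in zip(tree_predictions, tree_weights):
        totals[label] += weight
    # Ties are broken in favour of the smallest label so the result is deterministic.
    return min(totals, key=lambda label: (-totals[label], label))


def majority_vote(tree_predictions):
    """Standard RF voting: every tree carries the same weight."""
    return weighted_vote(tree_predictions, [1.0] * len(tree_predictions))


# Hypothetical example: five trees classify one transaction
# (1 = fraud, 0 = legitimate). The weights are made up for illustration.
predictions = [0, 0, 0, 1, 1]
weights = [0.5, 0.5, 0.5, 0.9, 0.9]

equal_result = majority_vote(predictions)              # three of five trees say 0
weighted_result = weighted_vote(predictions, weights)  # class 1 totals 1.8 vs 1.5
```

With equal weights the three trees voting for class 0 win, while the assumed weights let the two more trusted trees overturn that result. How the weights should be derived is exactly the question the paper addresses.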

Acknowledgments

The authors thank the reviewers for their helpful comments and Professor Changjun Jiang, who provided considerable assistance with data and experiments. This work was supported in part by the National Natural Science Foundation of China under grant no. 61572360 and in part by the Shanghai Shuguang Program under grant no. 15SG18. The corresponding author is G.J. Liu.

Author information

Authors and Affiliations

Department of Computer Science and Technology, Tongji University, Shanghai, 201804, China
Shiyang Xuan, Guanjun Liu & Zhenchuan Li

Corresponding author

Correspondence to Guanjun Liu.

Editor information

Editors and Affiliations

Texas Southern University, Houston, TX, USA
Xuemin Chen
Ira A. Fulton School of Engineering, Tempe, AZ, USA
Arunabha Sen
Texas Southern University, Houston, TX, USA
Wei Wayne Li
University of Florida, Gainesville, FL, USA
My T. Thai

About this paper

Cite this paper

Xuan, S., Liu, G., Li, Z. (2018). Refined Weighted Random Forest and Its Application to Credit Card Fraud Detection. In: Chen, X., Sen, A., Li, W., Thai, M. (eds) Computational Data and Social Networks. CSoNet 2018. Lecture Notes in Computer Science, vol 11280. Springer, Cham. https://doi.org/10.1007/978-3-030-04648-4_29

DOI: https://doi.org/10.1007/978-3-030-04648-4_29
Published: 18 November 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-04647-7
Online ISBN: 978-3-030-04648-4
eBook Packages: Computer Science, Computer Science (R0)
