Graph-Based Fraud Detection with the Free Energy Distance
Abstract
This paper investigates a real-world application of the free energy distance between nodes of a graph [14, 20] by proposing an improved extension of the existing fraud detection system APATE [36]. The extension relies on a new way of computing the free energy distance, based on paths of increasing length, that scales to large, sparse graphs. The new approach is assessed on a large-scale, real-world dataset of e-commerce payment transactions obtained from a major Belgian credit card issuer. Our results show that the free-energy-based approach halves the computation time while maintaining state-of-the-art performance in terms of Precision@100 on fraudulent card prediction.
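For readers unfamiliar with the free energy distance, the following minimal sketch may help. It assumes the standard closed-form definition from the randomized shortest paths literature [20], evaluated with a dense matrix inverse on a toy graph. It does not reproduce the scalable, path-length-based computation proposed in this paper. The function name `free_energy_distance` and the parameters `A`, `C` and `theta` are illustrative choices, not names taken from the paper.

```python
import numpy as np


def free_energy_distance(A, C, theta):
    """Dense free energy distance matrix for a small, connected graph.

    A     : nonnegative adjacency (affinity) matrix (illustrative name)
    C     : edge cost matrix, positive on edges (illustrative name)
    theta : inverse temperature, > 0 (illustrative name)
    """
    A = np.asarray(A, dtype=float)
    C = np.asarray(C, dtype=float)
    n = A.shape[0]
    # Reference random-walk transition probabilities (row-normalised adjacency).
    P_ref = A / A.sum(axis=1, keepdims=True)
    # Penalise costly transitions; non-edges have P_ref = 0 and stay at 0.
    W = P_ref * np.exp(-theta * C)
    # Fundamental matrix Z = (I - W)^{-1}, i.e. the sum of W^k over all path lengths k.
    Z = np.linalg.inv(np.eye(n) - W)
    # Divide column t by z_tt to keep only paths that stop when they first reach t.
    Z_h = Z / np.diag(Z)
    # Directed free energies, then symmetrise to obtain a distance.
    Phi = -np.log(Z_h) / theta
    return 0.5 * (Phi + Phi.T)


# Toy example: an undirected path graph 0 - 1 - 2 with unit edge costs.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
C = A.copy()
D = free_energy_distance(A, C, theta=5.0)
```

For large `theta` the distance approaches the shortest-path distance, so on this toy graph `D[0, 2]` is close to 2. The dense inverse costs cubic time in the number of nodes, which is why a formulation that scales to large, sparse graphs matters in practice.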
Keywords
Credit card fraud detection; Network science; Network data analysis; Free energy distance; Semi-supervised learning
Acknowledgements
This work was partially supported by the Immediate project, funded by the Walloon Region, and by the Defeatfrauds project, funded by Innoviris. We thank these institutions for giving us the opportunity to conduct both fundamental and applied research. We also thank Worldline SA/NV, R&D, for providing the data and expertise.
References
- 1. Abdallah, A., Maarof, M.A., Zainal, A.: Fraud detection system: a survey. J. Network Comput. Appl. 68, 90–113 (2016)
- 2. Bahnsen, A.C., Stojanovic, A., Aouada, D., Ottersten, B.: Cost sensitive credit card fraud detection using Bayes minimum risk. In: 2013 12th International Conference on Machine Learning and Applications, vol. 1, pp. 333–338. IEEE (2013)
- 3. Bhusari, V., Patil, S.: Study of hidden Markov model in credit card fraudulent detection. Int. J. Comput. Appl. 20(5), 33–36 (2011)
- 4. Bolton, R.J., Hand, D.J.: Statistical fraud detection: a review. Stat. Sci. 1, 235–249 (2002)
- 5. Callut, J., Francoisse, K., Saerens, M., Dupont, P.: Semi-supervised classification from discriminative random walks. In: Daelemans, W., Morik, K. (eds.) Proceedings of the 19th European Conference on Machine Learning (ECML 2008), Lecture Notes in Artificial Intelligence, vol. 5211, pp. 162–177. Springer, Berlin (2008)
- 6. Cao, B., Mao, M., Viidu, S., Yu, P.: Collective fraud detection capturing inter-transaction dependency. In: KDD 2017 Workshop on Anomaly Detection in Finance, pp. 66–75 (2018)
- 7. Chan, P.K., Fan, W., Prodromidis, A.L., Stolfo, S.J.: Distributed data mining in credit card fraud detection. IEEE Intell. Syst. 14(6), 67–74 (1999)
- 8. Consultants, H.: The Nilson Report issue 1142 (2018). https://nilsonreport.com
- 9. Dal Pozzolo, A., Boracchi, G., Caelen, O., Alippi, C., Bontempi, G.: Credit card fraud detection: a realistic modeling and a novel learning strategy. IEEE Trans. Neural Networks Learn. Syst. 29(8), 3784–3797 (2018)
- 10. Dal Pozzolo, A., Caelen, O., Le Borgne, Y.A., Waterschoot, S., Bontempi, G.: Learned lessons in credit card fraud detection from a practitioner perspective. Expert Syst. Appl. 41(10), 4915–4928 (2014)
- 11. Demšar, J.: Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res. 7, 1–30 (2006)
- 12. Duman, E., Elikucuk, I.: Solving credit card fraud detection problem by the new metaheuristics migrating birds optimization. In: International Work-Conference on Artificial Neural Networks, pp. 62–71. Springer, Berlin (2013)
- 13. Fouss, F., Saerens, M., Shimbo, M.: Algorithms and Models for Network Data and Link Analysis. Cambridge University Press, Cambridge (2016)
- 14. Françoisse, K., Kivimäki, I., Mantrach, A., Rossi, F., Saerens, M.: A bag-of-paths framework for network data analysis. Neural Networks 90, 90–111 (2017)
- 15. Gondran, M., Minoux, M.: Graphs and Algorithms. Wiley, Hoboken (1984)
- 16. Guex, G., Courtain, S., Saerens, M.: Covariance and correlation kernels on a graph in the generalized bag-of-paths formalism. arXiv preprint arXiv:1902.03002 (2019)
- 17. Huang, X., Ariki, Y., Jack, M.: Hidden Markov Models for Speech Recognition. Edinburgh University Press, Edinburgh (1990)
- 18. Jurgovsky, J., Granitzer, M., Ziegler, K., Calabretto, S., Portier, P.E., He-Guelton, L., Caelen, O.: Sequence classification for credit-card fraud detection. Expert Syst. Appl. 100, 234–245 (2018)
- 19. Kivimäki, I.: Distances, centralities and model estimation methods based on randomized shortest paths for network data analysis. Ph.D. thesis, UCL-Université Catholique de Louvain (2018)
- 20. Kivimäki, I., Shimbo, M., Saerens, M.: Developments in the theory of randomized shortest paths with a comparison of graph node distances. Physica A 393, 600–616 (2014)
- 21. Lebichot, B., Braun, F., Caelen, O., Saerens, M.: A graph-based, semi-supervised, credit card fraud detection system. In: Cherifi, H., Gaito, S., Quattrociocchi, W., Sala, A. (eds.) International Workshop on Complex Networks and their Applications, pp. 721–733. Springer, Cham (2016)
- 22. Liu, Q., Wu, Y.: Supervised learning. Encyclopedia of the Sciences of Learning, pp. 3243–3245 (2012)
- 23. Mantrach, A., Van Zeebroeck, N., Francq, P., Shimbo, M., Bersini, H., Saerens, M.: Semi-supervised classification and betweenness computation on large, sparse, directed graphs. Pattern Recogn. 44(6), 1212–1224 (2011)
- 24. Molloy, I., Chari, S., Finkler, U., Wiggerman, M., Jonker, C., Habeck, T., Park, Y., Jordens, F., van Schaik, R.: Graph analytics for real-time scoring of cross-channel transactional fraud. In: Grossklags, J., Preneel, B. (eds.) International Conference on Financial Cryptography and Data Security, vol. 9603, pp. 22–40. Springer, Berlin (2016)
- 25. Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT Press, Cambridge (2012)
- 26. Ngai, E.W., Hu, Y., Wong, Y.H., Chen, Y., Sun, X.: The application of data mining techniques in financial fraud detection: a classification framework and an academic review of literature. Decis. Support Syst. 50(3), 559–569 (2011)
- 27. Page, L., Brin, S., Motwani, R., Winograd, T.: The PageRank citation ranking: bringing order to the web. Technical report, Stanford InfoLab (1999)
- 28. Ramaki, A.A., Asgari, R., Atani, R.E.: Credit card fraud detection based on ontology graph. Int. J. Secur. Priv. Trust Manag. (IJSPTM) 1(5), 1–12 (2012)
- 29. Sánchez, D., Vila, M., Cerda, L., Serrano, J.M.: Association rules applied to credit card fraud detection. Expert Syst. Appl. 36(2), 3630–3640 (2009)
- 30. Shen, A., Tong, R., Deng, Y.: Application of classification models on credit card fraud detection. In: 2007 International Conference on Service Systems and Service Management, pp. 1–4. IEEE (2007)
- 31. Sommer, F., Fouss, F., Saerens, M.: Comparison of graph node distances on clustering tasks. In: Artificial Neural Networks and Machine Learning – Proceedings of ICANN 2016. Lecture Notes in Computer Science, vol. 9886, pp. 192–201. Springer, Cham (2016)
- 32. Sommer, F., Fouss, F., Saerens, M.: Modularity-driven kernel k-means for community detection. Artificial Neural Networks and Machine Learning (Proceedings of ICANN 2016). Lecture Notes in Computer Science, vol. 10614, pp. 423–433. Springer, Cham (2017)
- 33. Srivastava, A., Kundu, A., Sural, S., Majumdar, A.: Credit card fraud detection using hidden Markov model. IEEE Trans. Dependable Secure Comput. 5(1), 37–48 (2008)
- 34. Theodoridis, S., Koutroumbas, K.: Pattern Recognition, 4th edn. Academic Press Inc., Cambridge (2008)
- 35. Tong, H., Faloutsos, C., Pan, J.Y.: Fast random walk with restart and its applications. In: Sixth International Conference on Data Mining (ICDM 2006), pp. 613–622. IEEE (2006)
- 36. Van Vlasselaer, V., Bravo, C., Caelen, O., Eliassi-Rad, T., Akoglu, L., Snoeck, M., Baesens, B.: APATE: a novel approach for automated credit card transaction fraud detection using network-based extensions. Decis. Support Syst. 75, 38–48 (2015)
- 37. Weston, D.J., Hand, D.J., Adams, N.M., Whitrow, C., Juszczak, P.: Plastic card fraud detection using peer group analysis. Adv. Data Anal. Classif. 2(1), 45–62 (2008)
- 38. Wheeler, R., Aitken, S.: Multiple algorithms for fraud detection. In: Ellis, R., Moulton, M., Coenen, F. (eds.) Applications and Innovations in Intelligent Systems VII, pp. 219–231. Springer, London (2000)
- 39. Zaslavsky, V., Strizhak, A.: Credit card fraud detection using self-organizing maps. Inf. Secur. 18, 48 (2006)
- 40. Zhang, Z., Zhou, X., Zhang, X., Wang, L., Wang, P.: A model based on convolutional neural network for online transaction fraud detection. Secur. Commun. Networks 2018, 9 (2018)
- 41. Zhou, X., Cheng, S., Zhu, M., Guo, C., Zhou, S., Xu, P., Xue, Z., Zhang, W.: A state of the art survey of data mining-based fraud detection and credit scoring. In: MATEC Web of Conferences, vol. 189. EDP Sciences, Les Ulis (2018)